Everything MySQL

A great place to be!

Basics of MySQL failover (using replication)

For easy MySQL failover, replication is a great solution. This post will cover the basics of performing a failover in case your master dies.

First, setup one-way replication (or two-way, but don't plan to use both servers for writes at the same time). Next, you'll want to direct all activity, or at least the writes, to the master. If the master dies, there are two major concerns:

  1. Redirecting the clients to the slave. There are several ways to handle this, such as Heartbeat or MySQL's JDBC driver.
  2. Checking if the slave is caught up. This is trickier. If the master's binary logs are still available, then you can do SHOW SLAVE STATUS on the slave and compare the READ_MASTER_LOG_POS to the master's binary log. For example, if SHOW SLAVE STATUS shows:
    Master_Log_File: localhost-bin.000051
    Read_Master_Log_Pos: 605

    Then look at the master's binary log files. A long listing will do the trick:

    ls -l *bin*
    -rw-rw---- 1 mysql mysql 698 2009-10-08 18:24 localhost-bin.000051


    Note that the binary log has a size of 698 bytes. Yet the slave only read up to byte position 605. So you can manually execute the activity that the slave missed:
    mysqlbinlog --start-position=605 localhost-bin.000051 | mysql -h slave-host-name

You may want to manually get the slave up-to-date before failing over. Or you can keep the slave read-only until you catch the slave up, then turn off read-only by doing SET GLOBAL read_only=0;

You may want to setup replication in both directions (sometimes called "master-master" or two-way replication). This is a tricky setup if both servers are actively doing updates. But if you know that only one of the servers is updating at any one time, then it works quite well. Once you've failed-over to the slave, it is now acting as a master. Any updates done on this server can be replicated back to the other server when it comes back up.

This solution is not perfect and may not be suitable for all situations. There is potential data loss (possibly unknown data loss) if the binary logs are no longer available on the master. For some scenarios it is better to have additional means of failover such as MySQL Cluster or DRBD for localized failures.

Views: 11342

Comment

You need to be a member of Everything MySQL to add comments!

Join Everything MySQL

Comment by Chris on October 22, 2009 at 11:05am
Not sure I agree with manual intervention here. Heartbeat, keepalive or any other automated mechanism would be optimal for a few reasons:

1. Lets say you don't want to be woken up at some awful time of night or morning.
2. Lets say you are the only DBA and you are in a place with no internet access
3. Lets say you have 50, 100, 1000, 5000 servers... multiply that by the number of probable system failures and you have an administrative nightmare.

Automated fail over is key to limit customer facing downtime plus heartbeat and keepalive are great tools with the right implementation logic.
Comment by Sheeri Cabral on October 21, 2009 at 1:09pm
we've found that fully automated solutions end up having their own problems, and that the best solution for most customers ends up involving manual intervention. Basically, have a master/master setup, where there are service names (separate A records that can be moved). Writes go to only one server (the "primary" server) and there's a standby (the "secondary" server).

If the primary dies, or needs to be taken offline for maintenance, the secondary server can be promoted easily by moving the A records. No application issues, no heartbeat issues (it's finicky software!), and it only takes about a minute to change networking on both servers.

RSS

© 2024   Created by Chris.   Powered by

Badges  |  Report an Issue  |  Terms of Service