MySQL replication is a great feature: it's easy to setup and relatively easy to manage, it can provide scalabilty and availability, a master can serve dozens of slaves, etc. However, sometimes people have the problem that the slaves can't keep up; they get further and further behind, which is a major problem. So this post talks about why that happens (sometimes) and what you can do about it.
Here are a few solutions to this problem. If you think of some that I forgot, please leave a comment!
- The most common reason that a slave cannot keep up with the master is the SQL Thread. There are 2 threads on the slave that handle replication: the IO Thread which connects to the master and pulls down the binary log events, and the SQL Thread which executes these events on the slave. But the master might have dozens of concurrent threads executing inserts/updates/deletes. So when they get to the slave, they are run in a single thread which may not be able to do the same throughput. There's a good reason for the single thread on the slave - the slave must ensure that the statements execute in the same order they did on the master, and if it was multi-threaded, there'd be no guarantee of execution order.
- Another reason a slave can't keep up could be locking contention. This usually happens if you're using MyISAM (on master/slave or on slave only). MyISAM uses table-level locking with locking selects, so if a user is querying the table on the slave, the SQL thread will block trying to execute updates.
- Yet a third reason could be the hardware on the slave is insufficient. Sometimes people think "Oh, it's just a slave" and they use cheaper/older hardware. In reality, the slave often needs a bigger machine (more RAM, faster disk, not necessarily more CPUs but fast CPU). This is especially the case if you're using the slave for reads as well as making it do replication.
- There are MySQL settings that could be used on the slave to make it faster, but sacrifice integrity. Two useful ones are: innodb_flush_log_at_trx_commit = 0 and delay_key_write=1
- Change the schema on the slave. Possibilities could include: different storage engine, less indexes, no foreign key or unique constraints
- Consider buying bigger hardware for the slave
- Break up the slave into separate slaves. Use replicate_do_db or related settings to replicate different databases to separate slaves. This may add complexity to the application though.
MySQL is working on this. For example, see Worklog #4648
. And of course MySQL is open source, so feel free to implement a better solution that would work for your application.