How mysqldump locks tables with different options
The mysqldump program, which ships with MySQL, takes data from existing tables and turns it into SQL statements which can then be used to re-create the data later. It has several different options which dictate its behavior for collecting data. Some of the most important change its locking behaviour; depending on the options passed to the mysqldump command, it may need to lock tables for the entire duration of the dumping process, briefly at the beginning, or not at all.
When using mysqldump for backups with a MySQL replication slave, the master-data option is often required in order to save the slave replication log file name and position in the backup. By default, the master-data option will execute a FLUSH TABLES WITH READ LOCK command—which disallows writes to any table—and will keep the read lock for the entire duration of the dump to ensure that the replication position won’t change during the dump. This can be problematic, because any clients trying to write to tables during the dump process are locked out. With MyISAM, this is necessary, but with InnoDB it isn’t.
With a little help from InnoDB’s transaction and multi-versioning support, it’s possible to keep the read lock for a very short period of time—just long enough to quiesce the system and check the replication status. InnoDB guarantees that a transaction will see a consistent view of the database after a transaction has started. If all of the tables being dumped are InnoDB, using the single-transaction flag to mysqldump will instruct it to only keep a read lock long enough to gather the replication state and start a transaction. After the transaction has been started, the read lock is released so that the dump and other clients can continue without locking each other out.
On the topic of consistency for a dump, there is an important distinction between the lock-all-tables and lock-tables options: With the lock-all-tables option, mysqldump will use the same type of global read lock it uses when gathering the replication position. This can be used to ensure a consistent dump when not using replication. The lock-tables option, however, issues a LOCK TABLES query for each database it’s dumping. If you choose multiple databases to dump, each database will be locked and unlocked in turn. This introduces a race condition: modifications can be made to database A while database B is locked and being dumped. If you’re using InnoDB, single-transaction is a much better option.
Most people worry about the total time it takes to perform a backup, but the most important timing aspect of any backup is not the time it takes to create the backup, but the time it takes to restore it. Nonetheless, Maatkit includes mk-parallel-dump and mk-parallel-restore which offer a nice speed boost over the traditional MySQL tools.
