Making MySQL more usable: InnoDB save/restore buffer pool patch
A feature that I and many others have wanted for years is to be able to save the contents of InnoDB’s buffer pool on shutdown, and restore it on startup. Why? After the system has been running for some time, it has spent a lot of effort perfecting the contents of the buffer pool, and exactly the right pages are now cached. When you shut down, that cache is lost, and on restart, InnoDB will have to start over from scratch. This process is often referred to as “warming up” the caches. This can cause a simple and quick server restart to have an effect on your production system for hours, or in the worst case days.
A good analogy to this cold startup process is that of starting from a dead stop in manual transmission car: While 5th gear is great for cruising at highway speeds, it’s not really suited for getting you up to speed; that’s what 1st through 4th gear are for. In most cars that aren’t too underpowered, with enough finessing and cajoling you can start from a dead stop in any gear… revving the engine up high to keep it from stalling, and gently easing the clutch in to get rolling… but it’s not pleasant. Besides the unpleasantness, it also takes much longer to get up to highway speeds compared to progressing through the gears in the normal way. Additionally, you may start off rolling just fine, but some time before you’ve gotten up to speed, you hit a small hill—which in the correct gear would be no problem. In the wrong gear, the car slows down, starts chugging, and eventually the engine stalls.
Most database systems behave much like cars with only 5th gear. When tuned properly and appropriately sized, they cruise along perfectly well at “highway speeds”; when the caches are “hot” and finely tuned. They are, in the end, largely designed to operate from an in-memory cache; cache misses, and thus disk reads, should be a relative rarity. If you have too many cache misses, performance is terrible, and the system gets backed up and potentially overloaded. We’ve seen all too many times the situation that occurs when a server (say, a read slave from a pool of many) is rebooted with cold caches, and then immediately put into service in a busy environment. All hell breaks loose.
I have written a proof of concept patch (for InnoDB plugin for MySQL 5.1) which adds a facility (admittedly quite hacky in this draft) to save the list of cached pages to a text file on shutdown. Then, on startup, it restores those pages back into the cache. Since it saves the pages by page number, rather than by saving the contents of the pages, it is perfectly safe1 and fairly efficient2. Note that you should probably also use the information_schema patch to be able to see that this patch is actually working.
While this feature isn’t nearly as sexy to show demos of in a static blog, it has a lot of potential in a production environment. I intend to do a lot more development on it, to provide at least:
- A facility for changing the file name/path for the save file
- The ability to skip the restore step, optionally
- The ability to restore only part of the cache before accepting user connections, deferring the remainder of the cache to a background thread which continues running after startup
Some other ideas I’ve had:
- Use multiple threads to read the pages, which should make much better use of RAIDs
- Add runtime commands to save/restore the buffer pool state from a running server, without shutting down
- Add an option to save some of the statistics about the pages so that they can be restored “smarter” rather than just have all pages stuffed back in the cache with equal priority
Does this sound like a useful feature? What other features should it have? Do you have problems with cold startup in production?
1 In the worst case, if the file is complete gibberish, the requested pages either won’t exist, or will contain irrelevant data. Either case should do no harm. It wouldn’t be a bad idea to do some more niceties around the file format: a checksum, magic at the beginning of the file, etc. The current file format especially is very much a proof of concept.
2 There’s still a lot of work to do to make things more efficient: ordering pages, using multiple threads to read from different tablespaces, subdividing reads for a tablespace into multiple threads, the possibilities are endless.

I a public note (in addition to the conversations we’ve had on this in the past) that I think it’s a good idea and at least worthwhile pursuing to the point where people can try it in production to assess the expected benefits.
It would be important to also get this patch working for the InnoDB 5.1 plugin.
Nice patch! I can’t tell from the patch if it will write the pages in the buffer pool in the order they would be evicted from the LRU algorithm? In which case, there might still be a little bit of cache ’settling’ upon restore.
The runtime commands could be very helpful for those using DRBD & MySQL where cold caches (filesystem, storage engines) is a real hard thing to factor into recovery time.
Arjen,
This patch is actually against the InnoDB plugin for 5.1. The only thing indicating that would be the URL of the patch, but I’ll clarify the post. I should try backporting it to the MySQL 5.0 tree… but since the page organization in memory has changed a bit, it wouldn’t be a perfectly simple thing.
Morgan,
You’re quite correct, there would be a bit of settling upon restore. As per my comment above about saving statistics to make things “smarter” I think it’s solvable and might be worth doing. If nothing else, the benefit of saving the LRU info would be that the “restored” pages wouldn’t compete as much with brand new pages coming in to solve random queries.
Regards,
Jeremy
You guys rock!
Awesome! We already have clients using 5.1
Does this patch lead the way for a plugin that could show the *contents* of the cache? that’d be pretty cool, to be able to see how the cache changes over time…….and to see if certain queries are in the cache when you expect them to be, etc.
Sheeri,
The last patch I released, information_schema patch, as linked and discussed briefly in the post, does so.
Regards,
Jeremy
Excellent… this would be especially helpful when you need to bounce the server for a quick change.
hi,Jeremy
there is a problem with it, when i shutdown mysqld and change the value of innodb_buffer_pool_size, mysqld can’t start and restore the buffer pool,some of the error log like this:
081008 9:53:39InnoDB: Warning: difficult to find free blocks from
InnoDB: the buffer pool (584 search iterations)! Consider
InnoDB: increasing the buffer pool size.
InnoDB: It is also possible that in your Unix version
InnoDB: fsync is very slow, or completely frozen inside
InnoDB: the OS kernel. Then upgrading to a newer version
InnoDB: of your operating system may help. Look at the
InnoDB: number of fsyncs in diagnostic info below.
InnoDB: Pending flushes (fsync) log: 0; buffer pool: 0
InnoDB: 210393 OS file reads, 7 OS file writes, 7 OS fsyncs
InnoDB: Starting InnoDB Monitor to print further
InnoDB: diagnostics to the standard output.
thanks much
DO WANT!!!!!!!!
This would be a reason to upgrade
There are so many patches and forks now that I can’t brain no more.
yejr,
I wouldn’t be surprised if this is a bug, if you’ve reduced the size of the buffer pool and then tried to restore state from a time when the buffer pool was larger. I’ll check into what we can do to be smarter there. Likely case would be to save the size of the buffer pool in the state file so that some intelligent decisions can be made on restart.
Regards,
Jeremy
hi,Jeremy
yes,u’r right.after i remove ib_buf_pool_state file,it is ok
Jeremy
there is a new problem, when i set innodb_buffer_pool_size as 14G, and fully fill to pool by mysqldump all database, shutdown mysqld, the size of ib_buf_pool_state is about 19Mb.
mysqldump -f -q test sbtest | gzip > /data/test.sql.gz
sbtest table fill in with 100,000,000 rows by sysbench.
ls -l ib_buf_pool_state
-rw-rw—- 1 mysql mysql 19545127 Oct 8 14:34 ib_buf_pool_state
wc -l ib_buf_pool_state
917439 ib_buf_pool_state
it took more than two hour to start mysqld, a so long time to me.
some information about my server:
dell 2950, Intel(R) Xeon(R) CPU E5335 @ 2.00GHz * 7, 16Gb Ram, MD3000(dell) storage array
thanks much
very cool. this morning I thought about the exact same thing.
sounded like a relatively easy hack, nice to see someone already thought about it.
any chance of this getting into official mysql code at some point?
Something similar to the functionality of your patch was integrated into the XtraDB-9 release:
http://www.mysqlperformanceblog.com/2010/01/20/xtradb-feature-save-restore-buffer-pool/
This patch was the basis for the patch that was integrated into xtradb.