Archive for the 'Usability' Category

Making MySQL more usable: Stored Procedure call stack on error

MySQL’s Stored Procedure implementation is far from perfect. One of the major problems has been in debugging tools and information available while writing and using the procedures you’ve written. In particular there is no real way to return customized error messages, e.g. RAISE or SIGNAL, and no way to find out what error actually occurred, e.g. accessing SQLSTATE. However, even with the standard error messages that MySQL already produces, there is another major problem: as a caller of the stored procedure, you can’t tell where the error actually came from.

In order to demonstrate this, let’s first create a few stored procedures that call each other in order to make things interesting:

use test;

delimiter ;;
drop procedure if exists s_test;;
create procedure s_test ()
begin
  call s_test_foo();
end;;

drop procedure if exists s_test_foo;;
create procedure s_test_foo ()
begin
  call s_test_bar();
end;;

drop procedure if exists s_test_bar;;
create procedure s_test_bar ()
begin
  select 1 from blarbo;
end;;

delimiter ;

Notice that the final stored procedure tries to select from a table that we haven’t mentioned yet; that’s because that table doesn’t exist. This should generate an error for us reliably. Here’s what happens when we call the stored procedure:

call test.s_test();

ERROR 1146 (42S02): Table 'test.blarbo' doesn't exist

Now that’s not very helpful: the user didn’t try to do anything with test.blarbo so he has no idea what has just happened! Technically it would be much better if the s_test_bar procedure could have generated a custom error message… I’ll leave that for a future post.

In the past I’ve discussed a patch to add a CALLER() function, and this patch is a natural evolution of that code. We’ve written a proof of concept patch to add a SHOW ERROR STACK TRACE command. This works by saving the call stack, which is tracked using the code from the old CALLER() patch, whenever an error occurs. Let’s try it out, after receiving any error from a SQL statement, you may run SHOW ERROR STACK TRACE as follows:

SHOW ERROR STACK TRACE;

+-------+----------------------+
| Depth | Query                |
+-------+----------------------+
|     0 | select 1 from blarbo |
|     1 | call s_test_bar()    |
|     2 | call s_test_foo()    |
|     3 | call test.s_test()   |
+-------+----------------------+
4 rows in set (0.00 sec)

That’s certainly more useful! I’ve still got a few ideas on how to improve it even more:

  • Add something like e.g. SHOW ERROR VARIABLES FOR 1 which could return the contents of all local variables from a particular stack level from the stack trace.
  • Add something like e.g. SHOW ERROR CONTEXT FOR 1 which could return the few lines of code before and after the error from a particular stack level from the stack trace.

Essentially, it would be great if you could do most of what gdb is capable of when debugging C code. I hope you like it!

Max allowed packet and connection pooling

The MySQL configuration variable max_allowed_packet controls how large a single MySQL protocol packet (not a network/TCP packet) can be. This variable can be changed to solve “Packet too large” errors typically caused by excessively long query strings, such as large multi-row INSERT statements. It’s both a session and global variable meaning that different sessions can set their own limits for for max_allowed_packet.

It’s often assumed that changing global variables in MySQL means that the value changes for each connection. However, in the case of max_allowed_packet and many others, the local setting of the variable is inherited from the global variable when a connection is first established. This has special implications when using connection pooling.

In a connection pooling environment, connections are pooled on the application server. When the application needs a connection to the MySQL server, it borrows it from the pool, and returns it when finished. Typically when a connection is borrowed from the pool, the connection pool library automatically calls its API’s variant of the “change user” API function (which uses the COM_CHANGE_USER protocol command) to reset some connection specific parameters.

The session-level max_allowed_packet setting for a connection is normally inherited from the global setting when a client first connects, but it isn’t copied when the user is changed. This means that even though the global max_allowed_packet may have been changed, the applications which are using those connections may not pick up the change until the connection pool’s connections are forced to reconnect, usually by restarting the application server.

Making MySQL more usable: InnoDB save/restore buffer pool patch

A feature that I and many others have wanted for years is to be able to save the contents of InnoDB’s buffer pool on shutdown, and restore it on startup. Why? After the system has been running for some time, it has spent a lot of effort perfecting the contents of the buffer pool, and exactly the right pages are now cached. When you shut down, that cache is lost, and on restart, InnoDB will have to start over from scratch. This process is often referred to as “warming up” the caches. This can cause a simple and quick server restart to have an effect on your production system for hours, or in the worst case days.

A good analogy to this cold startup process is that of starting from a dead stop in manual transmission car: While 5th gear is great for cruising at highway speeds, it’s not really suited for getting you up to speed; that’s what 1st through 4th gear are for. In most cars that aren’t too underpowered, with enough finessing and cajoling you can start from a dead stop in any gear… revving the engine up high to keep it from stalling, and gently easing the clutch in to get rolling… but it’s not pleasant. Besides the unpleasantness, it also takes much longer to get up to highway speeds compared to progressing through the gears in the normal way. Additionally, you may start off rolling just fine, but some time before you’ve gotten up to speed, you hit a small hill—which in the correct gear would be no problem. In the wrong gear, the car slows down, starts chugging, and eventually the engine stalls.

Most database systems behave much like cars with only 5th gear. When tuned properly and appropriately sized, they cruise along perfectly well at “highway speeds”; when the caches are “hot” and finely tuned. They are, in the end, largely designed to operate from an in-memory cache; cache misses, and thus disk reads, should be a relative rarity. If you have too many cache misses, performance is terrible, and the system gets backed up and potentially overloaded. We’ve seen all too many times the situation that occurs when a server (say, a read slave from a pool of many) is rebooted with cold caches, and then immediately put into service in a busy environment. All hell breaks loose.

I have written a proof of concept patch (for InnoDB plugin for MySQL 5.1) which adds a facility (admittedly quite hacky in this draft) to save the list of cached pages to a text file on shutdown. Then, on startup, it restores those pages back into the cache. Since it saves the pages by page number, rather than by saving the contents of the pages, it is perfectly safe1 and fairly efficient2. Note that you should probably also use the information_schema patch to be able to see that this patch is actually working.

While this feature isn’t nearly as sexy to show demos of in a static blog, it has a lot of potential in a production environment. I intend to do a lot more development on it, to provide at least:

  • A facility for changing the file name/path for the save file
  • The ability to skip the restore step, optionally
  • The ability to restore only part of the cache before accepting user connections, deferring the remainder of the cache to a background thread which continues running after startup

Some other ideas I’ve had:

  • Use multiple threads to read the pages, which should make much better use of RAIDs
  • Add runtime commands to save/restore the buffer pool state from a running server, without shutting down
  • Add an option to save some of the statistics about the pages so that they can be restored “smarter” rather than just have all pages stuffed back in the cache with equal priority

Does this sound like a useful feature? What other features should it have? Do you have problems with cold startup in production?

1 In the worst case, if the file is complete gibberish, the requested pages either won’t exist, or will contain irrelevant data. Either case should do no harm. It wouldn’t be a bad idea to do some more niceties around the file format: a checksum, magic at the beginning of the file, etc. The current file format especially is very much a proof of concept.

2 There’s still a lot of work to do to make things more efficient: ordering pages, using multiple threads to read from different tablespaces, subdividing reads for a tablespace into multiple threads, the possibilities are endless.

Making MySQL more usable: InnoDB information_schema patches

Lately I’ve been working on a series of patches to increase the usability of InnoDB, based on the InnoDB plugin for MySQL 5.1. I have a lot of ideas still, but I’ve currently gotten two proof of concept patches working. Both of the patches require a small patch to relocate the buf_chunk_struct struct definition which was inadvertently defined in buf0buf.c rather than buf0buf.h.

Most patches are considered rough draft, proof of concept quality at best. They are suitable for testing, playing around, using them to gain insight about your data, but not for production use. Use at your own risk!

Using information_schema to view the buffer pool contents

This patch creates several new information_schema tables to provide visibility into the contents of the InnoDB buffer pool. The new tables are:

  • innodb_buffer_pool_pages
  • innodb_buffer_pool_pages_index
  • innodb_buffer_pool_pages_blob

Note that accessing these tables is not cheap, as the buffer pool must be scanned.

innodb_buffer_pool_pages

This table provides some generic information about all pages present in the buffer pool, and is primarily useful for getting statistics about the number and types of pages present at any given point in time. For example:

SELECT
  page_type,
  count(*) AS nr
FROM information_schema.innodb_buffer_pool_pages
GROUP BY page_type;

+-----------+------+
| page_type | nr   |
+-----------+------+
| allocated | 4051 |
| bitmap    |    2 |
| blob      |    5 |
| fsp_hdr   |    2 |
| index     |   14 |
| inode     |    2 |
| sys       |    3 |
| trx_sys   |    1 |
| undo_log  |   15 |
+-----------+------+

innodb_buffer_pool_pages_index

This table gives you detailed statistics on index pages, which for InnoDB means all data (as part of the primary key) and all defined indexes. For example (some columns removed for clarity):

SELECT *
FROM information_schema.innodb_buffer_pool_pages_index;

+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...
| schema_name | table_name       | index_name | space_id | page_no | n_recs | data_size | lru_position |...
+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...
| NULL        | SYS_IBUF_TABLE_0 | CLUST_IND  |        0 |       4 |      0 |         0 |            6 |...
| NULL        | SYS_INDEXES      | CLUST_IND  |        0 |      11 |      8 |       536 |            7 |...
| NULL        | SYS_TABLES       | CLUST_IND  |        0 |       8 |      5 |       351 |           12 |...
| NULL        | SYS_COLUMNS      | CLUST_IND  |        0 |      10 |     24 |      1436 |           14 |...
| NULL        | SYS_TABLES       | ID_IND     |        0 |       9 |      5 |       126 |           15 |...
| NULL        | SYS_FIELDS       | CLUST_IND  |        0 |      12 |      8 |       315 |           16 |...
| NULL        | SYS_FOREIGN      | ID_IND     |        0 |      46 |      0 |         0 |           50 |...
| NULL        | SYS_FOREIGN      | FOR_IND    |        0 |      47 |      0 |         0 |           51 |...
| NULL        | SYS_FOREIGN      | REF_IND    |        0 |      48 |      0 |         0 |           52 |...
| NULL        | SYS_FOREIGN_COLS | ID_IND     |        0 |      49 |      0 |         0 |           53 |...
| test        | bt               | PRIMARY    |       26 |       3 |      3 |        39 |           66 |...
| test        | bt               | PRIMARY    |       26 |       7 |      1 |      8124 |           69 |...
| test        | bt               | PRIMARY    |       26 |       9 |      2 |     16248 |           71 |...
| test        | bt               | PRIMARY    |       26 |       6 |      2 |     16248 |           74 |...
+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...

Besides the obvious value of this, notice that you can see InnoDB’s internal tables (with schema_name as NULL) as well!

innodb_buffer_pool_pages_blob

This table gives you detailed statistics on BLOB-storage pages, which store parts of BLOBs that overflow in-row storage (another post on that later, but suffice it to say, if the BLOB is larger than 8KB… sort of). For example:

SELECT *
FROM information_schema.innodb_buffer_pool_pages_blob;

+----------+---------+------------+----------+--------------+--------------+-----------+------------+
| space_id | page_no | compressed | part_len | next_page_no | lru_position | fix_count | flush_type |
+----------+---------+------------+----------+--------------+--------------+-----------+------------+
|       26 |       5 |          0 |     9232 |            0 |           72 |         0 |          0 |
|       26 |      10 |          0 |     9232 |            0 |           73 |         0 |          0 |
|       26 |       8 |          0 |     9232 |            0 |           75 |         0 |          0 |
|       26 |      11 |          0 |     9232 |            0 |           76 |         0 |          0 |
|       26 |       4 |          0 |     9232 |            0 |           77 |         0 |          0 |
+----------+---------+------------+----------+--------------+--------------+-----------+------------+

We’d love to get the table name and PRIMARY KEY value associated with each BLOB-storage page, but I’m not sure if that’s possible.

An example of the power of these tables

Have you ever wondered what your page fill rate is? Probably not, because you didn’t realize it was either interesting or important, but we worry about these things (and often justifiably so). In short, page fill rate is the percentage of each page that is filled with data. Since InnoDB only caches whole pages, if your average page fill rate is 75%, you would be wasting 25% of the space in each page. That would mean that your table is spread across more pages than necessary, which will waste both disk space and memory you’ve allocated to the buffer pool.

Here’s a simple way to figure out the average page fill rate of pages currently in cache:

SELECT
  schema_name,
  table_name,
  index_name,
  COUNT(*) AS n_pages,
  ROUND(AVG(n_recs), 2) AS recs_per_page,
  ROUND(SUM(data_size)/1048576, 2) AS total_data_size_M,
  ROUND(AVG(data_size/(
    SELECT variable_value
    FROM information_schema.global_status
    WHERE variable_name
    LIKE 'innodb_page_size'
  )) * 100, 2) AS page_fill_pct
FROM
  information_schema.innodb_buffer_pool_pages_index
WHERE 1
AND schema_name IS NOT NULL
GROUP BY
  schema_name,
  table_name,
  index_name
ORDER BY
  total_data_size_M DESC;

+-------------+------------+-----------------+---------+---------------+-------------------+---------------+
| schema_name | table_name | index_name      | n_pages | recs_per_page | total_data_size_M | page_fill_pct |
+-------------+------------+-----------------+---------+---------------+-------------------+---------------+
| test        | n          | GEN_CLUST_INDEX |    4079 |        252.55 |             58.56 |         91.88 |
| test        | bt         | PRIMARY         |       2 |          2.50 |              0.02 |         49.70 |
| test        | p          | PRIMARY         |       1 |          2.00 |              0.00 |          0.52 |
| test        | p          | c               |       1 |          2.00 |              0.00 |          0.37 |
+-------------+------------+-----------------+---------+---------------+-------------------+---------------+

This is showing some statistics for each index from each table in the cache: the number of pages in the cache, the average number of records per page for those cached pages, the total number of megabytes of cache used, and the page fill rate in percent. There is currently no other way to calculate this information from any statistics or data that InnoDB provides.

What else?

What ideas do you have? What do you think about these features? Leave a comment to let us know!