Making MySQL more usable: InnoDB information_schema patches

Lately I’ve been working on a series of patches to increase the usability of InnoDB, based on the InnoDB plugin for MySQL 5.1. I have a lot of ideas still, but I’ve currently gotten two proof of concept patches working. Both of the patches require a small patch to relocate the buf_chunk_struct struct definition which was inadvertently defined in buf0buf.c rather than buf0buf.h.

Most patches are considered rough draft, proof of concept quality at best. They are suitable for testing, playing around, using them to gain insight about your data, but not for production use. Use at your own risk!

Using information_schema to view the buffer pool contents

This patch creates several new information_schema tables to provide visibility into the contents of the InnoDB buffer pool. The new tables are:

  • innodb_buffer_pool_pages
  • innodb_buffer_pool_pages_index
  • innodb_buffer_pool_pages_blob

Note that accessing these tables is not cheap, as the buffer pool must be scanned.

innodb_buffer_pool_pages

This table provides some generic information about all pages present in the buffer pool, and is primarily useful for getting statistics about the number and types of pages present at any given point in time. For example:

SELECT
  page_type,
  count(*) AS nr
FROM information_schema.innodb_buffer_pool_pages
GROUP BY page_type;

+-----------+------+
| page_type | nr   |
+-----------+------+
| allocated | 4051 |
| bitmap    |    2 |
| blob      |    5 |
| fsp_hdr   |    2 |
| index     |   14 |
| inode     |    2 |
| sys       |    3 |
| trx_sys   |    1 |
| undo_log  |   15 |
+-----------+------+

innodb_buffer_pool_pages_index

This table gives you detailed statistics on index pages, which for InnoDB means all data (as part of the primary key) and all defined indexes. For example (some columns removed for clarity):

SELECT *
FROM information_schema.innodb_buffer_pool_pages_index;

+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...
| schema_name | table_name       | index_name | space_id | page_no | n_recs | data_size | lru_position |...
+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...
| NULL        | SYS_IBUF_TABLE_0 | CLUST_IND  |        0 |       4 |      0 |         0 |            6 |...
| NULL        | SYS_INDEXES      | CLUST_IND  |        0 |      11 |      8 |       536 |            7 |...
| NULL        | SYS_TABLES       | CLUST_IND  |        0 |       8 |      5 |       351 |           12 |...
| NULL        | SYS_COLUMNS      | CLUST_IND  |        0 |      10 |     24 |      1436 |           14 |...
| NULL        | SYS_TABLES       | ID_IND     |        0 |       9 |      5 |       126 |           15 |...
| NULL        | SYS_FIELDS       | CLUST_IND  |        0 |      12 |      8 |       315 |           16 |...
| NULL        | SYS_FOREIGN      | ID_IND     |        0 |      46 |      0 |         0 |           50 |...
| NULL        | SYS_FOREIGN      | FOR_IND    |        0 |      47 |      0 |         0 |           51 |...
| NULL        | SYS_FOREIGN      | REF_IND    |        0 |      48 |      0 |         0 |           52 |...
| NULL        | SYS_FOREIGN_COLS | ID_IND     |        0 |      49 |      0 |         0 |           53 |...
| test        | bt               | PRIMARY    |       26 |       3 |      3 |        39 |           66 |...
| test        | bt               | PRIMARY    |       26 |       7 |      1 |      8124 |           69 |...
| test        | bt               | PRIMARY    |       26 |       9 |      2 |     16248 |           71 |...
| test        | bt               | PRIMARY    |       26 |       6 |      2 |     16248 |           74 |...
+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...

Besides the obvious value of this, notice that you can see InnoDB’s internal tables (with schema_name as NULL) as well!

innodb_buffer_pool_pages_blob

This table gives you detailed statistics on BLOB-storage pages, which store parts of BLOBs that overflow in-row storage (another post on that later, but suffice it to say, if the BLOB is larger than 8KB… sort of). For example:

SELECT *
FROM information_schema.innodb_buffer_pool_pages_blob;

+----------+---------+------------+----------+--------------+--------------+-----------+------------+
| space_id | page_no | compressed | part_len | next_page_no | lru_position | fix_count | flush_type |
+----------+---------+------------+----------+--------------+--------------+-----------+------------+
|       26 |       5 |          0 |     9232 |            0 |           72 |         0 |          0 |
|       26 |      10 |          0 |     9232 |            0 |           73 |         0 |          0 |
|       26 |       8 |          0 |     9232 |            0 |           75 |         0 |          0 |
|       26 |      11 |          0 |     9232 |            0 |           76 |         0 |          0 |
|       26 |       4 |          0 |     9232 |            0 |           77 |         0 |          0 |
+----------+---------+------------+----------+--------------+--------------+-----------+------------+

We’d love to get the table name and PRIMARY KEY value associated with each BLOB-storage page, but I’m not sure if that’s possible.

An example of the power of these tables

Have you ever wondered what your page fill rate is? Probably not, because you didn’t realize it was either interesting or important, but we worry about these things (and often justifiably so). In short, page fill rate is the percentage of each page that is filled with data. Since InnoDB only caches whole pages, if your average page fill rate is 75%, you would be wasting 25% of the space in each page. That would mean that your table is spread across more pages than necessary, which will waste both disk space and memory you’ve allocated to the buffer pool.

Here’s a simple way to figure out the average page fill rate of pages currently in cache:

SELECT
  schema_name,
  table_name,
  index_name,
  COUNT(*) AS n_pages,
  ROUND(AVG(n_recs), 2) AS recs_per_page,
  ROUND(SUM(data_size)/1048576, 2) AS total_data_size_M,
  ROUND(AVG(data_size/(
    SELECT variable_value
    FROM information_schema.global_status
    WHERE variable_name
    LIKE 'innodb_page_size'
  )) * 100, 2) AS page_fill_pct
FROM
  information_schema.innodb_buffer_pool_pages_index
WHERE 1
AND schema_name IS NOT NULL
GROUP BY
  schema_name,
  table_name,
  index_name
ORDER BY
  total_data_size_M DESC;

+-------------+------------+-----------------+---------+---------------+-------------------+---------------+
| schema_name | table_name | index_name      | n_pages | recs_per_page | total_data_size_M | page_fill_pct |
+-------------+------------+-----------------+---------+---------------+-------------------+---------------+
| test        | n          | GEN_CLUST_INDEX |    4079 |        252.55 |             58.56 |         91.88 |
| test        | bt         | PRIMARY         |       2 |          2.50 |              0.02 |         49.70 |
| test        | p          | PRIMARY         |       1 |          2.00 |              0.00 |          0.52 |
| test        | p          | c               |       1 |          2.00 |              0.00 |          0.37 |
+-------------+------------+-----------------+---------+---------------+-------------------+---------------+

This is showing some statistics for each index from each table in the cache: the number of pages in the cache, the average number of records per page for those cached pages, the total number of megabytes of cache used, and the page fill rate in percent. There is currently no other way to calculate this information from any statistics or data that InnoDB provides.

What else?

What ideas do you have? What do you think about these features? Leave a comment to let us know!

2 comments:

  1. Mark Callaghan
    July 30, 2008 @ 06:13 PDT

    This is great. I will be checking out the patch soon.

  2. Proven Scaling Blog » Making MySQL more usable: InnoDB save/restore buffer pool patch
    October 6, 2008 @ 17:38 PDT (Pingback)

    [...] the pages, it is perfectly safe1 and fairly efficient2. Note that you should probably also use the information_schema patch to be able to see that this patch is actually [...]

Write a comment: