Archive for July, 2008

Understanding MySQL version numbers

We provide quite a few different MySQL releases on the Proven Scaling Mirror. Since MySQL 5.0 is split into “Community” and “Enterprise”, navigating all of those different releases can be very confusing, making it difficult to choose which release is right for you.

There are three main branches of MySQL being produced by Sun/MySQL: Community, Enterprise MRU, and Enterprise QSP. All Community and Enterprise releases are distributed under the terms of the GPLv2 license. The changes between each release are tracked in the MySQL manual. The difference between these branches are subtle but important.

MySQL Community is a branch of MySQL that is released less frequently, but also contains contributions from developers outside of Sun/MySQL. So far the only notable feature in the Community release is SHOW PROFILE, a query-profiling feature originally developed by Proven Scaling. Unless you want the SHOW PROFILE feature, there is currently no reason to use this release instead of a newer Enterprise release.

MySQL Enterprise MRU (commonly called just MySQL Enterprise) is a “Monthly Rapid Update” that is regularly updated with bug fixes and changes that were developed by MySQL during the previous month. According to MySQL’s guidance, MRU releases should only be used when a critical fix is needed that hasn’t yet been included in a QSP release. MRU releases don’t have any special naming scheme. At the time of this post the version number of the most recent release is 5.0.66a. You can find the source or binaries of this version on the Proven Scaling mirror.

MySQL Enterprise QSP is a “Quarterly Service Pack” which is released once per quarter. This release contains all the changes in the MRU releases since the previous QSP release. QSP releases can also contain critical bug fixes from MRU releases with higher version numbers than the QSP has. For instance, 5.0.56sp1 contains fixes from 5.0.58 — this can make it somewhat more difficult to understand whether or not you’re affected by a particular bug. Other critical fixes may only be in MRU releases until the next scheduled QSP release. You can find the source or binaries of this version on the Proven Scaling mirror.

The format of the release notes pages in the manual has changed between 5.0.56sp1 and 5.0.60sp1. In version 5.0.56sp1 the release notes page contained all the changes in the previous MRU releases since the last QSP. As of 5.0.60sp1 the release notes page1 contains only the changes back ported from the next MRU release. It can be confusing when comparing release notes between versions. Be sure to read the paragraph at the top of the release notes page to see what is actually contained within.

1 Note that 5.0.60sp1 has no changes from 5.0.60. It was merely recompiled with a new version number and released as 5.0.60sp1.

Making MySQL more usable: InnoDB information_schema patches

Lately I’ve been working on a series of patches to increase the usability of InnoDB, based on the InnoDB plugin for MySQL 5.1. I have a lot of ideas still, but I’ve currently gotten two proof of concept patches working. Both of the patches require a small patch to relocate the buf_chunk_struct struct definition which was inadvertently defined in buf0buf.c rather than buf0buf.h.

Most patches are considered rough draft, proof of concept quality at best. They are suitable for testing, playing around, using them to gain insight about your data, but not for production use. Use at your own risk!

Using information_schema to view the buffer pool contents

This patch creates several new information_schema tables to provide visibility into the contents of the InnoDB buffer pool. The new tables are:

  • innodb_buffer_pool_pages
  • innodb_buffer_pool_pages_index
  • innodb_buffer_pool_pages_blob

Note that accessing these tables is not cheap, as the buffer pool must be scanned.

innodb_buffer_pool_pages

This table provides some generic information about all pages present in the buffer pool, and is primarily useful for getting statistics about the number and types of pages present at any given point in time. For example:

SELECT
  page_type,
  count(*) AS nr
FROM information_schema.innodb_buffer_pool_pages
GROUP BY page_type;

+-----------+------+
| page_type | nr   |
+-----------+------+
| allocated | 4051 |
| bitmap    |    2 |
| blob      |    5 |
| fsp_hdr   |    2 |
| index     |   14 |
| inode     |    2 |
| sys       |    3 |
| trx_sys   |    1 |
| undo_log  |   15 |
+-----------+------+

innodb_buffer_pool_pages_index

This table gives you detailed statistics on index pages, which for InnoDB means all data (as part of the primary key) and all defined indexes. For example (some columns removed for clarity):

SELECT *
FROM information_schema.innodb_buffer_pool_pages_index;

+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...
| schema_name | table_name       | index_name | space_id | page_no | n_recs | data_size | lru_position |...
+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...
| NULL        | SYS_IBUF_TABLE_0 | CLUST_IND  |        0 |       4 |      0 |         0 |            6 |...
| NULL        | SYS_INDEXES      | CLUST_IND  |        0 |      11 |      8 |       536 |            7 |...
| NULL        | SYS_TABLES       | CLUST_IND  |        0 |       8 |      5 |       351 |           12 |...
| NULL        | SYS_COLUMNS      | CLUST_IND  |        0 |      10 |     24 |      1436 |           14 |...
| NULL        | SYS_TABLES       | ID_IND     |        0 |       9 |      5 |       126 |           15 |...
| NULL        | SYS_FIELDS       | CLUST_IND  |        0 |      12 |      8 |       315 |           16 |...
| NULL        | SYS_FOREIGN      | ID_IND     |        0 |      46 |      0 |         0 |           50 |...
| NULL        | SYS_FOREIGN      | FOR_IND    |        0 |      47 |      0 |         0 |           51 |...
| NULL        | SYS_FOREIGN      | REF_IND    |        0 |      48 |      0 |         0 |           52 |...
| NULL        | SYS_FOREIGN_COLS | ID_IND     |        0 |      49 |      0 |         0 |           53 |...
| test        | bt               | PRIMARY    |       26 |       3 |      3 |        39 |           66 |...
| test        | bt               | PRIMARY    |       26 |       7 |      1 |      8124 |           69 |...
| test        | bt               | PRIMARY    |       26 |       9 |      2 |     16248 |           71 |...
| test        | bt               | PRIMARY    |       26 |       6 |      2 |     16248 |           74 |...
+-------------+------------------+------------+----------+---------+--------+-----------+--------------+...

Besides the obvious value of this, notice that you can see InnoDB’s internal tables (with schema_name as NULL) as well!

innodb_buffer_pool_pages_blob

This table gives you detailed statistics on BLOB-storage pages, which store parts of BLOBs that overflow in-row storage (another post on that later, but suffice it to say, if the BLOB is larger than 8KB… sort of). For example:

SELECT *
FROM information_schema.innodb_buffer_pool_pages_blob;

+----------+---------+------------+----------+--------------+--------------+-----------+------------+
| space_id | page_no | compressed | part_len | next_page_no | lru_position | fix_count | flush_type |
+----------+---------+------------+----------+--------------+--------------+-----------+------------+
|       26 |       5 |          0 |     9232 |            0 |           72 |         0 |          0 |
|       26 |      10 |          0 |     9232 |            0 |           73 |         0 |          0 |
|       26 |       8 |          0 |     9232 |            0 |           75 |         0 |          0 |
|       26 |      11 |          0 |     9232 |            0 |           76 |         0 |          0 |
|       26 |       4 |          0 |     9232 |            0 |           77 |         0 |          0 |
+----------+---------+------------+----------+--------------+--------------+-----------+------------+

We’d love to get the table name and PRIMARY KEY value associated with each BLOB-storage page, but I’m not sure if that’s possible.

An example of the power of these tables

Have you ever wondered what your page fill rate is? Probably not, because you didn’t realize it was either interesting or important, but we worry about these things (and often justifiably so). In short, page fill rate is the percentage of each page that is filled with data. Since InnoDB only caches whole pages, if your average page fill rate is 75%, you would be wasting 25% of the space in each page. That would mean that your table is spread across more pages than necessary, which will waste both disk space and memory you’ve allocated to the buffer pool.

Here’s a simple way to figure out the average page fill rate of pages currently in cache:

SELECT
  schema_name,
  table_name,
  index_name,
  COUNT(*) AS n_pages,
  ROUND(AVG(n_recs), 2) AS recs_per_page,
  ROUND(SUM(data_size)/1048576, 2) AS total_data_size_M,
  ROUND(AVG(data_size/(
    SELECT variable_value
    FROM information_schema.global_status
    WHERE variable_name
    LIKE 'innodb_page_size'
  )) * 100, 2) AS page_fill_pct
FROM
  information_schema.innodb_buffer_pool_pages_index
WHERE 1
AND schema_name IS NOT NULL
GROUP BY
  schema_name,
  table_name,
  index_name
ORDER BY
  total_data_size_M DESC;

+-------------+------------+-----------------+---------+---------------+-------------------+---------------+
| schema_name | table_name | index_name      | n_pages | recs_per_page | total_data_size_M | page_fill_pct |
+-------------+------------+-----------------+---------+---------------+-------------------+---------------+
| test        | n          | GEN_CLUST_INDEX |    4079 |        252.55 |             58.56 |         91.88 |
| test        | bt         | PRIMARY         |       2 |          2.50 |              0.02 |         49.70 |
| test        | p          | PRIMARY         |       1 |          2.00 |              0.00 |          0.52 |
| test        | p          | c               |       1 |          2.00 |              0.00 |          0.37 |
+-------------+------------+-----------------+---------+---------------+-------------------+---------------+

This is showing some statistics for each index from each table in the cache: the number of pages in the cache, the average number of records per page for those cached pages, the total number of megabytes of cache used, and the page fill rate in percent. There is currently no other way to calculate this information from any statistics or data that InnoDB provides.

What else?

What ideas do you have? What do you think about these features? Leave a comment to let us know!

Announcing the official Proven Scaling Blog

Welcome to Proven Scaling’s new official home on the web. We’ve been meaning to set up a company blog for quite a while, but since we already have personal blogs, it kept getting pushed to the back burner. Finally, with much ado, it’s up!

A little about Proven Scaling and our history

Proven Scaling was founded by Jeremy Cole and Eric Bergen in May, 2006 in order to provide our skills in MySQL architecture, performance, reliability, and availability to a larger audience. Currently Proven Scaling consists of:

  • Jeremy Cole — Jeremy left Yahoo! after 1.5 years, a few months ahead of Eric, to kickstart Proven Scaling in July, 2006. Previous to Yahoo!, Jeremy worked for MySQL AB for 4.5 years as tech writer, trainer, consultant, and occasional developer. Jeremy has a personal blog at jcole.us.
  • Eric Bergen — Eric left his 1.5 years at Yahoo! to join Jeremy in September, 2006. Previous to Yahoo!, Eric was responsible for MySQL and related systems at a major domain registrar. Eric has a personal blog at ebergen.net
  • Mike Griffiths — Based in London, UK, Mike joined Proven Scaling in November, 2007. Before joining Proven Scaling, Mike worked at Yahoo! for 7 years, responsible for MySQL in Europe, Asia and beyond. Mike has so far steadfastly refused to join the blogosphere, arguing that no-one’s really interested in his thoughts on sport, DIY, perl hacking, warm English beer and the other things which take up his leisure time.

A lot more than consulting…

At Proven Scaling we spend a lot time working on various software projects. Jeremy and Eric have been publishing their own patches to MySQL and other projects for many years. One of the primary reasons for forming a company together in the structure that we have is to ensure that we maintain the hardware resources, ability, and time available to work on these projects and provide them to the public in as quick and efficient manner as possible. In addition to continuing our previous work, Proven Scaling has been working on several new independent projects to solve problems in the MySQL space.

We needed to place to distribute that software along with other widely and not-so-widely available software packages, so we host a software mirror at mirror.provenscaling.com. This site hosts hard to find copies of various software packages we use every day, in addition to RedHat-friendly (and CentOS friendly) yum repositories for many packages. In fact, one of the reasons we are starting this blog is so that we have an official place to keep you updated as to the contents of our mirror site. As we update the mirror with new releases of existing software and entirely new software, we will post entries describing the updates.

What will we blog about?

You can get some idea of the topics you might see here from Jeremy and Eric’s MySQL posts on their personal blogs: MySQL on jcole.us and MySQL on ebergen.net.

In addition to the updates described above, some of the topics we are especially interested in writing about are:

  • Announcements of new Proven Scaling software and new releases of our software
  • New patches or ideas for patches to MySQL and other software
  • Deep dissections of various parts of MySQL and InnoDB
  • Knowledge gained from working with our customers, such as advice for real-world scenarios or unexpected situations
  • Many more exciting ideas we’re working on, which we will share when they are ready!

Sounds good?

Go ahead and subscribe to our RSS feed, or bookmark this page. Keep an eye on the author names above, so that you know who’s writing.