As part of upgrading a few database servers from MySQL 4.0 to MySQL 5.0, I ran a benchmark that measures the time to reload all tables of a database. The number of sessions used to perform the reload was configurable, and the performance difference between MySQL 4 and 5 was the greatest when 1 session (no concurrency) was used.
This was quite a shock. I first assumed the problem was caused by InnoDB and the new support for row_format=compact. I even removed support for this feature and reran the tests. That made MySQL5 faster, but not much faster.
I then began using oprofile which is amazing for performance debugging and found that MySQLParse was the top consumer of CPU time. Unfortunately, this is one large function a huge switch statement generated by Bison. Fortunately, I was able to use the rdtsc instruction to isolate the problem.
The grammar for MySQL5 has changed significantly from MySQL4 in a few places. There is good reason for this change as the SQL supported by MySQL5 is much closer to the SQL standard than it used to be. One unexpected effect of this change is that much more CPU time is used to parse expressions (see http://bugs.mysql.com/bug.php?id=29921).
I found a colleague who gets interrupted at work less frequently than I do and asked him to modify the grammar and get back the performance. I think we have something that works. The worst case performance difference for MySQL5 has been cut in half for MyISAM and almost in half for InnoDB.
The bug report has many more numbers. The number of seconds to reload a database with MyISAM tables are listed below. MySQL5 is 58% slower without the fix and 29% slower with it.
MySQL 4.0.26 - 10061 seconds
MySQL 5.0.37 (original) - 15808 seconds
MySQL 5.0.37 (fixed) - 12995 seconds
View comments