- Our database files are about 15% smaller thanks to the compact row format new with InnoDB in 5.0.
- The data in our 5.0 databases was exactly the same as the data in our 4.0 databases, with two exceptions. I know because we checked many times. This is a big deal. The exceptions are one application we were unwilling to change to fix differences in rounding arithmetic expressions and the least significant bits of the values of Double columns.
- The parser in MySQL 5 is much faster than it used to be. We measured a big performance regression in our database reload benchmark. MySQL worked with us to fix this. Had this been a closed-source RDBMS, the collaboration would not have been possible.
- InnoDB has begun to fix the concurrency bottlenecks in 5.0 and we expect much better performance on large SMPs in the near future.
- Primaries are much more crash proof with the use of XA internally to synchronize commit between a storage engine and the binlog.
- I haven't noticed a big performance difference between 4.0 and 5.0. The codepath in 5.0 is longer as there are many more features there, and this makes some things a bit slower. But 5.0 has new features that make some queries faster.
- We had to frequently convert a database from 4.0 to 5.0 and we did this by dumping the tables with mysqldump from 4.0 and reloading into 5.0. We ended up writing Python utilities to do this with configurable levels of parallelism. The tool used a SQLite database to checkpoint the progress of the dump and restore. Maatkit provides similar features.
- Primarily because of the use of Decimal rather than Double for some arithmetic operations and numeric literals and a change in blank trimming, there were differences in column values on 5.0 slaves compared to 4.0 masters. We built a few tools to identify all differences between tables. Computing table checksums is only part of the problem. Once you know that tables are different, you need to determine where and why. In most of these cases, the SQL statements contained in the binlog were used to identify the applications that needed to be updated. I am not sure how we would have done this were row based replication used. Again, Maatkit provides similar functionality, but I am not sure if it identifies the rows that differ.
- Some queries that worked in 4.0 did not parse in 5.0 because the SQL syntax for 5.0 was improved to be more standards compliant. This is a great thing to do, but it required some work for us to identify and fix the broken queries.
- You cannot replicate from a 5.0 master to a 4.0 slave. If you are paranoid during an upgrade, you want to preserve a downgrade path. We built a tool for this.
- Condition pushdown, new in 5.0, is a big deal for us. We hope for more enhancements to the table handler interface. (hint -- aggregation pushdown)
Saturday, February 2, 2008
We (many people) just upgraded a MySQL deployment from 4.0.26 to 5.0.37. It took a long time to get it ready, but everything works and there was minimal impact on the database users. I will share some of the fun here.