Saturday, February 2, 2008

Upgrading from MySQL 4.0 to MySQL 5.0

We (many people) just upgraded a MySQL deployment from 4.0.26 to 5.0.37. It took a long time to get it ready, but everything works and there was minimal impact on the database users. I will share some of the fun here.
  • Our database files are about 15% smaller thanks to the compact row format new with InnoDB in 5.0.
  • The data in our 5.0 databases was exactly the same as the data in our 4.0 databases, with two exceptions. I know because we checked many times. This is a big deal. The exceptions are one application we were unwilling to change to fix differences in rounding arithmetic expressions and the least significant bits of the values of Double columns.
  • The parser in MySQL 5 is much faster than it used to be. We measured a big performance regression in our database reload benchmark. MySQL worked with us to fix this. Had this been a closed-source RDBMS, the collaboration would not have been possible.
  • InnoDB has begun to fix the concurrency bottlenecks in 5.0 and we expect much better performance on large SMPs in the near future.
  • Primaries are much more crash proof with the use of XA internally to synchronize commit between a storage engine and the binlog.
  • I haven't noticed a big performance difference between 4.0 and 5.0. The codepath in 5.0 is longer as there are many more features there, and this makes some things a bit slower. But 5.0 has new features that make some queries faster.
  • We had to frequently convert a database from 4.0 to 5.0 and we did this by dumping the tables with mysqldump from 4.0 and reloading into 5.0. We ended up writing Python utilities to do this with configurable levels of parallelism. The tool used a SQLite database to checkpoint the progress of the dump and restore. Maatkit provides similar features.
  • Primarily because of the use of Decimal rather than Double for some arithmetic operations and numeric literals and a change in blank trimming, there were differences in column values on 5.0 slaves compared to 4.0 masters. We built a few tools to identify all differences between tables. Computing table checksums is only part of the problem. Once you know that tables are different, you need to determine where and why. In most of these cases, the SQL statements contained in the binlog were used to identify the applications that needed to be updated. I am not sure how we would have done this were row based replication used. Again, Maatkit provides similar functionality, but I am not sure if it identifies the rows that differ.
  • Some queries that worked in 4.0 did not parse in 5.0 because the SQL syntax for 5.0 was improved to be more standards compliant. This is a great thing to do, but it required some work for us to identify and fix the broken queries.
  • You cannot replicate from a 5.0 master to a 4.0 slave. If you are paranoid during an upgrade, you want to preserve a downgrade path. We built a tool for this.
  • Condition pushdown, new in 5.0, is a big deal for us. We hope for more enhancements to the table handler interface. (hint -- aggregation pushdown)

5 comments:

  1. Why didn't you go through 4.1 as the manual strongly recommends?

    ReplyDelete
  2. Because I don't want to double the cost of the upgrade (in time and effort). It takes a long time to qualify a new binary, determine whether it can be trusted, find all of the new bugs and fix them or work around them. What problems do you think that would have resolved?

    ReplyDelete
  3. I have dozen of clients with me on server using 4.0 if if I upgade it to 5.0 what problem do I face.
    Do i have to unstall 4.0 completely and then install 5.0.?
    Do my existing databaces supported..?
    What steps should be taken to minimized the down time..?
    Any other issue I face let me know so I will prepare my self for that..
    ........Thanks waiting for reply...........

    ReplyDelete
  4. @anon - The problems I faced were described above. Read the upgrade notes, they describe other problems. Dump and reload the database. You may have to correct some SQL statements to work with the 5.0 parser.

    ReplyDelete
  5. I know this is old post but I have question:
    Hi Mark, You said

    "You cannot replicate from a 5.0 master to a 4.0 slave."

    But do I can replication from 4.0 master to a 5.0 slave?

    I have read http://dev.mysql.com/doc/refman/4.1/en/replication-compatibility.html

    and it's only said "MySQL supports replication from one major version to the next higher major version. you can replicate from a master running MySQL 4.0 to a slave running MySQL 4.1, from a master running MySQL 4.1 to a slave running MySQL 5.0,"

    and not said "you can replicate from a master running MySQL 4.0 to a slave running MySQL 5.0 or 5.1 or 5.5 or 5.6"

    I need a confirmation that replication can be done from slave that has 2+ version higher, not just 1 next version above,

    thanks before
    Amin Yuliastanto

    ReplyDelete

 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.