The MySQL team has been very productive recently with many bug fixes, many features implemented, a great 5.5 release and what appears to be an even better 5.6 release. The changes in 5.6 are a really big deal. I can't wait to stop porting rpl_transaction_enabled to get crash-proof slave state.
The value-added community continues to push the state of the art for those who can't wait for the great new features to be GA or for those with special problems. I am interested in trying parallel replication apply from Tungsten, working with Monty Program to improve monitoring in MariaDB and working with Percona to improve InnoDB quality-of-service for high-throughput OLTP.
I have begun to catch up on my reading to figure out what has changed for 5.6. I probably need another week to finish reading the many useful blogs and presentations. DimitriK published a 5-part performance report that I have yet to start (1, 2, 3, 4, 5). I am happy to find that a few more MySQL developers at Oracle have begun blogging (Oystein, Didrik, Luis, Olav). How do I type the accented "O" in Oystein?
First the InnoDB changes:
- Information schema system tables - InnoDB has moved a lot of data dictionary data to IS tables. The ones listed here are not the most interesting to me, but the migration in general makes life easier for me. I don't know if these are new in 5.6, but they are more interesting to me than the ones described in the InnoDB blog.
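Assuming the table names from the labs builds carry over to GA (the names below are my assumption, not confirmed from the docs), browsing the data dictionary might look like:

```sql
-- Sketch; table names assumed from the labs builds
SELECT name, n_cols FROM information_schema.INNODB_SYS_TABLES;
SELECT name, table_id FROM information_schema.INNODB_SYS_INDEXES LIMIT 10;
```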
- No more kernel_mutex - Wow! This is a huge deal for multi-core performance and will enable even more improvements in future releases. As part of the admission control feature we have been trying to reduce kernel_mutex contention and noticed that 5.5 was already much better for that.
- Persistent index cardinality statistics - in MySQL 5.1, all index cardinality stats are recomputed when a server is restarted, and without the Facebook patch this is serialized by LOCK_open. That can create too many stalls on restart. I assume this feature allows a DBA to populate the stats table manually, which can be very useful for a deployment with many scale-out slaves to prevent query plans from changing between slaves. Other details are at Oystein's blog.
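If the stats land in regular tables as the worklog suggests, priming every slave with the master's stats might be as simple as copying rows. The table names below are my guess from the worklog, not confirmed:

```sql
-- Sketch, assuming stats live in mysql.innodb_index_stats:
-- dump the master's cardinality stats, load them on each slave
-- so query plans do not vary across the scale-out tier.
SELECT * FROM mysql.innodb_index_stats WHERE database_name = 'mydb';
```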
- Multi-threaded purge - I want this right now. This and parallel replication apply make it possible to deploy large databases on slaves using disk setups that match what is available on a master. Otherwise you must read Yoshinori's slides to find the workaround.
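If this is configured like other InnoDB background threads, the my.cnf change might be a single option. The variable name is my assumption from the announcement:

```ini
# Sketch; variable name assumed, not confirmed from the docs
[mysqld]
innodb_purge_threads = 4
```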
- Data dictionary LRU - helps if you have a lot of (or too many) tables.
- Page cleaner thread - This is a good step but I wonder if it is enough, and I need to read code or wait for a Percona performance report. Has anything been done to prevent stalls when the async flush tries to flush too many pages in one call? The problem is that all dirty pages from an extent are flushed when at least one page in the extent has a too-old LSN. Because an extent holds 64 pages, flushing neighbor pages can increase the number of pages flushed by up to 64X. I have seen benchmark servers stall for 60+ seconds while 200,000+ pages were flushed when the async limit was reached. This can be very painful on servers that can cache the entire database and accumulate many dirty pages.
- memcached API for InnoDB - this should enable HandlerSocket-like performance while supporting an API that already has clients for most platforms (PHP, Java, Python, ...). I think that non-SQL interfaces to InnoDB will be a big deal.
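Based on the labs posts, enabling it may be a plugin install; the plugin and library names here are my assumption:

```sql
-- Sketch; plugin name assumed from the labs release.
-- After this, memcached clients talk to the server on the
-- standard memcached port and reads/writes go to InnoDB.
INSTALL PLUGIN daemon_memcached SONAME 'libmemcached.so';
```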
- metrics table for InnoDB counters - I spent too much time adding counters to SHOW STATUS for InnoDB. Now you don't have to.
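Assuming the table is exposed via the information schema under the name used in the announcement (my assumption), reading the counters would be one query:

```sql
-- Sketch; table and column names assumed
SELECT name, subsystem, count
FROM information_schema.INNODB_METRICS
WHERE status = 'enabled';
```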
The InnoDB team has done a great job of making it easy to understand the changes. Getting docs for the replication changes required a bit more searching as they were not referenced from the main Oracle announcement. I added links to worklog entries and blogs by Mats and Luis. I am very interested in many of the replication changes that match and exceed what was in the Google patch.
- parallel replication apply - I am confused. The labs page and worklog state that this is for RBR, but I want parallel apply for SBR to overcome replication lag on IO-bound slaves. Parallel apply provides parallel IO requests. Per the blog by Luis, I think this supports SBR.
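If the labs build exposes this the way the worklog hints, turning it on might look like the following. The variable name is my assumption:

```sql
-- Sketch; variable name assumed from the labs build
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_workers = 8;
START SLAVE SQL_THREAD;
```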
- system tables for slave state - I don't know if WL2775 describes what was implemented. Feedback from gmaxia makes me wish for more docs. Hopefully I can stop porting rpl_transaction_enabled and begin to use this instead. I need to read the blog by Mats.
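If the final implementation follows the worklog, slave state would move from master.info and relay-log.info files into tables via my.cnf options. The option names below are my guess:

```ini
# Sketch; option names assumed from the worklog, not confirmed
[mysqld]
master_info_repository    = TABLE
relay_log_info_repository = TABLE
```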
- replication checksums - I want binlog event checksums. I have not needed them for a long time since a certain bug was fixed but I would rather not worry about that problem again. Mats also wrote about this.
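Assuming checksums can be enabled end to end with dynamic variables (the names below are my assumption from the worklog and Mats' post), the setup might be:

```sql
-- Sketch; variable names assumed
SET GLOBAL binlog_checksum = CRC32;         -- master writes a checksum per event
SET GLOBAL master_verify_checksum = ON;     -- master verifies when reading the binlog
SET GLOBAL slave_sql_verify_checksum = ON;  -- slave SQL thread verifies relay log events
```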
- informational log events - I need the original SQL including its query comment in the binlog if I am to use RBR. With this feature I have one less excuse for not trying RBR.
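If this is controlled by a session/global variable as the worklog suggests (the name below is my assumption), getting the original statement logged next to the row events would be:

```sql
-- Sketch; variable name assumed. With this on, the original SQL
-- (including the query comment) is written to the binlog alongside
-- the row images, so RBR binlogs stay debuggable.
SET GLOBAL binlog_rows_query_log_events = ON;
```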
- remote binlog backup - We already have this in the 5.1 Facebook patch thanks to a backport by Harrison. It lets you archive the binlog almost as soon as it is written.
- universal group identifiers - This might be the equivalent of global transaction IDs from the Google patch.
- optimized RBR logging - This will be a big deal for tables with BLOB columns that get frequent updates to non-BLOB columns. I have a few of those.
- Time delayed replication - I don't need this but many others will.
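Per the announcement this should be configured per-slave; assuming the CHANGE MASTER syntax from the worklog (my assumption), a one-hour delay would be:

```sql
-- Sketch; clause name assumed. The slave applies events
-- no sooner than 3600 seconds after the master executed them.
STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 3600;
START SLAVE;
```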
There also appear to be many useful changes in the Performance Schema. Alas, it isn't easy to figure out what has changed from reading the big announcement. I haven't been a fan of the P_S because my primary need is for aggregated workload stats (per-table, per-index, per-user), and the user_stats patch from Facebook, Percona, MariaDB and Google has provided that for many years in a form that requires no setup, has very low overhead (it is always enabled in my benchmarks) and will always be enabled in production. It also is very easy to use -- select * from easy_to_remember_table_name. I hope the P_S can also provide that. Mark and Marc have begun to write more about this. I hope that continues.
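For comparison, this is what "easy to use" means with the user_stats patch; the exact table names below are from the builds I have used and may differ between forks:

```sql
-- user_stats patch (Google/Percona/Facebook/MariaDB builds); no setup needed
SELECT * FROM information_schema.TABLE_STATISTICS;
SELECT * FROM information_schema.INDEX_STATISTICS;
SELECT * FROM information_schema.USER_STATISTICS;
```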