These tests used sysbench with 128M rows in one table. I compared MySQL 5.6.10 (orig5610 below), MySQL 5.1.63 (orig5163 below) and MySQL 5.1.63 with the Facebook patch (fb5163 below). For all tests the sysbench clients ran on the same host as mysqld. The sysbench workload is to update one row by primary key per query. The database table is about 29GB on disk. The binlog and InnoDB doublewrite buffer were disabled during the test. Fsync on commit was also disabled (innodb_flush_log_at_trx_commit=2 was used). The test server has fast storage and can do 100 microsecond reads of 16kb pages for a single-threaded read-only workload. Write latency is worse when many random writes are done.
I looked at PMP stack traces during the test and I don't think the problem is InnoDB. My guess is that the problem is similar to what was measured on the read-only tests -- more code was added to 5.6 that makes query optimization or execution slower.
Tests were first run for an IO-bound configuration using a 4G InnoDB buffer. Results for orig5163 are probably worse because of the overhead from computing InnoDB checksums as described in the read-only post. Enabling the perf schema with default options costs about 10% of peak performance for MySQL 5.6. MySQL 5.1 is able to match 5.6 performance when the Facebook patch is used because of the innodb_fast_free_list option that improves the efficiency of finding clean pages from the LRU tail. In the results below iocap means innodb_io_capacity and ffl means innodb_fast_free_list. When not specified the server used 8192 for innodb_io_capacity and innodb_lru_scan_depth.
- 4160 QPS - fb5163, iocap=1024, ffl=1
- 4107 QPS - orig5610, performance_schema not compiled
- 4100 QPS - fb5163, iocap=1024, ffl=0
- 3909 QPS - orig5610, performance_schema=ON, default options
- 3596 QPS - fb5163
- 3352 QPS - orig5163
Tests were repeated using a 64GB InnoDB buffer pool. The table was cached by InnoDB prior to the test but the adaptive hash index was cold so most UPDATE statements spent some time updating the AHI. Unlike the IO-bound tests, 5.6 was slower than 5.1 and again the perf schema has a significant overhead -- 9.4%.
- 9231 QPS - orig5163
- 9089 QPS - fb5163
- 7810 QPS - orig5610, performance_schema not compiled
- 7365 QPS - orig5610, performance_schema=ON, default options