Online DDL for InnoDB is another great feature new in MySQL 5.6. I don't think it is ready to replace online schema change frameworks (from Facebook, Percona Toolkit and Openark) for workloads that require minimal downtime for busy tables - but work can be done on InnoDB to change my opinion.
InnoDB supports online DDL by logging changes to the altered table when the alter is in progress. The maximum size of the log is determined by the my.cnf variable innodb_online_alter_log_max_size. The error DB_ONLINE_LOG_TOO_BIG is raised if the log gets full.
Online DDL for InnoDB blocks all concurrent inserts/updates/deletes while the log is being applied during online add index. I don't think this is documented and it should be. This can block concurrent changes for too long. I hope the code can be changed to apply most of the logged changes before getting exclusive access. In Facebook OSC most logged changes were applied in the replay step so that the cutover step, which gets exclusive access, would be fast.
I confirmed the current code blocks while applying the log by adding a call to sleep in the function row_log_apply just after the call to row_lock_x_lock(dict_index_get_lock(index)).
I created a feature request for this.
Note - I was wrong about the duration of locking. Looks like this is very online and likely to replace many uses of the OSC frameworks. See the comments below.
Tuesday, February 26, 2013
Sunday, February 24, 2013
MySQL 5.6: IO-bound, update-only workloads
The third performance test I am doing compares MySQL 5.6 (5.6.10) with MySQL 5.1 for an update-only workload with an IO-bound database. The configuration is similar to what I used for the IO-bound & read-only tests. The performance summary is that for my test servers:
In MySQL 5.1 and 5.5 the main background IO thread (srv_master_thread) supported furious flushing when needed. That thread had a loop from which background IO would be scheduled (write back dirty pages, do reads for insert buffer merges) and there was a one second sleep at the start of the loop but the sleep would be skipped when the previous loop iteration flushed many dirty pages. I use furious flushing to describe the InnoDB behavior when the sleep is frequently skipped. Each iteration of the loop would do about innodb_io_capacity disk requests (note that innodb_io_capacity limit was fuzzy, it might try to do twice the rate, but probably not 10X the rate). Because sleep could be skipped the innodb_io_capacity limit didn't really set the IOPs rate for background IO (despite what the docs state) but this was usually a good thing on servers that can do a lot of IOPs. The alternative would be to make InnoDB respect the limit and then set innodb_io_capacity to a large value and I think that is a bad idea without support for real AIO from InnoDB -- which is now in MySQL 5.6.
Several things have changed in MySQL 5.6.10:
8 16 32 64 128 256 concurrent clients
15134 19623 18521 14804 9730 5898 fb5163
10802 12980 13140 11337 11822 6284 orig5163
17649 22993 17281 15066 14899 14907 orig5610+hack
8317 8054 9379 11091 12684 14488 orig5610+bp8
4695 5393 6436 7378 8632 8951 orig5610+bp1
7402 7890 7058 7379 7893 8011 orig5610+bp1+lru4k
18505 25492 10042 11387 12858 14388 orig5610+bp8+lru4k
12965 12101 5134 4637 5078 5202 orig5610+bp1+lru8k
18579 24612 10566 11385 12571 13700 orig5610+bp8+lru8k
- MySQL 5.6 is always slower <= 64 threads
- MySQL 5.6 is a bit faster at >= 128 threads with 1 buffer pool instance.
- MySQL 5.6 is a lot faster at >= 128 threads with 8 buffer pool instances.
In MySQL 5.1 and 5.5 the main background IO thread (srv_master_thread) supported furious flushing when needed. That thread had a loop from which background IO would be scheduled (write back dirty pages, do reads for insert buffer merges) and there was a one second sleep at the start of the loop but the sleep would be skipped when the previous loop iteration flushed many dirty pages. I use furious flushing to describe the InnoDB behavior when the sleep is frequently skipped. Each iteration of the loop would do about innodb_io_capacity disk requests (note that innodb_io_capacity limit was fuzzy, it might try to do twice the rate, but probably not 10X the rate). Because sleep could be skipped the innodb_io_capacity limit didn't really set the IOPs rate for background IO (despite what the docs state) but this was usually a good thing on servers that can do a lot of IOPs. The alternative would be to make InnoDB respect the limit and then set innodb_io_capacity to a large value and I think that is a bad idea without support for real AIO from InnoDB -- which is now in MySQL 5.6.
Several things have changed in MySQL 5.6.10:
- flushing of dirty pages from the tail of the LRU used to be done by foreground threads (threads that handle query processing) via buf_flush_free_margin and a thread would attempt to flush many dirty pages at a time. Note that clean pages at the end of the LRU can quickly be moved to the free list but dirty pages must first be flushed. In MySQL 5.6 foreground threads will try to move one page at a time via buf_flush_single_page_from_LRU and hope that the page cleaner thread does the rest of the work.
- the page cleaner thread (buf_flush_page_cleaner_thread) doesn't do furious flushing. It sleeps so that it won't run more than once per second. In theory this means that the documented behavior for innodb_io_capacity is more likely to be correct. Sleep is done if any of the following are true 1) the server is not idle 2) there are pending background reads 3) pages were not flushed on the previous loop iteration. Note that the first condition is always true on a busy server, so sleep is not skipped on a busy server.
- when the page cleaner flushes dirty pages from the end of the LRU it does not use innodb_io_capacity to determine how much work to do. Follow the call chain from buf_flush_LRU_tail to buf_flush_LRU. It looks like InnoDB will do up to ~1000 page writes per buffer pool instance. So the trick to getting a higher rate of page flushes from the LRU is to use more buffer pool instances. But that is not the real solution.
- fb5163 - MySQL 5.1.63 + the Facebook patch, iocap=1000, itc=0
- orig5163 - MySQL 5.1.63, iocap=1000, itc=0
- orig5610+hack - MySQL 5.6.10, iocap=1000, bpi=8 and a hack to get the page cleaner thread to do furious flushing when needed.
- orig5610+bp8 - MySQL 5.6.10, iocap=1000, bpi=8, itc=0
- orig5610+bp1 - MySQL 5.6.10, iocap=1000, bpi=1, itc=0
- orig5610+bp1+lru4k - MySQL 5.6.10, bpi=1, itc=0, iocap=lru=4k
- orig5610+bp8+lru4k - MySQL 5.6.10, bpi=8, itc=0, iocap=lru=4k
- orig5610+bp1+lru8k - MySQL 5.6.10, bpi=1, itc=0, iocap=lru=8k
- orig5610+bp8+lru8k - MySQL 5.6.10, bpi=8, itc=0, iocap=lru=8k
8 16 32 64 128 256 concurrent clients
15134 19623 18521 14804 9730 5898 fb5163
10802 12980 13140 11337 11822 6284 orig5163
17649 22993 17281 15066 14899 14907 orig5610+hack
8317 8054 9379 11091 12684 14488 orig5610+bp8
4695 5393 6436 7378 8632 8951 orig5610+bp1
7402 7890 7058 7379 7893 8011 orig5610+bp1+lru4k
18505 25492 10042 11387 12858 14388 orig5610+bp8+lru4k
12965 12101 5134 4637 5078 5202 orig5610+bp1+lru8k
18579 24612 10566 11385 12571 13700 orig5610+bp8+lru8k
There are a few obvious problems. QPS falls quickly with concurrency for MySQL 5.1.63 because of mutex contention. QPS is much worse for MySQL 5.6 because of stalls on LRU flushing but using more buffer pool instances helps for the reason described above. From PMP I see that foreground threads are all stuck in buf_flush_single_page_from_LRU. Using a larger value for innodb_io_capacity does not help.
I repeated tests using innodb_thread_concurrency=32. It fixes the problems that occur at high concurrency for MySQL 5.1.
updates/second
8 16 32 64 128 256 concurrent clients
15092 19577 18506 17678 17101 16541 fb5163
10669 12929 12994 12745 12617 12254 orig5163
4697 5385 6475 6301 6208 6083 orig5610+bp1
18650 8094 9696 9660 9548 9548 orig5610+bp8+lru2k
16917 22861 17425 17164 17201 17006 orig5610+bp1+hack
16971 24206 9484 9379 9286 9258 orig5610+bp8+hack
I have spent a lot of time working on the LRU flushing code for MySQL 5.1. That includes the innodb_fast_free_list option which allows MySQL 5.1 + the Facebook patch to almost match MySQL 5.6 on IO-bound & read-only workloads. Pages were moved from the LRU to the free list on demand in MySQL 5.1 & 5.5 when foreground threads needed a free page for a disk read and the free list was empty. Unfortunately the code in buf_flush_free_margin to do that work wasn't efficient. MySQL 5.6 might be more efficient given that most of the work will now be done by the page cleaner, a background thread. However this adds the risk that it won't keep up with demand. For example it is possible today for the page cleaner thread to be sleeping when the free list is empty and there are many dirty pages at the end of the LRU. That is not a good state for a high-perf server.
With the Facebook patch, using a large amount of memory with xtrabackup required the innodb_fast_free_list option or recovery was much too slow. I wonder if a similar problem exists in MySQL 5.6.
I repeated tests using innodb_thread_concurrency=32. It fixes the problems that occur at high concurrency for MySQL 5.1.
updates/second
8 16 32 64 128 256 concurrent clients
15092 19577 18506 17678 17101 16541 fb5163
10669 12929 12994 12745 12617 12254 orig5163
4697 5385 6475 6301 6208 6083 orig5610+bp1
18650 8094 9696 9660 9548 9548 orig5610+bp8+lru2k
16917 22861 17425 17164 17201 17006 orig5610+bp1+hack
16971 24206 9484 9379 9286 9258 orig5610+bp8+hack
With the Facebook patch, using a large amount of memory with xtrabackup required the innodb_fast_free_list option or recovery was much too slow. I wonder if a similar problem exists in MySQL 5.6.
Thursday, February 21, 2013
MySQL 5.6 is much faster on IO-bound, read-only workloads
A lot of work was done to make InnoDB faster for MySQL 5.6. The results are especially obvious for IO-bound read-only workloads. I have many performance tests to run and started with the easy ones. Yesterday I published results for a cached read-only workload and found a few problems that can and should be fixed (bugs 66473 & 68413). Even on this workload, there is a significant performance overhead from enabling the PS.
The improvement on this workload from MySQL 5.6 is amazing. The benefit from the innodb_fast_free_list option in the FB patch is also nice but I don't know yet whether that change is needed in 5.6. I have a lot of work to do to publish changes from the FB patch to make the launchpad branch current.
The results below are QPS from sysbench where each query fetches 1 row by primary key. The database has 8 tables with 16M rows/table and the total database size is ~32GB. The innodb_buffer_pool_size is 4GB. The test server has flash storage devices. The test used 8 sysbench processes running on one host and 1 mysqld process on another host. Each sysbench process was limited to queries from one table and each test server has 24 cores (12 real X 2 with HT).
This is a graph of the results for all binaries except orig5610+ps. I tested 9 binaries. All used jemalloc.
8 16 32 64 128 256 concurrent clients
26650 52007 85561 100739 92439 100941 orig5610+bp1+fastcs
26276 51899 83320 87192 77942 82560 fb5163+ffl+fastcs
22329 41153 72793 81506 75005 80943 fb5163+ffl
22726 42127 53933 48098 45310 43364 fb5163
21986 40370 43958 38453 36765 34036 orig5163
The improvement on this workload from MySQL 5.6 is amazing. The benefit from the innodb_fast_free_list option in the FB patch is also nice but I don't know yet whether that change is needed in 5.6. I have a lot of work to do to publish changes from the FB patch to make the launchpad branch current.
The results below are QPS from sysbench where each query fetches 1 row by primary key. The database has 8 tables with 16M rows/table and the total database size is ~32GB. The innodb_buffer_pool_size is 4GB. The test server has flash storage devices. The test used 8 sysbench processes running on one host and 1 mysqld process on another host. Each sysbench process was limited to queries from one table and each test server has 24 cores (12 real X 2 with HT).
This is a graph of the results for all binaries except orig5610+ps. I tested 9 binaries. All used jemalloc.
- orig5610+bp8+fastcs - unmodified MySQL 5.6.10 with innodb_buffer_pool_instances=8, metadata_locks_hash_instances=256, innodb_checksum_algorithm=crc32 and the PS disabled
- orig5610+bp8 - same as orig5610+bp8+fastcs except innodb_checksum_algorithm=innodb
- orig5610+bp1+fastcs - same as orig5610+bp8+fastcs except innodb_buffer_pool_instances=1
- orig5610+bp1 - same as orig5610+bp8 except innodb_buffer_pool_instances=1
- orig5610+ps - same as orig5610+bp1 except the PS was enabled
- fb5163+ffl+fastcs - MySQL 5.1.63 + the Facebook patch, innodb_fast_free_list=1, innodb_fast_checksums=1
- fb5163+ffl - MySQL 5.1.63 + the Facebook patch with innodb_fast_free_list=1
- fb5163 - the same as above except innodb_fast_free_list=0
- orig5163 - unmodified MySQL 5.1.63
8 16 32 64 128 256 concurrent clients
26747 52738 86175 109836 108605 114046 orig5610+bp8+fastcs
23844 48049 81434 102006 103290 108152 orig5610+bp826650 52007 85561 100739 92439 100941 orig5610+bp1+fastcs
24446 47464 79788 96777 92565 100211 orig5610+bp1
23429 46270 74870 87650 84117 93382 orig5610+ps26276 51899 83320 87192 77942 82560 fb5163+ffl+fastcs
22329 41153 72793 81506 75005 80943 fb5163+ffl
22726 42127 53933 48098 45310 43364 fb5163
21986 40370 43958 38453 36765 34036 orig5163
Friday, February 1, 2013
Main memory DBMS
I just listened to a talk at the New England Database Summit and learned about a new DBMS that supports extremely high QPS for data in main memory sharded over several nodes with synchronous commit across multiple copies to support HA. Performance is much better when the custom API is used to batch transactions. It has begun to support data on disk.
Now that I think about it, was this a talk about MySQL Cluster?
Note that one does not simply put data on disk. When disk is flash SSD then compression is a really big deal. When disk is much slower than optimizations to reduce IOPS are very important (InnoDB insert buffer, TokuDB fractal tree to reduce random writes).
Now that I think about it, was this a talk about MySQL Cluster?
Note that one does not simply put data on disk. When disk is flash SSD then compression is a really big deal. When disk is much slower than optimizations to reduce IOPS are very important (InnoDB insert buffer, TokuDB fractal tree to reduce random writes).
Subscribe to:
Posts (Atom)

