Tuesday, February 26, 2013

MySQL 5.6: online DDL for busy tables

Online DDL for InnoDB is another great feature new in MySQL 5.6. I don't think it is ready to replace online schema change frameworks (from Facebook, Percona Toolkit and Openark) for workloads that require minimal downtime for busy tables - but work can be done on InnoDB to change my opinion.

InnoDB supports online DDL by logging changes to the altered table when the alter is in progress. The maximum size of the log is determined by the my.cnf variable innodb_online_alter_log_max_size. The error DB_ONLINE_LOG_TOO_BIG is raised if the log gets full.

Online DDL for InnoDB blocks all concurrent inserts/updates/deletes while the log is being applied during online add index. I don't think this is documented and it should be. This can block concurrent changes for too long. I hope the code can be changed to apply most of the logged changes before getting exclusive access. In Facebook OSC most logged changes were applied in the replay step so that the cutover step, which gets exclusive access, would be fast.

I confirmed the current code blocks while applying the log by adding a call to sleep in the function row_log_apply just after the call to row_lock_x_lock(dict_index_get_lock(index)).

I created a feature request for this.

Note - I was wrong about the duration of locking. Looks like this is very online and likely to replace many uses of the OSC frameworks. See the comments below.

9 comments:

  1. Hi, Mark

    I've read the source code of Online DDL in MySQL 5.6. I think what you claim is not true--"Online DDL for InnoDB blocks all concurrent inserts/updates/deletes while the log is being applied during online add index."

    In InnoDB, it only lock the index when applying the last row log block, not whole of the row logs. So it blocks only a little time.

    ReplyDelete
  2. It is very possible that I am wrong. This behavior is not documented -- it should be -- so I am reading a lot of code that is new to me. The Facebook OSC docs make it very clear when & where locks are taken. The MySQL docs don't do that yet.

    It would help if someone explained where the log was applied prior to taking the X lock.

    ReplyDelete
  3. What I see is that:
    1) row_log_apply gets the X lock and then calls row_log_apply_ops
    2) row_log_apply_ops holds the X lock per block of logged changes, so it will release the X lock, read a block of logged changed, get the X lock, apply the changes, repeat

    But it would be nice to provide a few more details about how the fancy new things in MySQL behave.

    ReplyDelete
  4. In Online Add/Drop index, InnoDB acquires the index lock for two reasons:

    First, when finish applying a row log block and want to read the next row log block, it will acquire the index lock to check whether the next row log block is also the last one, if not, it will release the index lock and apply the change in this block(concurrent dml is allowed,new row log is appended to the last row log block which we have not accessed yet);

    Second, when the row log block is the last block, it will apply the change in this last row log block with the index lock(X mode, concurrent dml is blocked);

    So, concurrent dml is only blocked when InnoDB apply the last row log block, i think the block time is very trivial.

    ReplyDelete
  5. Mark, thank you for testing and reviewing my code. Dengcheng, thank you for saying what I was planning to say. :-)

    Maybe it is an overkill to hold the index->lock for such a long time, especially for the whole duration of the last block. My initial idea was to hold the index->lock for very little time for the first blocks, but maybe there was some bug that I fixed by holding the lock for the whole block application time.

    In any case, we could investigate if we could release and acquire the index->lock during the application of the last block. The reason why I wanted to hold the lock continuously for the last block was to avoid a starvation-style problem where the apply thread (which may have to do random I/O) would not keep up with log accumulation (sequential writes). Can you file a bug for this, Mark?

    There are two more difficult-to-fix issues that we were made aware of by Shlomi

    In the commit_inplace phase of online ALTER TABLE, we have a problem with the meta-data lock (MDL) upgrade. During the upgrade, others (even auto-commit read-only transactions) cannot access the table:

    con1>ALTER TABLE t …;
    con2>BEGIN; INSERT INTO t VALUES(1);-- do not commit
    con1>-- about to enter commit_inplace; wants to upgrade MDL, but cannot because con2 is holding conflicting MDL
    con3,4,5,…,N>SELECT * FROM t;-- blocks because con1 wants to upgrade MDL
    con1>--MDL upgrade timeout; the ALTER TABLE will fail and be rolled back

    Another problem: If you enable persistent index cardinality statistics, the statistics will be calculated in the commit_inplace phase, while holding the exclusive meta-data lock. Theoretically, we could calculate the statistics ‘for free’ while loading the sorted index entries.

    ReplyDelete
  6. Marko - I updated the original text to make it clear that I was wrong about the duration of locking. I updated the feature request to indicate this. My only request at this point is an Oracle blog post explaining the implementation in more (but not too much) detail.

    What is the size of a block? Is it determined by innodb_sort_buffer_size which is 1M by default?

    ReplyDelete
  7. Marko - I will wait until we test online DDL with real workloads before offering more feedback on this.

    ReplyDelete
  8. Yes, the online modification log block size is the same as innodb_sort_buffer_size (srv_sort_buf_size). We could introduce a separate configuration parameter for it.

    Currently I am between projects (having worked mostly on online ALTER TABLE for the past 2 years), and I wanted to use this spare time to warm up old patches that have been collecting bit-rot. :) I will try to blog before the next big project swallows me in the next couple of months.

    ReplyDelete
  9. I agree that a blog explaining the online DDL process in more detail would be very helpful. I'd also like to better understand the difference between the 5.6 implementation and Percona's OSC tool.

    ReplyDelete

 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.