- replay all binlog events from a transaction
- commit the transaction to the storage engines that participated
- write new state to the relay-log.info file
This is bug 26540, if you want to express interest in a fix. Transactional replication, which has made it into Percona, also fixes this in some cases for InnoDB. By "some cases" I mean that it does not protect transactions that update MyISAM tables, at least not in the Google patch. I have not reviewed the Percona code.
This problem gets more interesting for master-master replication. In that case a server writes both a binlog and a relay log, and the update sequence is:
1. replay all binlog events from a transaction
2. XA prepare for the binlog
3. XA prepare for InnoDB (assuming InnoDB is used)
4. write the XID to the binlog (commit)
5. commit the transaction to the storage engines that participated
6. write new state to the relay-log.info file
The outcome of a crash depends on where in this sequence it occurs:
- before step 4 - This is not a problem. The prepared InnoDB transaction is rolled back during crash recovery and then run again when the slave SQL thread starts.
- between step 4 and step 5 - This is a problem. The prepared InnoDB transaction is committed during crash recovery, but relay-log.info is not updated. Transactional replication does not correct this mismatch, so the last transaction will be run again when the slave SQL thread starts. Running the same transaction multiple times may cause replication to halt or may corrupt your database.
- between step 5 and step 6 - This problem is fixed by transactional replication.
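To make the crash windows concrete, here is a minimal simulation of the six steps and of crash recovery. It is a toy model, not MySQL code: the `Server` class, `apply_txn`, `recover` and `run` are names invented for this sketch, and it models a server without transactional replication. It counts how many times one transaction ends up applied after a crash at each point.

```python
from dataclasses import dataclass, field


class Crash(Exception):
    """Simulated server crash."""


@dataclass
class Server:
    binlog_xids: set = field(default_factory=set)  # XIDs written to the binlog
    prepared: dict = field(default_factory=dict)   # XID -> row, prepared in InnoDB
    rows: list = field(default_factory=list)       # rows committed in InnoDB
    relay_pos: int = 0                             # position kept in relay-log.info


def apply_txn(server, xid, row, crash_after=None):
    """Run steps 1-6 for one transaction, crashing after step `crash_after`."""
    def maybe_crash(step):
        if crash_after == step:
            raise Crash(f"crash after step {step}")

    maybe_crash(1)                                 # 1. replay events (nothing durable yet)
    maybe_crash(2)                                 # 2. XA prepare for the binlog
    server.prepared[xid] = row                     # 3. XA prepare for InnoDB
    maybe_crash(3)
    server.binlog_xids.add(xid)                    # 4. write the XID to the binlog
    maybe_crash(4)
    server.rows.append(server.prepared.pop(xid))   # 5. commit in the storage engine
    maybe_crash(5)
    server.relay_pos += 1                          # 6. write relay-log.info


def recover(server):
    """Crash recovery: commit prepared transactions whose XID reached the binlog."""
    for xid, row in list(server.prepared.items()):
        if xid in server.binlog_xids:
            server.rows.append(row)                # XID found: commit
        del server.prepared[xid]                   # otherwise: roll back


def run(crash_after):
    """Return how many times the transaction is applied for a given crash point."""
    server = Server()
    relay_log = [("xid-1", "row-1")]
    try:
        apply_txn(server, *relay_log[0], crash_after=crash_after)
    except Crash:
        recover(server)
    # The slave SQL thread restarts from the position in relay-log.info.
    for xid, row in relay_log[server.relay_pos:]:
        apply_txn(server, xid, row)
    return server.rows.count("row-1")


for step in (3, 4, 5):
    print(f"crash after step {step}: applied {run(step)} time(s)")
```

In this model a crash after step 3 leaves the transaction applied once, while a crash after step 4 or step 5 leaves it applied twice, which matches the two problem windows described above.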

3 comments:
How can I be surprised and totally not surprised at the same time? It's inexcusable these days to have such a giant hole for corruption.
Determined not to have such things for Drizzle.
Hi Mark,
We solve this problem in Tungsten Replicator by updating an extra table as part of the same database transaction on the slave, which removes ambiguity about whether the slave updated. Maybe I'm missing something, but couldn't MySQL use the same approach internally? It seems as if a lot of the trouble comes from before and after updates across stores.
If so please don't implement it too quickly as I want to make some house payments first. ;)
Cheers, Robert
@Robert - there is a whole lot of value to be added on the path between anything we do in 5.0.37 and anything that a customer can depend on when deployed in production.
Tables would be good. You still need to be sure that any files that store binlog events are updated safely and can deal with partial updates.
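For readers following the thread, the table-based idea can be sketched as below. This is only an illustration using SQLite from Python's standard library, not Tungsten or MySQL code: the `accounts` and `repl_position` tables and the `apply_event` and `resume` helpers are hypothetical names. The point is that the data change and the replication position commit or roll back together, so the position can never disagree with the data.

```python
import sqlite3


def make_slave():
    # isolation_level=None puts the connection in autocommit mode,
    # so the explicit BEGIN/COMMIT statements below control transactions.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("CREATE TABLE repl_position (id INTEGER PRIMARY KEY, pos INTEGER)")
    conn.execute("INSERT INTO accounts VALUES (1, 100)")
    conn.execute("INSERT INTO repl_position VALUES (1, 0)")
    return conn


def apply_event(conn, pos, delta, crash_before_commit=False):
    """Apply one replicated change and record its position in the same transaction."""
    conn.execute("BEGIN")
    conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = 1", (delta,))
    conn.execute("UPDATE repl_position SET pos = ? WHERE id = 1", (pos,))
    if crash_before_commit:
        # Stands in for crash recovery discarding the uncommitted transaction.
        conn.execute("ROLLBACK")
        return
    conn.execute("COMMIT")


def resume(conn, events):
    """Restart replication from the position stored alongside the data."""
    (pos,) = conn.execute("SELECT pos FROM repl_position").fetchone()
    for i, delta in enumerate(events[pos:], start=pos + 1):
        apply_event(conn, i, delta)


events = [10, 20]
conn = make_slave()
apply_event(conn, 1, events[0])
apply_event(conn, 2, events[1], crash_before_commit=True)  # lost with the crash
resume(conn, events)                                       # reapplies event 2 exactly once
balance = conn.execute("SELECT balance FROM accounts").fetchone()[0]
position = conn.execute("SELECT pos FROM repl_position").fetchone()[0]
print(balance, position)
```

Here the second event is applied exactly once after the simulated crash, leaving a balance of 130 at position 2. As noted in the reply above, this does not cover the files that store the binlog events themselves, which still have to survive partial writes.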