Big downtime gets a lot of attention in the MySQL world. There will be some downtime when you replace a failed master. With GTID in MariaDB and MySQL, that time will soon be much smaller. There might be lost transactions if you use asynchronous replication. You can also lose transactions with synchronous replication, depending on how you define "lose". I don't think this gets sufficient appreciation in the database community. If the higher commit latency from sync replication prevents your data service from keeping up with demand, then update requests will time out and the requested changes will not be made. This is one form of small downtime. Whether or not you consider this to be a lost transaction, it is definitely an example of lousy quality of service.
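The commit latency argument can be made concrete with some back-of-the-envelope arithmetic. This is only a rough sketch: the function names and every number below are made up for illustration, and it assumes each connection issues one commit at a time and waits for it to return, so commit throughput is bounded by connections divided by commit latency.

```python
def max_commit_rate(connections: int, commit_latency_ms: float) -> float:
    """Upper bound on commits per second when every connection blocks
    until its commit returns (assumed model, not a measurement)."""
    return connections * 1000 / commit_latency_ms


def backlog_after(seconds: float, demand_per_s: float, capacity_per_s: float) -> float:
    """Requests still waiting after `seconds` of sustained demand."""
    return max(0.0, (demand_per_s - capacity_per_s) * seconds)


# Hypothetical workload: 100 connections, 15,000 update requests per second.
demand = 15_000
async_capacity = max_commit_rate(connections=100, commit_latency_ms=1)
sync_capacity = max_commit_rate(connections=100, commit_latency_ms=10)

for name, capacity in (("async", async_capacity), ("sync", sync_capacity)):
    waiting = backlog_after(10, demand, capacity)
    print(f"{name}: capacity={capacity:.0f}/s, waiting after 10s={waiting:.0f}")
```

Under these assumed numbers the async setup keeps up, while the sync setup falls behind by 5,000 requests every second. Any client with a finite timeout will eventually give up on requests stuck in that backlog, and those changes are never applied.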
My future project, MarkDB, might have a mode where it never loses a transaction. This is really easy to implement. Just return an error on calls to COMMIT.