Do you know SQL or do you
NoSQL? MySQL has been very popular for internet-scale deployments. But times have changed and there are alternatives. The alternatives either out-scale or out-avail MySQL and this is more important than providing the features of an RDBMS for many applications. My prediction is that there will be much less usage of MySQL for internet-scale applications in the future if we do not make big changes.
What are the problems and what can we do to fix them? From my perspective there are two problems:
- MySQL is not efficient on modern hardware (multicore, many disk IOPS)
- Replication is very expensive to manage
We are in the process of fixing the first problem for InnoDB and Percona has binaries you can use in production today that make things much better. However many problems remain that limit throughput on servers with 8+ cores and there is little visible work in progress to fix them (MyISAM, query cache, LOCK_table, ...). This is a serious issue as 8 cores is or will soon be the new common box in the datacenter and price/performance comparisons will get much worse for MySQL.
Replication requires much more work. I want more automation and more flexibility.
The lack of automation is apparent when you consider the replication-related errors that require manual intervention. These errors are frequent or constant when you run a large number of MySQL servers. It is very expensive to support MySQL in this environment. Actions that must be automated include:
- the promotion of a slave to a master after the failure of the master
- failover of slaves to the new master
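To make the two actions above concrete, here is a minimal sketch in Python. It is not how any MySQL tool does it. The `Replica` class, the `applied_position` field and the function names are all hypothetical, and it assumes that positions are comparable across servers, which is roughly what something like global group IDs would have to provide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical view of one slave; not a real MySQL API."""
    name: str
    applied_position: int          # assumed comparable across servers
    healthy: bool = True
    master: Optional[str] = None   # name of the server this slave replicates from


def pick_new_master(slaves: List[Replica]) -> Replica:
    """Choose the healthy slave that has applied the most changes."""
    candidates = [s for s in slaves if s.healthy]
    if not candidates:
        raise RuntimeError("no healthy slave to promote")
    return max(candidates, key=lambda s: s.applied_position)


def fail_over(slaves: List[Replica]) -> Replica:
    """Promote one slave, then repoint the other healthy slaves at it."""
    new_master = pick_new_master(slaves)
    new_master.master = None       # the promoted server no longer replicates
    for s in slaves:
        if s is not new_master and s.healthy:
            s.master = new_master.name
    return new_master
```

A real implementation also has to fence the old master, wait for relay logs to drain and cope with slaves that are ahead of the chosen one. That is where most of the cost is.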
I also want the flexibility to extend replication. I have participated in the development of many replication enhancements (semi-sync, mirror binlog, global group IDs) and that effort has been incredibly difficult. I am still amazed at what Wei and Justin were able to accomplish. I doubt that anyone would ever volunteer for such a project (I was paid). The code is not fun to modify.
I have more ideas to improve replication but it isn't clear to me that I can afford the cost to modify the replication code in official MySQL. But then I looked at the code for
Drizzle. Wow! The code is clean, easy to read and easy to modify. So I still have hope for MySQL-related technology in the datacenter, but in the form of Drizzle.
I agree with your points, but I would add that it is possible that, with the replication plug-in work occurring in MySQL server, you might (and I stress might) get what you want from replication in MySQL server down the road. Granted, there is much work outside the replication plug-ins in the core code that would be required.
That being said, what the Drizzle people are doing is amazing and it will do one of two things: force MySQL's hand to accelerate the discussed changes and end up with two relevant database systems or leave MySQL in a cloud of dust and a footnote in history after another 10 years. I hope for the former as I think competition is good for both parties and I would be sad to see MySQL go away. It has been a part of my life in one form or fashion for ten years and I don't want to see it relegated to obscurity.
@Keith -- I have ignored the work by official MySQL because it is a long way from being GA. When that work is done it will be possible to do interesting things. But what is the release schedule for these changes? It was scheduled for 6.0 and then possibly for 5.4 (maybe).
I have also ignored Continuent. But I am sure we will get a comment about that.
MySQL will be around for a very long time whether or not it remains popular for internet-scale apps. It still has a big opportunity in the enterprise and SMB markets. But I don't want to limit my MySQL skills to that.
Well, part of the change with moving to 5.4 is the change of development cycle. Although they were not announced together, I think they (MySQL 5.4 and the new development model) are intimately tied.
Only time will tell if this is a really good change (to a new development model). I doubt it could be worse than what was in place previously. Ideally this will allow the replication work to be integrated faster without waiting for feature X, Y or Z to mature. That is if the new replication features are ready for prime time. Ah speculation!
I hope that works out. The replication team has more people now who have done interesting work on replication outside of MySQL. It will be nice to see them repeat some of that within MySQL.
Is Drizzle going to support slave to master promotion?
Maybe you could use ZooKeeper to keep configs for picking the new master.
I think Drizzle can support anything as long as someone implements it. So the issue is finding someone to do the work.
After speaking to Monty Program AB, I think that some of the goodness and flexibility of Drizzle is possible in MariaDB if we change it to provide an interface at the point at which binlog events are written on the master. This is what Drizzle does with the Replicator and Applier interfaces.
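A rough sketch of what such an interface could look like, written in Python for brevity. The class names, method names and event shape here are hypothetical and are not the actual Drizzle or MariaDB API. The idea is only that a replicator is called where the master would write a binlog event and then hands the event to one or more appliers.

```python
from typing import Dict, Iterable, List

# Hypothetical event shape: {"schema": ..., "statement": ...}
Event = Dict[str, str]


class Applier:
    """Consumes events handed over by a replicator (hypothetical interface)."""

    def apply(self, event: Event) -> None:
        raise NotImplementedError


class Replicator:
    """Called at the point where the master would write a binlog event."""

    def __init__(self) -> None:
        self.appliers: List[Applier] = []

    def add_applier(self, applier: Applier) -> None:
        self.appliers.append(applier)

    def accept(self, event: Event) -> bool:
        """Subclasses override this to filter events."""
        return True

    def replicate(self, event: Event) -> None:
        if not self.accept(event):
            return
        for applier in self.appliers:
            applier.apply(event)


class SchemaFilterReplicator(Replicator):
    """Example extension: do not replicate events for some schemas."""

    def __init__(self, skipped_schemas: Iterable[str]) -> None:
        super().__init__()
        self.skipped = set(skipped_schemas)

    def accept(self, event: Event) -> bool:
        return event["schema"] not in self.skipped


class InMemoryLogApplier(Applier):
    """Example applier: append events to a list instead of a binlog file."""

    def __init__(self) -> None:
        self.log: List[Event] = []

    def apply(self, event: Event) -> None:
        self.log.append(event)
```

With a hook like this, a new filter or a new destination for events is a small class rather than a change to the core replication code.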
Hi Mark,
Speaking as the main designer of Continuent Tungsten, I'm definitely with you on the importance of replication. We've been working on addressing the limitations of MySQL replication along two general lines of attack:
1.) Replace the replication mechanism with a more robust external mechanism. Of course we build out over the existing binlog, so there's a limit to what we can fix. Semi-sync replication is a good example of something that really needs a database fix. However, there are a number of features like backup integration and table integrity checking that can be done well or even far better using an external daemon such as we have developed.
2.) Add overall management. We have been working very hard on something we call the Tungsten Manager that I keep threatening to publish. It adds cluster management, for example with automated failover operations and broadcast commands. That addresses the large-scale management costs.
What we can't do or rather what we have not done yet is wade in and fix the binlog mechanism itself. I have some opinions about how this should be done that I'll post on my blog, but it's not the work of a day or even a year to get that problem solved.
That said, I don't agree that MySQL is doomed to irrelevance. There are still no relational alternatives to speak of that are as cost effective for scale-out as MySQL and the NoSQL alternatives are still pretty immature, as is Drizzle. There's time to fix this but it needs attention. Thanks for keeping everyone focused on this problem.
Cheers, Robert