Would 10X faster be enough to get you to switch? Do you need something that is faster for insert-only workloads or must it also be faster for updates and deletes?
Papers have been written that describe both the theory and practice of building systems that support high rates of inserts, updates and deletes. Some of this technology may eventually appear in a MySQL storage engine:
- Log-Structured Merge-Tree provides a framework for evaluating performance.
- Bigtable describes a similar approach for avoiding random IO during updates.
- ROSE describes how to combine these techniques with compression for modern CPUs (check out the authors).
- Graefe describes improvements that can be made to B-tree indexes.
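
To make the log-structured merge idea concrete, here is a minimal in-memory sketch. It is not taken from any of the papers above or from any storage engine: the class name `TinyLSM`, the `memtable_limit` parameter, and the use of Python lists as stand-ins for on-disk sorted files are all assumptions made for illustration. It only shows the general shape: writes land in a memory buffer, the buffer is written out as one sorted run, and a delete is just another write.

```python
import bisect

TOMBSTONE = object()  # hypothetical marker for a deleted key


class TinyLSM:
    """Toy log-structured merge store (illustrative only, nothing is persisted)."""

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}  # in-memory buffer that absorbs every write
        self.runs = []      # (keys, values) sorted runs, newest first

    def put(self, key, value):
        # Inserts and updates take the same path: no read of old data is needed.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is a write of a tombstone, so it costs the same as an insert.
        self.put(key, TOMBSTONE)

    def flush(self):
        # Emit the buffer as one sorted run. On disk this would be a single
        # sequential write instead of many random page updates.
        if not self.memtable:
            return
        keys = sorted(self.memtable)
        values = [self.memtable[k] for k in keys]
        self.runs.insert(0, (keys, values))
        self.memtable = {}

    def get(self, key, default=None):
        # Check the buffer first, then the runs from newest to oldest.
        if key in self.memtable:
            value = self.memtable[key]
            return default if value is TOMBSTONE else value
        for keys, values in self.runs:
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return default if values[i] is TOMBSTONE else values[i]
        return default

    def compact(self):
        # Merge all runs into one; newer entries win and tombstones are dropped.
        merged = {}
        for keys, values in reversed(self.runs):  # oldest first
            merged.update(zip(keys, values))
        keys = sorted(k for k, v in merged.items() if v is not TOMBSTONE)
        self.runs = [(keys, [merged[k] for k in keys])] if keys else []
```

A short usage example under the same assumptions:

```python
db = TinyLSM(memtable_limit=2)
db.put("b", 1)
db.put("a", 2)   # buffer is full, so it is flushed as a sorted run
db.put("a", 3)   # update: just another buffered write
db.delete("b")   # delete: a tombstone write, then a second flush
db.compact()     # merge the two runs and drop the tombstone
```

The trade-off this sketch hides is on the read side: a lookup may have to check several runs until compaction merges them, which is the cost the papers above analyze.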


FWIW, the Hypertable alpha (Hypertable is an open source implementation similar to Bigtable) sustained well over 1M inserts/s while loading 1TB of data across 9 nodes. The rows were randomly ordered by primary key and replicated 3-way, so about 3TB of data was written to disk. This ran on much cheaper commodity hardware with JBOD (Just a Bunch Of Disks, not RAIDed): 4 7.2K RPM SATA disks per node on onboard controllers. The performance is expected to double in beta.
Updates and deletes have the same performance as inserts in Hypertable.
More info on that at http://hypertable.org/documentation.html
Is there a mature technology that supports HA (other than MySQL Cluster)?
The Hypertable developers are planning to implement the minimum code needed to survive the loss of one node.
What about PBXT? One of the expected benefits of its architecture seems to be better performance for write operations ... Unfortunately, there are not many PBXT benchmarks for insert/update/delete operations ...
PBXT might be better. My vague memory is that many but not all of the persistent structures to be updated are copy-on-write.