Wednesday, March 7, 2012

Modern platforms: flash, disk, log-structured, update-in-place

Someone who knows a lot about storage asked me whether we can spend IOPS to improve compression.

When deploying a pure-flash server I want a database engine that optimizes for compression, because more compression means less flash must be purchased. The engine can spend extra IOPS in search of more compression. Column-wise storage is an example of a feature that can improve compression at the cost of extra disk reads.
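A minimal sketch of why column-wise storage can help compression, using synthetic data that I made up for illustration (random ids and a four-value country code). It lays out the same bytes row by row and column by column, then compresses each with zlib. The function names are hypothetical and the result depends on the data, so treat the numbers it prints as an experiment rather than a benchmark.

```python
import random
import zlib


def make_rows(n, seed=0):
    """Build n synthetic (id, country) rows; the data is an assumption for this sketch."""
    rng = random.Random(seed)
    countries = ["US", "DE", "JP", "BR"]
    return [(str(rng.randrange(10**9)), rng.choice(countries)) for _ in range(n)]


def row_layout(rows):
    """Serialize row by row: id, country, id, country, ..."""
    return "".join(f"{a}\n{b}\n" for a, b in rows).encode()


def column_layout(rows):
    """Serialize column by column: all ids first, then all countries."""
    col_a = "".join(f"{a}\n" for a, _ in rows)
    col_b = "".join(f"{b}\n" for _, b in rows)
    return (col_a + col_b).encode()


def compressed_size(data):
    """Size in bytes after zlib compression at level 6."""
    return len(zlib.compress(data, 6))


rows = make_rows(10_000)
row_bytes = row_layout(rows)
col_bytes = column_layout(rows)
print("row-wise:   ", len(row_bytes), "->", compressed_size(row_bytes))
print("column-wise:", len(col_bytes), "->", compressed_size(col_bytes))
```

Both layouts hold exactly the same bytes, so any difference in compressed size comes from ordering alone. The cost shows up on reads: fetching one full row from the column layout touches one location per column, which is where the extra IOPS go.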

When deploying a pure-disk server I want a database engine that has optimizations to reduce disk IO. The InnoDB insert buffer reduces disk reads done for secondary index maintenance. TokuDB and LSM trees eliminate random disk writes.

Is there one database engine that optimizes for compression and saves IOPS? Or is this an example where one size does not fit all? A log-structured engine can save IOPS by doing fewer random writes, but it will also use more disk space because old versions of data are not compacted immediately. An update-in-place engine might get less compression than possible because writes are frequent and compression must be fast. Using InnoDB as an example, it gets less than half of the compression rate that is feasible for data that I like. Support for prefix compression and larger pages (32kb or 64kb) would improve this, but InnoDB will always get less compression than possible because compressed pages are rounded up to 2kb, 4kb or 8kb, as set by key_block_size. I think that is a problem any time block compression is used for an update-in-place engine.
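The rounding cost can be sketched with a little arithmetic. This is a simplified model, not InnoDB code: it assumes a 16kb uncompressed page, assumes the on-disk size is the smallest of 2kb, 4kb or 8kb that holds the compressed bytes, and assumes a page that fits none of them is stored at full size. The names PAGE_BYTES, rounded_size and ratios are mine.

```python
PAGE_BYTES = 16 * 1024  # assumed uncompressed page size for this sketch


def rounded_size(compressed_bytes, block_sizes=(2048, 4096, 8192)):
    """Smallest allowed block that holds the compressed page (simplified model)."""
    for size in sorted(block_sizes):
        if compressed_bytes <= size:
            return size
    # Assumption: a page that fits no block size is stored uncompressed.
    return PAGE_BYTES


def ratios(compressed_bytes):
    """Return (possible ratio, achieved ratio after rounding up)."""
    on_disk = rounded_size(compressed_bytes)
    return PAGE_BYTES / compressed_bytes, PAGE_BYTES / on_disk


possible, achieved = ratios(5 * 1024)
print(f"possible {possible:.1f}x, achieved {achieved:.1f}x")
```

In this model a page that compresses to 5kb occupies an 8kb block, so 3kb of every block is padding and a possible 3.2x ratio becomes 2x.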

4 comments:

  1. Do you have a preferred column-oriented engine for MySQL?

    1. I am not aware of a column-store for MySQL that is suitable for OLTP.

    2. Mark: What about Infobright and InfiniDB; aren't they suitable for OLTP?


 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.