MySQL uses SQL for data and name-value pairs for configuration files. Cassandra uses XML for configuration files and something closer to name-value pairs for data (or name-value-value-... pairs). Why does it use a stronger data model for configuration than for data?
While I am writing this in jest, I think it is an interesting question.
Saturday, March 20, 2010


this doesn't address ur question -
but i am continuously reminded of the talk by Phil Bernstein at Stanford (he was talking about their new transactional storage engine for flash). His introduction went something like .. - 'as u all know every database has a key-value store at its core .. recently there's been a lot of interest in this area ..'
Cassandra was written using Java, so it is natural to use XML.
I think config and data model are two separate issues. Data model is determined by your system design goal, such as choices of ACID, CAP, performance...often with compromises.
Because they have different purposes? One for high-speed/scalable/high-available data storage, and the other is just a file which is read at start up etc. So, they can have different forms.
@Mikiya Okuno - now we are going off topic.
I don't get "NoSQL == high-speed/scalable/high-available". Some NoSQL systems have some of these attributes. Perhaps a few have all of them.
You don't get those features simply by not implementing SQL. You definitely don't get one of those features (high-speed) by writing a server in Erlang.
While I am fully aware that 'single-node performance != scalable', I fear that lousy single-node performance forces me to use 5X or 10X too many server nodes, and that too is not scalable.
You don't get high availability when your server isn't crash safe.
I don't think a system that requires explicit sharding is scalable. Perhaps 'can be scaled' is a better description. It can be made to work with MySQL but there is some cost. Similar costs will be paid in NoSQL.
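To make the cost of explicit sharding a bit more concrete, here is a minimal sketch of application-level sharding. It is not taken from MySQL or from any NoSQL system. It assumes the simplest scheme, a stable hash of the key modulo the number of shards, and the names `shard_for` and `moved_fraction` are hypothetical. It estimates how many keys would have to be relocated when one server is added.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Illustrative keys only; a real application would use its own row keys.
keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys move when going from 4 to 5 shards")
```

With plain modulo placement most keys change shards when a node is added, which is one form of the cost mentioned above. Schemes such as consistent hashing or a lookup directory reduce that movement at the price of more machinery in the application or a proxy layer.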
XML config files are the worst. XML is a terrible format for humans to read but good for machines.
Hi Mark,
> I don't think a system that requires explicit sharding is scalable. Perhaps 'can be scaled' is a better description. It can be made to work with MySQL but there is some cost.
I would suggest using the SPIDER storage engine for this purpose. It shards rows amongst remote MySQL servers, and does not require rewriting applications. According to Kentoku, an author of SPIDER, he will give a presentation at UC which describes how to reshard when a new server is added.
Did he wear the spiderman costume at last year's conference?
Yes, he did! The most significant obstacle for him is a language barrier, as Giuseppe said at: http://datacharmer.blogspot.com/2009/04/test-driving-spider-storage-engine.html