Thursday, October 2, 2008

Innovation: MySQL Cluster, Oracle Exadata, H-Store

MySQL Cluster continues to be ahead of the game, but we don't always hear about it as Cluster developers are either too busy or too modest.

Oracle announced Exadata last week. It provides extremely high-performance disk access and runs software on the storage servers to perform selection and projection there. Oracle has discovered the value of condition pushdown. It is only a matter of time before they figure out the value of batch key access and query fragment evaluation. I wonder if you can buy the Exadata hardware without Oracle software? And I wonder when MySQL will use Cluster as the basis for a data warehouse server.
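To make the two ideas concrete, here is a toy sketch of condition pushdown and batch key access. The `StorageNode` class, its methods, and the sample data are all hypothetical; this is not the NDB or Exadata API, just an in-memory model that counts rows shipped and round trips.

```python
from dataclasses import dataclass


@dataclass
class StorageNode:
    """Hypothetical storage node: rows keyed by primary key, plus counters."""
    rows: dict             # pk -> row (dict of column values)
    round_trips: int = 0   # requests served by this node
    rows_shipped: int = 0  # rows sent back over the "wire"

    def scan(self, predicate=None, columns=None):
        """Full scan. Selection/projection run here when they are pushed down."""
        self.round_trips += 1
        out = []
        for row in self.rows.values():
            if predicate is None or predicate(row):
                out.append(row if columns is None else {c: row[c] for c in columns})
        self.rows_shipped += len(out)
        return out

    def get_many(self, keys):
        """Look up a whole set of primary keys in a single request."""
        self.round_trips += 1
        out = [self.rows[k] for k in keys if k in self.rows]
        self.rows_shipped += len(out)
        return out


def make_node():
    # 1000 made-up rows; every tenth row is in region "EU".
    return StorageNode({
        i: {"id": i, "region": "EU" if i % 10 == 0 else "US", "amount": i * 2}
        for i in range(1000)
    })


def is_eu(row):
    return row["region"] == "EU"


# Without pushdown: ship every row, then filter and project in the SQL node.
node = make_node()
no_push = [{"id": r["id"]} for r in node.scan() if is_eu(r)]
no_push_shipped = node.rows_shipped

# With pushdown: the storage node filters and projects before shipping.
node = make_node()
pushed = node.scan(predicate=is_eu, columns=["id"])
pushed_shipped = node.rows_shipped

# Key lookups: one request per key versus one batched request.
keys = [3, 14, 15, 92, 653]
node = make_node()
one_by_one = [node.get_many([k])[0] for k in keys]
single_trips = node.round_trips

node = make_node()
batched = node.get_many(keys)
batched_trips = node.round_trips

print(no_push_shipped, pushed_shipped, single_trips, batched_trips)
```

Both variants return the same result. The difference is how much data crosses the network and how many requests it takes, which is where the cost lies when the data nodes are separate machines.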

Mikael Ronstrom described work in progress to support multi-threaded Cluster servers. This has many things in common with H-Store, which is a radically different design for a DBMS to scale on multi-core architectures. Well, it is different when you don't consider Cluster.

9 comments:

  1. Condition pushdown was the first thing I thought of as well.

    It works well in the Exadata situation if your data is larger than memory, but otherwise you want to keep it in-core.

    Kevin

  2. It'll be *very* interesting when we get varsized disk data and indexes on disk.

    Although BKA and multithreading alone will possibly open a huge market and make a bunch of unmodified apps (or various web apps) a lot more deployable on NDB.

    (not to mention online add node, table reorg, all the online DDL we have and ndb$info).

  3. @kevin: it was a while ago when I tested it, but I found that condition pushdown can actually slow things down. I did not do enough research to hand you a pattern, but I got confirmation from the Cluster devs that the condition evaluation inside ndbd is not nearly as optimized as what happens inside mysqld. Of course, if you can cut a significant chunk of data and avoid sending it over the wire, then it's great, but there certainly is a trade-off regarding the complexity of the condition you want to check.

    Another thing that occurred to me is that right now, engine condition pushdown is an all-or-nothing thing. When I think about it, I would like to control it at a finer granularity than the session: a particular query, a particular WHERE clause, or maybe even some of its individual expressions (although this last one would likely be hard to implement on the mysqld side of things).

    Thoughts?

  4. I hope that ndbd evaluates filter predicates quickly enough that we don't have to worry about hints, but it would be easy to add them.

  5. "I hope that ndbd evaluates filter predicates quickly enough that we don't have to worry about hints"

    I guess you should test every case. My point was mainly: don't blindly enable pushdown, or you may be unpleasantly surprised.

    For details on what I checked:

    http://docs.sun.com/source/820-5417/ccsg-performance.html#ccsg-performance-data-access-pushdown-using

    (I checked other things too, such as IN (pk1, pk2, ...., pk100), and these also showed less favourable results than I'd expected.)

  6. We had problems with pushdown and extremely large IN lists (thousands of values). But it is up to the storage engine to fix that and to decide whether to push or not.

  7. I think this needs a comment:

    "It is only a matter of time before they figure out the value of batch key access and query fragment evaluation."

    Hmm. Don't you think Oracle knows a lot about database technology? Or did you mean the *real* value, i.e. how to make customers pay a lot of money for this technology?

    And are you implying that the MySQL Cluster people have figured that out?

    Interesting...

  8. MySQL Cluster technology is world-class (best-in-breed, ...). I don't know who pays what for it.

    I have great respect for the developers at Oracle -- they too are world-class. Despite being hampered at times by legacy software and design decisions, they continue to do very clever things. Exadata and RAC compensate for the difficulty of scaling out a shared-disk server. The performance of Exadata is amazing. We will have to wait and see what its price/performance is.

    From what I have read, Oracle+Exadata can push predicates or bloom filters to the storage nodes. Can it also push a set of index key values in one RPC? With that it would be equivalent to batch key access as done in MySQL. Beyond that MySQL has work in progress to push query fragments to the data nodes -- http://forge.mysql.com/worklog/task.php?id=4292

  9. I find it pretty amazing how the same basic software technology is reinvented over and over again, repackaged on new hardware and under different names. Take virtualization, for example. Or perhaps VSAM in this case. I guess a lot of that just depends on what we call disk/storage. I don't see anything wrong with that, but why focus on "technology" so much? We are not talking about new particle colliders or propulsion engines here.

    Can Oracle push a bunch of index keys to an Exadata cell in one RPC? Strictly speaking, RPC is not used in Exadata, but other than that, I'd say yes, it can. This does not imply that the implementation, performance properties, restrictions, etc. are similar to MySQL's. In fact, I don't know much about MySQL in general, let alone how it does index key pushing.

    Price/performance... Is a low price/performance metric good or bad? If the price is zero, then performance is irrelevant. It can't be less than zero, unless the customer is getting paid for using the product.

    When we pay more for oil & gas every year, what do we sell back? Marlboro? Movies? Coca-Cola? If we try so hard to lower prices for technology, we have to compensate elsewhere; otherwise you know what will happen.

Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.