Tuesday, April 14, 2009

Talks at the (free) MySQL Camp

David Lutz will talk about Predicting Performance with Queueing Models at the (free) MySQL Camp on Thursday at 2PM. I will be there. Good performance testing includes tests and an explanation of the results. The test results are frequently much better when the explanation includes a performance model. I usually skip the model (math is hard). My published results would be more useful were I to include the models. I have a few books by Neil Gunther gathering dust at home that I really need to read.

The v3 Google patch now includes support for row-change logging. The code has a few bugs that we are fixing ASAP, but it is ready for testing. There is a talk on this at the regular conference. We can try it out at the MySQL Hackfest at the (free) MySQL Camp on Monday morning. Row-change logging generates one text line per changed row that describes the change. It is easy to parse and can be used to:
  • Maintain a copy of the MySQL table in another RDBMS or in a scalable data structure such as HBase or Hypertable.
  • Implement a change notification service.
  • Maintain materialized views.
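
To illustrate why a one-line-per-row log is easy to consume, here is a minimal Python sketch of the first use case (maintaining a copy of a table). The line format is hypothetical, since this post does not show the real output of the patch: I assume tab-separated fields holding the table name, the operation, and then one `name:type=value` field per column. The names `RowChange`, `parse_row_change` and `apply_change`, and the primary key column `id`, are invented for illustration, and escaping of tabs inside values is ignored.

```python
from dataclasses import dataclass


@dataclass
class RowChange:
    table: str     # e.g. "test.t1"
    op: str        # "insert", "update" or "delete"
    columns: dict  # column name -> (type name, value as text)


def parse_row_change(line: str) -> RowChange:
    """Parse one line of the HYPOTHETICAL row-change format (not the real one)."""
    table, op, *fields = line.rstrip("\n").split("\t")
    columns = {}
    for field in fields:
        # Each field is assumed to look like "name:type=value".
        name_and_type, value = field.split("=", 1)
        name, type_name = name_and_type.split(":", 1)
        columns[name] = (type_name, value)
    return RowChange(table, op, columns)


def apply_change(copy: dict, change: RowChange, key: str = "id") -> None:
    """Apply one change to an in-memory copy of the table, keyed by primary key."""
    pk = change.columns[key][1]
    if change.op == "delete":
        copy.pop(pk, None)
    else:
        # Inserts and updates both store the full row image.
        copy[pk] = {name: value for name, (_, value) in change.columns.items()}
```

Feeding each log line through `parse_row_change` and then `apply_change` keeps the dictionary in step with the source table. A real consumer would write to the other data store instead of a dictionary and would use the column types to convert the values.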

6 comments:

  1. How is row-change different from RBR binlogs?

  2. They are designed so that parsing them is easy and accurate. For example, they include column names and types. I can provide more details after the talk.

    I wish we didn't have to re-implement RBR, but we use 5.0 and RBR needs to change a few things to make parsing easy and accurate.

  3. Ok, look forward to it. Thanks.

    Although including the column definitions with every row-change record seems like too much overhead. Wouldn't it make sense to export out the type info one time?

    Also, now that I think about it, maybe we need a plugin mechanism to Mysql RBR bin-logging. That will make it easy to splice in something like row-change or customize bin-logging.

    AD

  4. Enabling both the binlog (log-slave-updates) and row-change logging on a slave has no measurable impact on CPU and IO load on servers that I watch. So I don't think this will be a problem even for servers that are very busy. It should only be used on a slave (to be explained at the talk).

    In the long run, there should be an option with RBR in official MySQL to get output that is easy to parse.

  5. I will also try to be there. I actually had this as my main topic for a while in my Ph.D. thesis. However, I haven't done much with it in the last 10 years, so a refresher will be nice.

  6. This may be a bit wordy for a blog but I think you are touching on very important things.

    Quoting Raj Jain in The Art of Computer Systems Performance Analysis:

    "Until proven guilty, every person should be presumed innocent. The performance counterpart of this statement is *until validated, all evaluation results are suspect.* This leads us to the following three rules of validation:" (quoting only one)

    "Do not trust results of a measurement [a.k.a. performance testing] until they have been validated by simulation or analytical modeling."

    In other words, two techniques may be used simultaneously to verify & validate each other.

    "Queueing models" is another name for analytical modeling, the subject of David Lutz's lightning-fast talk (imho) next week.

    Neil Gunther says similar things in his books, articles and blog posts. Namely, if the data (test measurements) do not follow the small handful of queuing laws, something went wrong with the test or the data collection - and needs investigation.

    The two foremost queueing laws are Little's Law and the Utilization Law.

    Since blog comments should be short, and so as not to steal the speaker's thunder, I will only say that you'll see these laws in action and, I believe, carry away simple but deep, useful techniques.

    Very sorry I won't be there next week.


 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.