Tuesday, March 17, 2009

Linear scale-up, in case you missed it

A system displays linear scale-up when it has twice the throughput with twice the resources. I spend a lot of time testing MySQL on SMP servers to validate the performance features we add to it. I rarely get linear scale-up from it. Performance usually reaches a maximum or declines when the server is saturated. Yet here is a result that displays linear scale-up up to a point. I never get results like this. How was it done?
  1. Determine the maximum throughput (MaxTP) for the server with N clients
  2. ReducedMaxTP = MaxTP * 0.75
  3. PerClientTP = ReducedMaxTP / N
  4. Run tests for 1 to N concurrent clients where each client generates PerClientTP transactions per second
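At this point, the graph of throughput versus concurrent clients should display linear scale-up. That is, for X concurrent clients the system should perform X * PerClientTP transactions per second. Because the total offered load never exceeds 75% of MaxTP, the server is never pushed to saturation.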
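
The steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the harness used for the result I linked to. The function name and the 16,000 TPS figure are made up for the example.

```python
def rate_limited_plan(max_tp, n_clients, fraction=0.75):
    """Return (clients, expected_tps) pairs for a rate-limited test.

    max_tp    -- measured maximum throughput with n_clients (step 1)
    n_clients -- the client count N used to measure max_tp
    fraction  -- the 0.75 factor from step 2
    """
    reduced_max_tp = max_tp * fraction          # step 2
    per_client_tp = reduced_max_tp / n_clients  # step 3
    # Step 4: each of the X clients is throttled to per_client_tp, so the
    # expected total is X * per_client_tp as long as the server keeps up.
    return [(x, x * per_client_tp) for x in range(1, n_clients + 1)]


if __name__ == "__main__":
    # Hypothetical numbers: 16,000 TPS measured with 16 clients.
    for clients, tps in rate_limited_plan(16000.0, 16):
        print(clients, tps)
```

The expected throughput is a straight line by construction. The test only confirms that the server can sustain a load it was already shown to exceed.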

But what does it mean? I think this is the result of a test with a lot of think time. I also want to know how the system performs with less think time.
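
To make the think time claim concrete, here is a rough sketch. It assumes each client is a closed loop that issues one transaction, waits for the response and then sleeps, so one cycle takes response time plus think time. The function names and the numbers are hypothetical. I have not measured them.

```python
def implied_think_time(per_client_tp, response_time_s):
    """Think time per transaction implied by a per-client rate limit.

    Assumes one cycle = response time + think time, and that the client
    completes per_client_tp cycles per second.
    """
    cycle_s = 1.0 / per_client_tp
    think_s = cycle_s - response_time_s
    if think_s < 0:
        raise ValueError("client cannot reach this rate at this response time")
    return think_s


def idle_fraction(per_client_tp, response_time_s):
    """Fraction of each cycle that the client spends thinking, not waiting."""
    return implied_think_time(per_client_tp, response_time_s) * per_client_tp


if __name__ == "__main__":
    # Hypothetical: 750 TPS per client and 0.5 ms per transaction.
    print(implied_think_time(750.0, 0.0005))
    print(idle_fraction(750.0, 0.0005))
```

With those made-up numbers each client is idle for more than half of every cycle, which is the sense in which I call this a test with a lot of think time.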

Update:

Using this approach, I can demonstrate linear scale-up on CPU-bound workloads for a 16-core server with the sysbench read-only test. This requires the Google or Percona patches. The results are here.

2 comments:

  1. Some of the performance characteristics are a result of how Solaris/SPARC behaves with respect to its design of mutexes and condvars.

    IIRC, when a mutex is unlocked or a condvar is signalled, the releasing thread chooses the waiter that it will wake up, and ownership of the mutex is passed if possible. Execution continues until the thread completes its quantum and/or enters the kernel for I/O. At that time, if the thread chosen to be awakened has a higher priority, a context switch occurs and that thread executes the remainder of the current quantum. Otherwise, the thread is scheduled to run on an adjacent processor. This reduces the number of user/kernel switches and fits well with Solaris's 1:1 scheduling (they abandoned their M:N scheduling a few years back, which would have allowed the context switch to occur immediately in userland).
    Sun has a paper on their design somewhere. This is based only on the few tidbits that I remember, so I may be wildly inaccurate.
    To anyone who is more knowledgeable, please correct me where I am wrong.

  2. The steps that I listed enable me to show linear scale-up independent of the choice of server and OS.


 
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.