Sunday, December 14, 2008

Make MySQL faster in one hour

There may be a simple way to improve InnoDB performance on SMP servers. The benefit is not as great as that obtained by using the smpfix from Google or Percona, but the change is much simpler. It is also portable, so it should make MySQL faster on non-x86 platforms too.

I want to hear from people who test this with real workloads on servers with 8+ cores or who do any type of testing on platforms other than Linux/x86. The patch for MySQL 5.0 is at code.google.com.

The change is to replace the mutex_t used in the InnoDB rw_lock_struct with a pthread_mutex_t. Calls to lock, unlock, create and destroy rw_lock_struct::mutex in sync0rw.c must also be updated.
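As a rough illustration of the shape of that change, here is a minimal C sketch. It is not the InnoDB source or the patch itself: `demo_rw_lock_t` and every `demo_*` function are hypothetical, simplified stand-ins, and the try-lock semantics are an assumption made to keep the example short. The only point is that create, lock, unlock and destroy on the internal mutex map directly onto the pthread calls.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical, simplified stand-in for InnoDB's rw_lock_struct. */
typedef struct demo_rw_lock {
    pthread_mutex_t mutex;   /* the change swaps InnoDB's mutex_t for this */
    int             readers; /* number of shared holders */
    int             writer;  /* 1 while an exclusive holder exists */
} demo_rw_lock_t;

/* create: initialise the state and the pthread mutex (default attributes). */
int demo_rw_lock_create(demo_rw_lock_t *lock)
{
    lock->readers = 0;
    lock->writer = 0;
    return pthread_mutex_init(&lock->mutex, NULL);
}

/* destroy: release the pthread mutex. */
int demo_rw_lock_free(demo_rw_lock_t *lock)
{
    return pthread_mutex_destroy(&lock->mutex);
}

/* Try to take a shared lock. Returns 1 if granted, 0 if a writer holds it. */
int demo_s_lock_nowait(demo_rw_lock_t *lock)
{
    int granted = 0;
    pthread_mutex_lock(&lock->mutex);
    if (!lock->writer) {
        lock->readers++;
        granted = 1;
    }
    pthread_mutex_unlock(&lock->mutex);
    return granted;
}

/* Try to take an exclusive lock. Returns 1 if granted, 0 otherwise. */
int demo_x_lock_nowait(demo_rw_lock_t *lock)
{
    int granted = 0;
    pthread_mutex_lock(&lock->mutex);
    if (!lock->writer && lock->readers == 0) {
        lock->writer = 1;
        granted = 1;
    }
    pthread_mutex_unlock(&lock->mutex);
    return granted;
}

/* Release a shared lock previously granted by demo_s_lock_nowait. */
void demo_s_unlock(demo_rw_lock_t *lock)
{
    pthread_mutex_lock(&lock->mutex);
    assert(lock->readers > 0);
    lock->readers--;
    pthread_mutex_unlock(&lock->mutex);
}

/* Release an exclusive lock previously granted by demo_x_lock_nowait. */
void demo_x_unlock(demo_rw_lock_t *lock)
{
    pthread_mutex_lock(&lock->mutex);
    assert(lock->writer == 1);
    lock->writer = 0;
    pthread_mutex_unlock(&lock->mutex);
}
```

In this sketch a caller that is not granted the lock simply gets 0 back. What the real code does at that point (spin, then wait) is outside the scope of the example.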

InnoDB implements a mutex (mutex_t) and a read-write lock (rw_lock_struct). Both of these spin when a lock cannot be granted. On my platforms, the code spins for about 4 microseconds and then the thread waits on a condition variable. rw_lock_struct uses mutex_t to protect its internal state. I think that InnoDB is faster on SMP when pthread_mutex_t is used in place of mutex_t for rw_lock_struct::mutex.

The following describes the overhead from the use of the InnoDB mutex when there is contention. A thread that must sleep waiting for a lock does:
  • spin for a few microseconds trying to get the lock
  • reserve a slot in the sync array (one lock/unlock of the global sync array pthread_mutex_t)
  • reset an event (lock/unlock the event pthread_mutex_t)
  • wait on the event (lock/unlock the global sync array pthread_mutex_t, lock the event pthread_mutex_t, wait on a pthread_cond_t)
There are 4 pthread_mutex_lock calls and 3 pthread_mutex_unlock calls on this codepath, and 2 of the lock calls are for a global mutex, which can be another source of contention. All of this can be replaced with a single pthread_mutex_lock/pthread_mutex_unlock pair when rw_lock_struct::mutex is changed to use a pthread mutex.
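The tally above can be reproduced with a small C sketch that walks the same steps using counted wrappers around the pthread calls. This is only a model of the sequence described here, not InnoDB code: `demo_event_t`, `demo_sync_array_mutex`, `demo_contended_path` and the helper names are all hypothetical, the spin step is omitted, and a helper thread plays the lock holder that eventually signals the event.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical model of an event: a mutex, a condition variable and a flag. */
typedef struct demo_event {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             is_set;
} demo_event_t;

typedef struct demo_counts {
    int locks;         /* pthread_mutex_lock calls made by the waiter */
    int unlocks;       /* pthread_mutex_unlock calls before it sleeps */
    int global_locks;  /* how many of the locks hit the global mutex */
} demo_counts_t;

/* Stand-in for the single global mutex protecting the sync array. */
static pthread_mutex_t demo_sync_array_mutex = PTHREAD_MUTEX_INITIALIZER;

static void counted_lock(pthread_mutex_t *m, demo_counts_t *c)
{
    pthread_mutex_lock(m);
    c->locks++;
    if (m == &demo_sync_array_mutex)
        c->global_locks++;
}

static void counted_unlock(pthread_mutex_t *m, demo_counts_t *c)
{
    pthread_mutex_unlock(m);
    c->unlocks++;
}

/* Plays the lock holder: sets the event so the waiter can wake up. */
static void *demo_release(void *arg)
{
    demo_event_t *ev = arg;
    pthread_mutex_lock(&ev->mutex);
    ev->is_set = 1;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->mutex);
    return NULL;
}

/* Walks the contended path once and returns the waiter's call counts. */
demo_counts_t demo_contended_path(void)
{
    demo_counts_t c = {0, 0, 0};
    demo_event_t ev;
    pthread_t holder;
    int rc;

    ev.is_set = 0;
    pthread_mutex_init(&ev.mutex, NULL);
    pthread_cond_init(&ev.cond, NULL);

    /* Spin step omitted: assume the spin already failed. */

    /* Reserve a slot in the sync array (global mutex). */
    counted_lock(&demo_sync_array_mutex, &c);
    counted_unlock(&demo_sync_array_mutex, &c);

    /* Reset the event (event mutex). */
    counted_lock(&ev.mutex, &c);
    ev.is_set = 0;
    counted_unlock(&ev.mutex, &c);

    /* Start the "holder" only after the reset so the wake-up is not lost. */
    rc = pthread_create(&holder, NULL, demo_release, &ev);
    assert(rc == 0);
    (void)rc;

    /* Wait on the event: global mutex again, then event mutex + condvar. */
    counted_lock(&demo_sync_array_mutex, &c);
    counted_unlock(&demo_sync_array_mutex, &c);
    counted_lock(&ev.mutex, &c);
    while (!ev.is_set)
        pthread_cond_wait(&ev.cond, &ev.mutex);
    pthread_mutex_unlock(&ev.mutex); /* after wake-up; not part of the tally */

    pthread_join(holder, NULL);
    pthread_cond_destroy(&ev.cond);
    pthread_mutex_destroy(&ev.mutex);
    return c;
}
```

Calling `demo_contended_path()` fills in the three counters for the waiting thread only. The final unlock after wake-up is deliberately left uncounted, since the tally in the text stops at the point where the thread goes to sleep.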

Of course, you shouldn't take my word for it so I will provide a few results. These were measured on an 8-core x86 server that used Linux 2.6. Three mysqld binaries were tested:
  • base - MySQL 5.0.37 and the Google patch excluding the smpfix changes
  • smpfix+tcmalloc - MySQL 5.0.37 and the Google patch including the smpfix changes and linked with tcmalloc
  • pthread_mutex - base with rw_lock_struct::mutex changed to use pthread_mutex_t
Results for sysbench --test=oltp --oltp-read-only. This displays transactions per second for sysbench run with 1, 2, 4, 8, 16, 32 and 64 concurrent users.

[Graph: sysbench read-only transactions per second for the three binaries]

Results for sysbench --test=oltp --oltp-read-write. This displays transactions per second for sysbench run with 1, 2, 4, 8, 16, 32 and 64 concurrent connections.

[Graph: sysbench read-write transactions per second for the three binaries]

Results for concurrent queries. Each query is a primary key - foreign key join between tables that each have 2M rows. Too long means it ran for 10s of minutes and I killed it. This displays the time in seconds to complete the query for 1, 2, 4, 8 and 16 concurrent users.

Binary            1 user   2 users   4 users   8 users   16 users
base              2.6      3.9       8.1       182.5     Too long
smpfix+tcmalloc   2.6      3.7       4.9       7.6       15.2
pthread_mutex     2.5      3.7       9.1       27.8      58.6

Results for concurrent inserts. Each user does a sequence of insert statements to a different table. Too long means it ran for 10s of minutes and I killed it. This displays the time in seconds to complete the inserts for 1, 2, 4, 8 and 16 concurrent users.

Binary            1 user   2 users   4 users   8 users    16 users
base              15.5     32.4      78.2      Too long   Too long
smpfix+tcmalloc   12.6     21.5      40.5      112.4      232.9
pthread_mutex     13.5     23.8      76.0      378.7      Too long

9 comments:

  1. Interesting...just wondering - how is "throughput" measured? ... please add units to x-axis...

    Roland

  2. Throughput for sysbench is transactions per second. The concurrent join and insert tests report the number of seconds to complete the SQL statements.

  3. The graphics show reduced throughput with pthread_mutex_t. If the numbers in the tables are seconds, then they also show that using pthread_mutex_t makes it slower. Am I reading the data incorrectly? If not, what is the point?

  4. The graphics show that sysbench throughput for pthread_mutex_t almost matches that for smpfix+tcmalloc while performance for the base case drops dramatically at 8 concurrent users.

  5. Mark,

    Is there any reason that smpfix+tcmalloc+pthread_mutex_t were not tested together?

  6. smpfix and pthread_mutex_t are mutually exclusive. pthread_mutex_t is an attempt to get some of the benefit provided by smpfix with a much simpler code change.

  7. "Mark Callaghan showed us how to make MySQL faster in one hour. Nice stuff. And real purty charts, too."

    Log Buffer #128

  8. How to make MySQL much faster in 3 seconds: replace InnoDB with MyISAM :))


 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.