Thursday, May 28, 2009

InnoDB performance TODO list

These are my plans for making InnoDB faster on SMP and high-IOPS servers. I think we can double throughput at high levels of concurrency.

Future work:
  1. Reduce the size of mutex and rw-lock structures
  2. Reduce contention on the sync array mutex
  3. Reduce contention on kernel_mutex
  4. Reduce contention on commit_prepare_mutex
  5. Reduce the number of mutex lock/unlock calls used when a thread is put on the sync array
  6. Name all events, rw-locks and mutexes in InnoDB to make contention statistics output useful
  7. Add optional support to time all operations that may block
  8. Replace dulint with native 64-bit integer types
  9. Make BUF_READ_AHEAD_AREA a compile-time constant
  10. Prevent full table scans from wiping out the InnoDB buffer cache
  11. Make prefetching smarter
  12. Get feedback from Dimitri, Domas, Mikael and Percona
  13. Use prefetch with MRR/BKA to get parallel IO in InnoDB
  14. Investigate larger doublewrite buffer to allow for more concurrent IOs
  15. Make InnoDB work with a 4 KB page size
  16. Make trx_purge() faster when called by the main background thread
  17. Use CRC32 for InnoDB page checksums with hardware support, or otherwise make checksums faster
  18. Reduce the per-page overhead for sync objects
  19. Repeat
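
As a rough illustration of item 17, here is a minimal sketch of stamping and verifying a page with a CRC32 checksum. This is not InnoDB's actual page format: the page size, the position of the checksum field, and the function names are all assumptions made for the example. Also note that `zlib.crc32` is the standard software CRC-32, while the x86 hardware instruction computes CRC-32C (a different polynomial), so this only shows the stamp-and-verify flow, not the hardware-assisted path.

```python
import struct
import zlib

PAGE_SIZE = 16 * 1024   # assumed page size for this sketch
CHECKSUM_OFFSET = 0     # hypothetical: checksum stored in the first 4 bytes
CHECKSUM_LEN = 4


def page_checksum(page):
    """Return CRC-32 of the page, skipping the checksum field itself."""
    if len(page) != PAGE_SIZE:
        raise ValueError("unexpected page size")
    return zlib.crc32(bytes(page[CHECKSUM_LEN:])) & 0xFFFFFFFF


def stamp_page(page):
    """Write the checksum into the page header before the page is flushed."""
    struct.pack_into(">I", page, CHECKSUM_OFFSET, page_checksum(page))


def page_is_valid(page):
    """Recompute the checksum after a read and compare with the stored one."""
    (stored,) = struct.unpack_from(">I", page, CHECKSUM_OFFSET)
    return stored == page_checksum(page)
```

A page would be stamped just before it is written and checked right after it is read back; any mismatch means the page was corrupted somewhere in between.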
Current work:
  1. Add my.cnf options to disable InnoDB prefetch reads
  2. Put more output in SHOW INNODB STATUS and SHOW STATUS
  3. Reduce the overhead from buf_flush_free_margin()
  4. Change background IO threads to use available IO capacity
  5. Use more IO to merge insert buffer records when the insert buffer is full
Non-InnoDB work:
  1. Fix mutex contention for the HEAP engine
  2. Fix mutex contention for the MyISAM engine
  3. Fix mutex contention for the query cache
  4. Give priority (CPU, disk) to the replication SQL thread to minimize replication delay
  5. Push changes for --oltp-secondary-index to public sysbench branch
  6. Add support to sysbench fileio for transaction log and doublewrite buffer IO patterns

5 comments:

  1. Mark, one theoretical item I've thought about after reading the code, but never really proven to be a problem -- the way the log buffer is written and flushed. I wonder if the fact that flushing blocks log buffer writes, and then moves a bunch of data around (the remaining unflushed stuff), is a problem.

  2. I have not seen that to be a problem (yet). Ben has changes in the v3 patch that reduce mutex contention between log_sys->mutex and buffer_pool->mutex.

  3. Mark,

    First, let me say how much we at Innobase appreciate your interest, support and investment in InnoDB. Please don't hesitate to contact us as you work on these things.

    While we can't a priori agree to incorporate any specific patch that anyone may come up with, we are definitely interested in looking at all potential contributions that can improve performance or functionality.

    We have to consider a number of things, of course, including making sure the patch applies to the InnoDB Plugin (not just the builtin InnoDB), and we do need to make patches portable (because users run on platforms other than Linux). We are more than willing to do that work, of course, but we have to apply our resources selectively.

    One thing that would help a lot is if we could see a demonstration of proposed patches in isolation, just one at a time. When we see a group of patches all tested at once, we have no way to tell whether any given patch makes a significant contribution or not. We would look for a "scientific method", where only ONE thing is changed in a controlled experiment.

    In performing these tests, it is helpful if the baseline is relevant too ... when we see results of some workload running on an unpatched 5.0 compared to a heavily patched 5.1, it's really hard to tell the difference any given patch makes.

    We also need to see concrete evidence of performance gains for individual patches, not just a rational / theoretical argument about how things are supposed to work. I know you know the subtleties here and recognize that sometimes a change may have unexpected (or unwanted) effects.

    It also helps to know which workloads are improved, which may not be affected ... and which might be negatively impacted.

    Finally, because multiple patches could address similar or related areas, a disciplined approach is to investigate patches in order of the magnitude of their improvement. We would like to incorporate a patch that provides the most significant benefit before we consider others. It may turn out that the contribution of one patch means that another patch is not useful or necessary.

    All that said, again thank you, and keep up the good work!

  4. One more item: make auto_increment persistent, so that we don't lose the latest value on restart.
    (I know it's not "performance" related, but the excuse I've been hearing for years is that this is a performance issue.)

  5. Horace,

    That is a great request, but I won't do it. Maybe Percona or Innobase will.


 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.