Wednesday, August 12, 2009

A reason to use 5.1

We now have 2 great storage engines for 5.1 -- InnoDB 1.0.4 and XtraDB. We need more performance results to understand InnoDB 1.0.4, but it looks excellent from the code I have reviewed. This describes some of the changes based on a brief review. All of this makes my work easier as I can reduce the size of the patch I need to maintain extreme performance with MySQL.

Kudos to InnoDB for delivering these features in 5.1 and to Percona and Google for contributing patches.
  1. support for more background IO threads - InnoDB and XtraDB support a configurable number of background IO threads for prefetch reads and dirty page writes. The my.cnf parameters are innodb_read_io_threads and innodb_write_io_threads. Prefetch read requests are generated during queries and when insert buffer entries must be merged. For InnoDB, IO requests are hashed by extent number (64 16KB pages per extent) to the per-thread request queues, although when a request queue is full the request will use any queue. Each queue can hold 256 pending requests. I assume the code in XtraDB is the same. The Google patch uses one queue for all read or write threads, which should provide better throughput when there are hot extents but also requires many more changes to the current source.
  2. support for group commit - not only does this fix an old regression, I think that it also fixes bug 46459 which degrades performance when autocommit insert statements are used on tables with an auto increment column.
  3. adaptive flushing - one of the things that makes InnoDB great is the use of adaptive algorithms to keep the server balanced. Many of these are not documented because they work and we don't need to know about them. They may have added a new one with support for adaptive flushing. I hope to see performance results from Percona for this, but I think this is another thing in InnoDB we can soon forget as it will work without problems.
  4. readahead - prior to 1.0.4, InnoDB could generate read prefetch requests when it detected sequential or random access to most pages in an extent. For 1.0.4, the use of readahead for random access to pages within an extent appears to have been removed. The use of readahead for sequential access to pages within an extent has been changed to use a new my.cnf parameter, innodb_read_ahead_threshold, that sets the number of pages that must be accessed sequentially within an extent before all of the pages in the physically adjacent extent will be prefetched. I am still not fond of this feature because:
    • I am not aware of any performance counters that report on the success of readahead (#fetched versus #fetched_and_used). But you can disable readahead now and measure the impact on your application.
    • Prefetch requests for the pages in the next extent are generated late. For example, if innodb_read_ahead_threshold=56, then requests are generated when the 56th (out of 64) page in the current extent is used.
    • If request merging is done for all of the pages in the next extent, then a 1MB read will be used and none of the pages can be accessed until the read completes.
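
The request routing described in item 1 can be sketched in a few lines of Python. This is a toy model, not InnoDB code: the class and method names are hypothetical, and the modulo hash is an assumption, since the post only says that requests are hashed by extent number. The 64 pages per extent and the 256-request queue capacity come from the post.

```python
from collections import deque

PAGES_PER_EXTENT = 64   # from the post: 64 16KB pages per extent
QUEUE_CAPACITY = 256    # from the post: each queue holds 256 pending requests


class IoQueues:
    """Toy model of per-thread IO request queues (hypothetical, not InnoDB code)."""

    def __init__(self, n_threads):
        self.queues = [deque() for _ in range(n_threads)]

    def submit(self, page_no):
        """Queue a request; return the index of the queue used, or None if all are full."""
        extent_no = page_no // PAGES_PER_EXTENT
        preferred = extent_no % len(self.queues)  # assumption: simple modulo hash
        # Try the preferred queue first, then fall back to any queue with room.
        for offset in range(len(self.queues)):
            idx = (preferred + offset) % len(self.queues)
            if len(self.queues[idx]) < QUEUE_CAPACITY:
                self.queues[idx].append(page_no)
                return idx
        return None


q = IoQueues(n_threads=4)
print(q.submit(0))    # page in extent 0 -> queue 0
print(q.submit(63))   # still extent 0  -> queue 0
print(q.submit(64))   # extent 1        -> queue 1
```

In this model every request for a hot extent lands on the same queue, and so on one thread, until that queue fills. That is the case where a single shared queue should do better.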

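The readahead trigger can also be sketched as a toy model. Everything here is hypothetical and simplified: the function names are made up, and counting pages accessed in ascending order is only a stand-in for whatever access-pattern check InnoDB really performs. Following the comment from the InnoDB side below, the check runs only when the last page of an extent is accessed, not at the 56th page.

```python
PAGE_SIZE_KB = 16
PAGES_PER_EXTENT = 64
EXTENT_SIZE_KB = PAGE_SIZE_KB * PAGES_PER_EXTENT  # 64 * 16KB = 1024KB = 1MB


def count_in_order(access_order):
    """Count accesses whose page number is higher than the previous access.

    Assumption: a crude stand-in for InnoDB's real sequential-access check.
    """
    count, prev = 0, -1
    for page_no in access_order:
        if page_no > prev:
            count += 1
        prev = page_no
    return count


def extent_to_prefetch(access_order, threshold=56):
    """Return the extent number to prefetch, or None.

    access_order: page numbers accessed so far within one extent, oldest first.
    threshold: models innodb_read_ahead_threshold (56 is the post's example).
    """
    if not access_order:
        return None
    last = access_order[-1]
    # Check the pattern only at the boundary (last) page of the extent.
    if last % PAGES_PER_EXTENT != PAGES_PER_EXTENT - 1:
        return None
    if count_in_order(access_order) < threshold:
        return None
    return last // PAGES_PER_EXTENT + 1  # the physically adjacent extent


print(extent_to_prefetch(list(range(56))))  # None: page 55 is not the boundary page
print(extent_to_prefetch(list(range(64))))  # 1: prefetch extent 1 (a 1MB read if merged)
```
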
6 comments:

  1. While I generally agree with you, I tend to tell customers to stick to 5.0 unless they have a clear need for 5.1. We are still running into crazy issues with 5.1 that leave us both disappointed and scratching our heads. These are not so much related to InnoDB or XtraDB. That seems pretty solid. It's the other stuff, such as random issues with RBR, query_cache problems, partitioning weirdness, even loss of data in some cases.

    I still consider it to be an early-adopter release for at least most of the customers I deal with. As long as people are aware of the limitations and potential issues, it will likely serve them well. I personally run 5.1 on my servers but, then again, my servers don't really make any money :)

  2. I can see a lot of advantages of 5.1. Row-based replication is certainly a major one.

    I would definitely use 5.1 for newly developed applications, and would consider upgrading to 5.1 if the new features gave a clear benefit (depending on the QA workload involved).

    I recently did a very complicated migration from 4.1 to 5.0 (we started it just before 5.1 went GA), but it looks like 5.0 to 5.1 is less painful.

  3. Mark,

    First a word about multiple IO threads. The Google patch, as far as we had a chance to look at it, would have broken native AIO on Windows. On Windows, multiple IO threads have been around since day one, and these threads wait on native wait events defined for the local segment of the IO array. In other words, multiple threads calling WaitForMultipleObjects() on the same global event array would have resulted in undefined behavior. This, in addition to the extra complexity you mentioned, was one of the reasons we chose to keep the idea of local segments intact.

    Regarding readahead, having extra stats is definitely a step in the right direction, but we'd like to think it through. For example, does it make sense to have the readahead hits listed at a global level or a per-table level?

    InnoDB actually issues a readahead call when it reads the boundary page. So in the example you have given above, the readahead call won't be issued when the 56th page is accessed; it will be issued when the last page of the current extent is accessed (even when innodb_read_ahead_threshold = 56). This is because checking the access pattern is a non-trivial task involving acquisition of the buffer_pool mutex. It can be too expensive if performed at each access.
    Coalescing, I think, is a good thing, and even if a readahead request is 1MB it should not take much more than a single 16K random read. Extent pages are made available as and when the IO thread calls the completion routine on them. If you have data showing that coalescing is causing problems by making threads wait for the first page while the other pages are being read in, please do share it with us. We'll be happy to look into that and work towards a solution.

    Thanks for your nice comments about plugin 1.0.4!

  4. @Inaam - didn't I leave a comment in the code warning that I don't do Windows?

    I am adding more instrumentation to a Percona build to measure the distribution of request handling across threads and will publish results, but the real benefit from that is to get data from many users.

  5. @Tim,

    There are definitely residual issues with RBR, for example intermittent binlog corruption. :( That said, RBR helps quite a bit in dealing with charsets, binary data, sql mode, etc. These are areas where some of the statement replication behavior is either confusing or frankly inexplicable. We parse the logs directly for Tungsten Replicator, hence have seen several of these problems up close and personal.

  6. "Mark Callaghan of High Availability MySQL has found a reason to use 5.1: 'I can reduce the size of the patch I need to maintain extreme performance with MySQL.' Mark also has some remarks on four new features from InnoDB, Percona, and Google. [...]

    Log Buffer #158


 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.