Thursday, September 2, 2010

Speaking at MySQL Sunday

I am speaking at MySQL Sunday. The title for my talk is Success with MySQL and I will focus on things that  operations and users can do to make a MySQL deployment succeed. There are many interesting talks scheduled for Sunday, including several at the same time as mine. I hope to see you there.

Sunday, July 25, 2010

Why don't you use X?

Sometimes I am asked why don't I use X instead of official MySQL. The answer is simple. I like to use it because I have been using it, the MySQL development team (including InnoDB) has done great work this year and because change is expensive. The cost of change includes the cost of evaluating the alternatives and the cost of deploying them. The cost of change also includes the features I won't work on because I am doing an evaluation. I also use it because the quality of new 5.1 releases has been very high this year. I know because I test them and some of the alternatives.

My initial evaluation criteria are simple. I don't like compiler or valgrind warnings. The alternative should not introduce new ones. I like regression tests. The alternative should not disable or fail existing tests. If the existing test is somewhat bogus, then it should be fixed. I love buildbot as done by MariaDB, fixes in official MySQL to reduce compiler warnings and all of the work done by Drizzle to not tolerate compiler warnings. When the alternative adds new features it must add new regression tests (hooray for status_user.test in MariaDB).

I spent a lot of time debugging valgrind warnings that occur in MySQL 5.1.47. All of them were bogus and I think future versions of MySQL will prevent these. That is good news as I prefer to not repeat that effort the next time I upgrade to a new MySQL release.

Percona and MariaDB confront the same issues and more when considering code to incorporate. In addition to what I wrote about above, they must review patches to make sure the code is reasonable. There is a lot of duplicate effort done by groups that patch or fork MySQL. I wish this weren't the case. If the external fork & patch effort is consolidated around MariaDB, then we can reduce some of this.

Friday, July 23, 2010

Building MariaDB with the InnoDB plugin

This post was inspired by a couple events. But I won't explain them other than to say I think there have been too many subjective comments (or FUD) about quality. This is an attempt to quantify whether the grass is greener on the other side.

Today I tried to build MariaDB with the InnoDB plugin. I was told his is now supported. The last time I checked XtraDB replaced the InnoDB plugin in MariaDB. MPAB had a reasonable reason for doing this as they don't want to test both the InnoDB plugin and XtraDB. But I prefer choice, despite the many great features in XtraDB.

First I tried using the 5.3 release. That failed fast. I prefer fast failures over obscure ones:

./configure --enable-thread-safe-client --with-plugins=partition,csv,blackhole,myisam,heap,innodb_plugin --without-plugin-innobase --with-fast-mutexes --with-extra-charsets=all --with-debug C_EXTRA_FLAGS="-fno-omit-frame-pointer -fno-strict-aliasing -Wall"
...

configure: error: unknown plugin: innodb_plugin

Then I tried the 5.2 MariaDB release. This time the configure command worked. After running make, both XtraDB and the InnoDB were compiled. Time to try a test. There were no failures! Alas, all tests were skipped.
./mysql-test-run.pl --suite=innodb_plugin
...
innodb_plugin.innodb                     [ skipped ]  No innodb support
innodb_plugin.innodb-analyze             [ skipped ]  No innodb support
innodb_plugin.innodb-autoinc             [ skipped ]  No innodb support
innodb_plugin.innodb-autoinc-44030       [ skipped ]  No innodb support
innodb_plugin.innodb-consistent          [ skipped ]  No innodb support
innodb_plugin.innodb-index               [ skipped ]  No innodb support
innodb_plugin.innodb-index_ucs2          [ skipped ]  No innodb support
innodb_plugin.innodb-lock                [ skipped ]  No innodb support
innodb_plugin.innodb-replace             [ skipped ]  No innodb support
innodb_plugin.innodb-semi-consistent     [ skipped ]  No innodb support
innodb_plugin.innodb-timeout             [ skipped ]  No innodb support
innodb_plugin.innodb-use-sys-malloc      [ skipped ]  No innodb support
innodb_plugin.innodb-zip                 [ skipped ]  No innodb support
innodb_plugin.innodb_bug21704            [ skipped ]  No innodb support
innodb_plugin.innodb_bug34053            [ skipped ]  No innodb support
innodb_plugin.innodb_bug34300            [ skipped ]  No innodb support
innodb_plugin.innodb_bug35220            [ skipped ]  No innodb support
innodb_plugin.innodb_bug36169            [ skipped ]  No innodb support
innodb_plugin.innodb_bug36172            [ skipped ]  No innodb support
innodb_plugin.innodb_bug38231            [ skipped ]  No innodb support
innodb_plugin.innodb_bug39438            [ skipped ]  No innodb support
innodb_plugin.innodb_bug40360            [ skipped ]  No innodb support
innodb_plugin.innodb_bug40565            [ skipped ]  No innodb support
innodb_plugin.innodb_bug41904            [ skipped ]  No innodb support
innodb_plugin.innodb_bug42101            [ skipped ]  No innodb support
innodb_plugin.innodb_bug42101-nonzero    [ skipped ]  No innodb support
innodb_plugin.innodb_bug44032            [ skipped ]  No innodb support
innodb_plugin.innodb_bug44369            [ skipped ]  No innodb support
innodb_plugin.innodb_bug44571            [ skipped ]  No innodb support
innodb_plugin.innodb_bug45357            [ skipped ]  No innodb support
innodb_plugin.innodb_bug46000            [ skipped ]  No innodb support
innodb_plugin.innodb_bug46676            [ skipped ]  No innodb support
innodb_plugin.innodb_bug47167            [ skipped ]  No innodb support
innodb_plugin.innodb_bug47621            [ skipped ]  No innodb support
innodb_plugin.innodb_bug47622            [ skipped ]  No innodb support
innodb_plugin.innodb_bug47777            [ skipped ]  No innodb support
innodb_plugin.innodb_bug51378            [ skipped ]  No innodb support
innodb_plugin.innodb_bug51920            [ skipped ]  No innodb support
innodb_plugin.innodb_bug52663            [ skipped ]  No innodb support
innodb_plugin.innodb_bug52745            [ skipped ]  No innodb support
innodb_plugin.innodb_file_format         [ skipped ]  No innodb support
innodb_plugin.innodb_information_schema  [ skipped ]  No innodb support
innodb_plugin.innodb_trx_weight          [ skipped ]  No innodb support

Then I checked for compiler warnings. I really dislike compiler warnings. MySQL has recently done a lot of work to remove them from 5.1 (thanks Davi). I think all of this work was considered one bug, but it was a lot of work and will make MySQL better. They have also begun to do some builds with Werror. See bug 53445 for all of the details.

After compiling storage engines with -Wall, there are no warnings for the official InnoDB plugin (storage/innodb_plugin). There are warnings for XtraDB (storage/xtradb):

btr/btr0btr.c:2871: warning: null argument where non-null required (argument 1)
btr/btr0cur.c:1841: warning: null argument where non-null required (argument 2)
btr/btr0cur.c:1860: warning: null argument where non-null required (argument 1)
btr/btr0cur.c:1967: warning: null argument where non-null required (argument 1)
fil/fil0fil.c:3106: warning: pointer targets in passing argument 2 of ‘dict_table_get_index_on_name’ differ in signedness
ibuf/ibuf0ibuf.c:775: warning: null argument where non-null required (argument 1)
ibuf/ibuf0ibuf.c:950: warning: null argument where non-null required (argument 1)
os/os0file.c:4194: warning: pointer targets in assignment differ in signedness
I want to compare the results from running all regression tests with valgrind. But that might take some time. I am able to run all InnoDB tests without valgrind warnings using the Facebook-patched MySQL 5.1.47. That required a few small changes that are likely in the recent 5.1.49 release. 

Have all regression tests been run under valgrind for MariaDB using either the InnoDB or XtraDB plugin?

Stay tuned for part 2.

Wednesday, July 21, 2010

Conversation starters for OSCON

I will be at OSCON in a few hours. Mohan and I have a talk on FlashCache on Thursday. The talk will have lots of details on the FlashCache implementation. I expect to be quiet except for a few slides on performance. Mohan and Paul did an amazing job getting FlashCache running on our servers. This is an opportunity to learn from Mohan.

As Percona has been doing a lot of work with it, I hope they will be at OSCON to discuss their experience with it.

There are other interesting things to talk about if your area is data management:

  • MySQL is doing great for their customers this year. I began using MySQL in 2005. This has been the best year for me. The 5.1 release is excellent and the 5.5 beta looks great. The MySQL development team has been fixing bugs fast. Merging InnoDB and MySQL into one company means that their development teams work better, fix bugs faster and talk more about new features.
  • Your SQL, not MySQL, is frequently the problem. Sometimes we make MySQL deployments better by hacking on MySQL. More often improvement comes from changing application SQL. While some of the application changes compensate for less than perfect behavior in MySQL, more are done to fix things that would be a problem for any database. The biggest thing that MySQL needs to fix in this area is monitoring. It must make it easier to identify performance problems. Until then tcpdump, Poor Mans Profiler and mk-query-digest are excellent options.
  • Where did Java go wrong? MySQL has a wonderful JDBC driver. I don't blame the implementation. But Java clients continue to cause too many database problems for me. I recently logged all SQL on a server and noticed that the JDBC client connected, ran 13 statements to prepare the connection (including 5 set autocommit statements) and then ran 1 query. That is an amazing amount of bloat. I have seen many other cases where preparing/returning a connection from/to the pool required 5 to 10 statements. Given that number of round trips between the client and server it isn't likely that the connection pool saves any overhead on the database. I am currently dealing with Java applications that set tx_isolation to read-committed for InnoDB. With MySQL 5.1 all binlog events written for such a connection must use row-based replication. For now I will assume that most of the Java apps use read-committed because they want to rather than because they need to.
  • Just because your database is sharded doesn't mean you lose joins. You lose the ability to do joins or enforce foreign keys across all of your data. But lots of interesting queries can be run within one shard. I prefer that long running queries using something other than MySQL as they will run much faster elsewhere.

Tuesday, July 13, 2010

A new book on MySQL replication and HA

There is a new book on MySQL replication and HA from Charles, Mats and Lars. I read it as a reviewer and learned more than a few things. It has many details on internals that are not described elsewhere unless you are willing to read the source code. It also describes how to deploy MySQL replication for many use cases. I think the book can save people from some failures that are inevitable when a distributed system is deployed for enough time on enough servers.

Wednesday, May 5, 2010

Best practices

Back in the day we wrote C without support for type checking. Many bugs were missed because of this. Someone wrote lint and many bugs were prevented. Not everyone takes advantage of tools that prevent easily fixed errors. MySQL builds are done without using the -Wall option in gcc to generate more warnings. Nor do they use the -Werror option to fail on warnings. This allows for some silly things in production releases like bug 51289 (return NULL for a function that is declared to return double):
double Item_cache_decimal::val_real()
{
  DBUG_ASSERT(fixed);
  double res;
  if (!value_cached && !cache_value())
    return NULL;
Not using -Wall and -Werror allows for more serious problems to be missed. Bug 42733 is an example of one such problem. It also has allowed me to miss problems in code that I change.

I filed bug 53445 for this. MySQL has been very good at fixing things lately. This should also get fixed. Visit the bug and subscribe to it or update it.

Searching for "site:bugs.mysql.com compiler warning" finds a lot of entries.

Sunday, April 25, 2010

Consistency across a WAN

There are three solutions for providing consistency in a data service that operates across a wide area network (WAN). None of them are free. What are you willing to pay and where are you willing to add complexity? Depending on what you choose your system can be more complex for external users, internal application developers or operations. The choices are multi-master with conflict resolution (eventual consistency), multi-master with conflict prevention (strong consistency) and single master with downtime on failover.

If you choose eventual consistency (EC) then internal application developers must write logic to resolve conflicts and external users will occasionally encounter inconsistent data. This might be a small price to pay for a system that provides higher availability and transparent failover. I am not aware of support for secondary indexes in the popular EC systems. I wonder if the same logic that does eventual consistency across a WAN might be reused to keep secondary indexes eventually consistent within a datacenter. That would impose an additional cost on internal application developers in return for expanding the workloads that EC can support.

If you choose strong consistency then external users experience more latency on writes as the transaction commit requires one or two round trips across a WAN. This might be a small price to pay for a system that provides higher availability and transparent failover. Galera is doing interesting work in this area for MySQL and they have already begun to publish results. I need to read more about that.

If you choose single master then you will spend more money to make that master less likely to fail. You will also experience more downtime and higher support costs while doing manual failover as quickly as possible. Solutions include RAID 10, battery backed write cache, highly-available SAN/NFS, DRBD and pagers for your operations team.

I don't know if people choose single master in the MySQL community. There are not many choices. It supports multi-master replication but without conflict resolution. It supports strong consistency with Galera but that is new on the MySQL market. Galera might be the killer application for MariaDB. Tungsten is another product that can reduce the complexity of master-slave replication.

Unless you are using Tungsten, it very hard to automate master failover for MySQL when there is more than one slave per master. But many deployments need a master and slave in one datacenter and another slave in a remote datacenter.
 
Creative Commons License
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.