1. Things are getting interesting. MPAB continues to drive away potential supporters with the tone of their messages, the inclusion of pointless assertions, and the complete lack of references.
    • For example, Oracle could buy some companies developing PostgreSQL and target the core developers. Without the core developers working actively on PostgreSQL, the PostgreSQL project will be weakened tremendously and it could even die as a result.
      • Or another company could hire all of the core developers from one area of MySQL. I am glad that people have the opportunity to work elsewhere.
    • MySQL is the database with the highest number of installed units in all markets (except in the high enterprise market where it has only a medium size unit share). 
      • All markets or non-embedded markets? SQLite claims at least 500M deployments. Oracle claims there are 200M deployments of Berkeley DB. 
    • MySQL is causing Oracle sales losses around 1 billion usd/year (in lost sales to MySQL and because of having to do heavy discounting when competing with MySQL). 
      • Where does this number come from? 
    • Oracle did not provide any remedies to the EC and the public promises they have published are just empty promises. 
      • How do we know what Oracle has or has not promised? I am sure that Oracle can contact the EC without involving MPAB. Besides, I thought the hearings at the EC were private. I don't trust summaries of the hearings from anyone on either side of the issue. 
    • The open source software it has acquired, like InnoDB, has after being acquired, been developed secretly and slowly which is against how things are done in the open source environment. 
      • Compared to what? From my perspective neither Oracle/InnoDB nor Sun/MySQL have been great in this area. But so what? Both continue to improve their software and most people don't care whether or not the development process is open.
    • MariaDB is an enhanced (faster, more features and less bugs) drop-in replacement of MySQL that is only available under GPL. 
      • The GA release of MariaDB has no bugs because there is no GA release.
    • The fork can't be used with other products that are using MySQL as a building block for their closed source applications.
      • Yes, it can (thanks Sheeri)
    • The fork has to work in an environment where no one has to pay for it.
      • I will speculate that most of the money earned by MySQL is from customers who don't have to pay. People buy support contracts because the support product is excellent, not because they must.
    • As long as the products are recognized to be competing, any solution that the EC would accept has to ensure that there is as much competition in the database field before the merger as after the merger.
      • Is it that simple? Is competition a binary decision? 
    • If MySQL were licensed under a permissive license, like BSD, then the users would benefit as they now can securely continue to use MySQL in all context. Monty Program Ab would also switch to only produce code under BSD for the MariaDB server, to ensure that also MariaDB can be used in all context. Monty Program Ab would benefit very little from of this; We cannot take money from selling BSD; We can only hope that there is a market demand for our skilled engineers.
      • I would love for it to be BSD. Then I can form a company to build a custom version of it and make people pay for that version. Your blog post cites EnterpriseDB for doing this. Why can't MPAB do the same?
    • The companies that would benefit the most from BSD are the companies that enhance MySQL (storage engine vendors and companies providing extensions to MySQL) and companies that embed MySQL in their products, like Adobe or Cisco.
      • How does saving storage engine vendors, Adobe and Cisco save the internet? This has nothing to do with the MySQL users you claim have something at risk. Storage engine vendors don't have thousands of customers. I assume app vendors who embed MySQL have more customers, but even in that case I fail to see how the internet is at risk.

  2. MySQL 5.5 is here with several new features including semi-sync replication. The MySQL team has been getting a lot done lately. I know because I get many bug and feature request status updates. They did a lot of work on semi-sync because their implementation was a rewrite rather than a port of the Google patch. It had to be done that way to maintain code quality. Those of us who maintain large patches against MySQL frequently do things the convenient way rather than the right way to simplify patch maintenance.

    Read the MySQL manual to understand what semi-sync does. It is not synchronous replication. With semi-sync each connection to a master can lose at most one transaction when the master crashes. This reduces but does not eliminate the problem of transaction loss on a master crash. Semi-sync also limits a busy connection so that it cannot commit faster than a slave can receive its transactions.
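
    The "at most one transaction per connection" bound can be illustrated with a toy model. This is a sketch, not MySQL code: the class and method names are made up, and it ignores the timeout after which real semi-sync falls back to asynchronous replication. It assumes a connection cannot start another commit until a slave has acknowledged its previous one.

```python
class ToySemiSyncMaster:
    """Toy model (hypothetical names): a connection may not commit again
    until a slave has acknowledged its previous transaction."""

    def __init__(self):
        self.committed_on_master = []  # (connection_id, txn) in commit order
        self.received_by_slave = []    # entries a slave has acknowledged

    def in_flight(self):
        """Transactions committed on the master but not yet acknowledged."""
        return [entry for entry in self.committed_on_master
                if entry not in self.received_by_slave]

    def commit(self, connection_id, txn):
        """Commit on the master; refuse if this connection still awaits an ack."""
        if any(conn == connection_id for conn, _ in self.in_flight()):
            raise RuntimeError("connection is still waiting for a slave ack")
        self.committed_on_master.append((connection_id, txn))

    def slave_ack(self, connection_id):
        """The slave reports that it received this connection's transaction."""
        for entry in self.in_flight():
            if entry[0] == connection_id:
                self.received_by_slave.append(entry)

    def lost_on_crash(self):
        """What the slave never received if the master crashed right now."""
        return self.in_flight()
```

    In this model a crash can lose one transaction for each connection that is waiting for an ack, and the RuntimeError stands in for the way a busy connection is throttled to the speed of the slave.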

    Additional references for this include:

  3. The Oracle and MySQL RDBMS are very different products. This makes me happy. I used to work on the Oracle RDBMS. It has a lot of features that do amazing things. Unfortunately, this also makes it extremely hard to modify. MySQL doesn't have as many features. This makes it easier to modify. This also means there are a lot of things to fix in it when you care about high-performance and high-availability OLTP workloads.

    But now we have a new story emerging from an independent source of news on the Oracle-Sun merger.
    "One more week won’t change the fact that MySQL competes fiercely with Oracle’s database products including its flagship ‘11g’ across all major market segments."
    What does this mean besides a few more months of uncertainty for people at Sun/MySQL? Do they compete for customers? Or do they compete based on technology? We can only guess as the report is not public. I am sure it is a great document, at least that is what I have been told.

    Can we get this done and return our focus to the roadmap for 5.4, 6.0 and the MySQL User Conference? I would much rather bicker about who doesn't get to present at the conference, the rate at which community patches are accepted and my inability to republish an edited version of MySQL docs. MySQL would otherwise be on a roll right now with the progress they have made on 5.1 (it is a great release) and with work in progress for future releases.

    Update

    Wow, maybe the GPL means something. Eben Moglen finds factual errors in the still-secret statement of objections.

  4. I am neither for nor against the Oracle-Sun merger. I am against the damage done by extending the uncertainty on the outcome of the deal. MySQL as an organization is in great shape. The 5.1 release turned out better than some expected. The InnoDB plugin is excellent. What is the roadmap? MySQL is limited in what they can say about their future. That hurts all users and customers.

    A lot of nonsense has been written about this. As MySQL employees cannot write about it, the discussion has been one-sided, full of speculation and justified by quotes from random people. I almost provided a few quotes myself when contacted by someone I thought was a potential MySQL customer.

    I am neither a lawyer nor an economist, so I don't understand their notion of competition as applied to this issue. I wish that were clear. I don't think that the 8-year-old E-Week benchmark implies anything about whether MySQL and Oracle compete. Nor do I think that a few slides from a project at Sun that failed to migrate Oracle customers to MySQL are evidence of that. Marten's letter set a high standard for the discussion. I hope others follow it.

    Clearly competition isn't defined by revenue, as MySQL revenue is somewhere between $100M and $300M per year and the database market is much larger than that. For better or worse, MySQL has not done a good job of monetizing their users. Maybe they have not tried to do that as their value seems to be independent of their revenue. But someone has to fund the development of MySQL.

    I have worked on source code for the Oracle and MySQL database servers. I have used MySQL in production. Some parts of MySQL are amazing (InnoDB, JDBC, support, docs, bug database, server uptime, ease of use, NDB) but in no way do they compete on a feature basis. I hope that never changes as MySQL would be ruined were it to become as complex as Oracle. I am sure they compete for some customers who don't need all of the features provided by Oracle. But that competition includes Sybase, Microsoft and IBM.

  5. Forgive me for being a shill, but InnoDB appears to have added a feature for the next release of the InnoDB plugin that prevents the buffer pool from getting wiped out by a full table scan. Many people have requested this. The documentation is excellent. I have tested it and not only did it work as advertised, but it didn't degrade performance on OLTP workloads. This fixes bug 45015 and is a nice feature to have when you occasionally use mysqldump to copy a table from a busy OLTP server. Now is a good time to evaluate MySQL 5.1 with the InnoDB plugin.

  6. Managed MySQL is here. Amazon RDS allows you to run MySQL on their hardware. It isn't perfect, but I think this is a great first release. I expect this will support PostgreSQL soon given that the command-line tools are not MySQL specific.

    Note:
    • This uses MySQL 5.1.38
    • I did not see an option to enable SSL connections to MySQL. I think that is required for this to be a great way to run MySQL.
    • This supports MyISAM and InnoDB. They don't give you command line access to the machines, so you cannot run myisamchk to recover corrupt MyISAM tables, nor can you run myisampack to compress them. I think it is a good idea to stick with InnoDB and then ask Amazon to upgrade to the InnoDB 1.0.4+ plugin.
    • This appears to use network attached storage for most data. For example, innodb_data_home_dir=/rdsdbdata/db/innodb. I am not sure whether this storage is buffered by the OS buffer cache. If it is not, MyISAM performance will suffer because MyISAM does not cache table data itself.
    • Replication is disabled. That makes it much easier to run many instances of MySQL in the environment. Replication state is not crash proof and Amazon probably does not want to spend their days recovering/replacing/rebuilding slaves. But that also limits the use of this for read scale out. Maybe Amazon and RightScale have something in progress to change that without introducing manageability overhead.
    • The master user does not have SHUTDOWN, SUPER or replication privileges.
    • Binlogs are enabled, but the master user does not have privileges to run SHOW MASTER STATUS. The documents state that databases can be recovered up to the last 5 minutes. I assume this means that any writes done are guaranteed to be archived somewhere after 5 minutes. If there were an option to archive the binlogs, then that would provide an extra degree of safety.

  7. Be careful when using FLUSH TABLES WITH READ LOCK (aka FTWRL). I have written about potential problems that may occur when using FTWRL. Anyone who runs ibbackup or xtrabackup on a server that writes a binlog needs FTWRL to run as fast as possible with as few problems as possible, but that is not always the case. With FTWRL in its current form, you must monitor it and kill either it or the long-running queries it waits on when it takes too long.

    MySQL does three things when processing FTWRL. First it sets the global read lock. Then it closes open tables. Finally it sets a flag to block commits. You will have problems in production when FTWRL doesn't return quickly. It doesn't return quickly when there are long-running queries as it waits for the current queries to finish. The problem is that insert, update, delete and replace statements are blocked after the first step. When FTWRL lingers in the second step (close open tables), then your server will stop accepting writes. An additional problem is that for deployments with many open tables, it is a lot of work to close and then re-open them. I need to confirm whether re-open is done serially because a mutex is held and whether InnoDB re-samples all indexes on all reopened tables to get optimizer statistics.
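
    The monitoring this requires can be sketched as a small policy function. This is a minimal sketch, not a tested tool: it assumes rows shaped like SHOW FULL PROCESSLIST output (the Id, Command, Time and Info columns), and the function name, the thresholds and the rule for choosing which queries to kill are my own assumptions.

```python
FTWRL = "FLUSH TABLES WITH READ LOCK"


def plan_kills(processlist, max_wait_seconds=10, kill_blockers=False):
    """Return KILL statements for a stuck FTWRL, or for the queries it waits on.

    processlist: rows shaped like SHOW FULL PROCESSLIST output, as dicts with
    the keys Id, Command, Time (seconds) and Info (statement text or None).
    """
    def is_ftwrl(row):
        return (row["Info"] or "").strip().upper().startswith(FTWRL)

    stuck = [row for row in processlist
             if is_ftwrl(row) and row["Time"] > max_wait_seconds]
    if not stuck:
        return []
    if not kill_blockers:
        # Default policy: give up on the backup rather than on user queries.
        return [f"KILL {row['Id']}" for row in stuck]

    # Heuristic (an assumption): a statement that has run at least as long as
    # the FTWRL started before it, so FTWRL may be waiting for it to finish.
    longest_wait = max(row["Time"] for row in stuck)
    return [f"KILL {row['Id']}" for row in processlist
            if row["Command"] == "Query" and row["Info"]
            and not is_ftwrl(row) and row["Time"] >= longest_wait]
```

    Killing FTWRL is the default here because it only delays the backup. Killing the long-running queries lets the backup proceed but discards their work, so it is opt-in.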

    I blame MyISAM for the current problems. As I am not a MyISAM expert, this is an educated guess and I welcome your feedback. The problem with FTWRL is FT (flush tables) and MyISAM is the reason that tables must be flushed. The --delay-key-write option and possibly other features in MyISAM allow open tables to buffer committed changes. The buffered changes are written to MyISAM data files when the open table is closed.

    INSERT DELAYED might also cause problems, but anybody who needs a hot backup shouldn't be using that option.

    I think we can make this better and my solution is DFTBGRL (don't flush tables but get read lock). Maybe it needs a better name. DFTBGRL skips the second step of FTWRL -- it sets the global read lock and then it sets a flag to block commits. This should be much safer to use in production.

    Update:
    After I wrote this, Harrison and Konstantin from MySQL/Sun gave me advice on a better way to fix this. I have implemented their advice and it appears to work, but I need to test it. The result will be much better than FTWRL for InnoDB.

  8. My travel is booked for OpenSQL camp in Portland on November 14 and 15. It should be a great event in my favorite city. It is also an opportunity to speak with technical people about something other than MySQL. The current sessions are skewed towards MySQL, but Portland has an active PostgreSQL community and is home to Len Shapiro, who contributed a lot to the development of high-performance hash joins. I hope there is some kind of PostgreSQL-MySQL exchange. I have yet to propose a topic, but am considering MySQL GIS, MUMPS, embedded InnoDB or the InnoDB plugin.

  9. I was debugging the performance of a DELETE statement that contained a subquery in the FROM clause. As there is no EXPLAIN for DELETE, I converted it to a SELECT statement (and hoped the same optimizations were done). But I still had to wait for EXPLAIN to complete. EXPLAIN evaluates subqueries in the FROM clause for MySQL. This can make EXPLAIN take a long time and create load on a server. Recent versions of MySQL have had many improvements for subquery optimization, but the documentation for all versions states that this is still done. A feature request is open to change this. Feature requests are also open to get EXPLAIN for UPDATE, INSERT and DELETE.
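
    The DELETE-to-SELECT conversion can be sketched as a small helper. This is a rough sketch with a made-up function name. It only understands the `DELETE FROM t ...` and `DELETE t1 FROM t1 JOIN ...` forms (not `DELETE FROM ... USING`), and it does nothing to guarantee that the optimizer picks the same plan for the SELECT. Running the generated EXPLAIN will still evaluate any subquery in the FROM clause.

```python
import re

# Optional DELETE modifiers, optional target list (multi-table form), then
# everything after FROM. A trailing semicolon is dropped.
_DELETE_RE = re.compile(
    r"^\s*DELETE\s+"
    r"(?:(?:LOW_PRIORITY|QUICK|IGNORE)\s+)*"
    r"(?:(?P<targets>[\w.`*]+(?:\s*,\s*[\w.`*]+)*)\s+)?"
    r"FROM\s+(?P<rest>.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def explain_for_delete(delete_sql):
    """Build an EXPLAIN SELECT that approximates the given DELETE statement."""
    match = _DELETE_RE.match(delete_sql)
    if match is None:
        raise ValueError("not a DELETE statement this sketch understands")
    targets = match.group("targets")
    if targets:
        # Multi-table form: select the rows of the tables being deleted from.
        names = [name.strip() for name in targets.split(",")]
        columns = ", ".join(n if n.endswith(".*") else n + ".*" for n in names)
    else:
        columns = "*"
    return f"EXPLAIN SELECT {columns} FROM {match.group('rest')}"
```

    For example, `explain_for_delete("DELETE FROM t WHERE id < 10")` produces `EXPLAIN SELECT * FROM t WHERE id < 10`.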

    Do other RDBMS products support EXPLAIN for subqueries in a FROM clause without evaluating the subquery?

  10. I reviewed most of the changes from the v4 Google patch today. My head hurts now. During this review I checked whether bugs fixed in the patch have also been fixed in recent releases of official MySQL. I am happy that most of them have been fixed. But some changes will never be accepted, such as the one that added support for INF for FLOAT/DOUBLE columns.

    The default value of sql_mode is the empty string. You probably want to change that before your applications come to depend on it. When it is the empty string, invalid values are coerced to valid values on INSERT and UPDATE and a warning is returned. Applications usually ignore the warnings. The coercion includes:
    • INT values that are too big are set to the maximum value of an INT. The same is done for BIGINT
    • INF is changed to MAX_DOUBLE or MAX_FLOAT for a DOUBLE/FLOAT column
    • VARCHAR and LOB values that are too long are truncated to the maximum length of the column
    • invalid DATE values are accepted
    What needs this behavior? MyISAM. For a storage engine that doesn't do rollback, one way to handle invalid data during an INSERT or UPDATE statement is to coerce it to valid values and proceed with the statement. I am not fond of this approach. An alternative for data warehouse workloads is to use an exception table to log rows with invalid data and avoid corrupting non-exception tables.
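
    The difference between the two approaches can be modeled in a few lines. This is a toy sketch, not MySQL behavior in full: it covers only the INT clamp and the VARCHAR truncation from the list above, the function names and warning strings are made up, and plain lists stand in for tables.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a signed MySQL INT


def coerce_row(row, varchar_len):
    """Mimic the empty sql_mode: clamp and truncate, returning (row, warnings)."""
    number, text = row
    warnings = []
    if not INT_MIN <= number <= INT_MAX:
        number = max(INT_MIN, min(INT_MAX, number))
        warnings.append("out of range value adjusted")  # illustrative wording
    if len(text) > varchar_len:
        text = text[:varchar_len]
        warnings.append("data truncated")  # illustrative wording
    return (number, text), warnings


def load(rows, varchar_len, use_exception_table):
    """Load (int, str) rows into a list standing in for a table.

    With use_exception_table, rows that would need coercion are logged
    unchanged to a second list instead of being altered and stored.
    """
    table, exceptions = [], []
    for row in rows:
        coerced, warnings = coerce_row(row, varchar_len)
        if warnings and use_exception_table:
            exceptions.append(row)
        else:
            table.append(coerced)
    return table, exceptions
```

    With coercion the table silently ends up holding values that nobody inserted. With the exception table the original rows survive and can be inspected or reloaded later.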

    MyISAM has also made replication semantics and internals much more complex. For example, what is written to the binlog in this case, and has this behavior changed between releases?
    begin;
    insert into Innodb_table values (1);
    insert into Myisam_table values (1);
    rollback;
    
    I think that MyISAM has its place. It does fast table scans, but InnoDB is much faster on just about everything else. I am just not thrilled with the impact it has had on MySQL. It can be used for tasks where a table or partition is loaded once and then made read-only after the insert. This is a good fit for data warehouse tasks, although it would be better were multi-core performance improved and the key cache expanded to include data blocks. MyISAM can also be used for scratch tables on a slave.

    Drizzle avoided these problems by limiting MyISAM to temporary tables.
