I used to think that MySQL rarely crashed. Then I began running a script to search the database error log (/var/lib/mysql/HOSTNAME.err) for crashes on a daily basis. Now I have numbers to show that it rarely crashes and can compute the MTBF for my servers.
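A daily scan of this kind can be sketched in a few lines. This is not the script used for the numbers in this post (that one is Bash and Awk); it is only a minimal illustration in Python. It assumes that a crash leaves a line containing the text "mysqld got signal" in the error log, which is what the server's crash handler normally writes, but check that marker against your own logs. The function names are made up for the example.

```python
def count_crashes(log_lines, marker="mysqld got signal"):
    """Count lines that look like a crash report.

    The marker is an assumption about what a crash looks like in the
    error log; adjust it to match the logs on your servers.
    """
    return sum(1 for line in log_lines if marker in line)


def count_crashes_in_file(path, marker="mysqld got signal"):
    """Hypothetical helper: scan one error log, e.g. /var/lib/mysql/HOSTNAME.err."""
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(path, errors="replace") as log:
        return count_crashes(log, marker)
```

Run once a day per server and summed over the fleet, the counts give the number of crashes needed to compute an MTBF.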
For the purpose of this post, I measure MTBF as the expected number of days that a MySQL server will run before crashing. If I have 100 servers and there is 1 crash every 10 days, then my MTBF is 1000 days (100 servers x 10 days = 1000 server-days per crash). The crash may be caused by a software bug or flaky hardware, and it isn't always possible to distinguish between the two. Also, this excludes planned restarts, so the uptime of a server almost never reaches the MTBF.
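The arithmetic is simple enough to write down. Here is a minimal sketch in Python, assuming the inputs are the number of servers, the length of the observation window in days, and the number of unplanned crashes seen in that window. The function name is made up for the example.

```python
def mtbf_days(server_count, observation_days, crash_count):
    """Fleet MTBF in days: server-days observed divided by crashes seen."""
    if crash_count == 0:
        # With no crashes in the window there is nothing to divide by;
        # all that can be said is that the MTBF exceeds the server-days observed.
        raise ValueError("no crashes observed in this window")
    return server_count * observation_days / crash_count


# The example from the text: 100 servers, 1 crash every 10 days.
print(mtbf_days(100, 10, 1))  # 1000.0
```

Planned restarts are not counted as crashes here, matching the rule above.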
With those rules, what is your MTBF?
I use two types of servers and MySQL with InnoDB. The two types have an MTBF of 600 days and 2000 days. With a bit of work, the MTBF for one of the types can be improved from 600 to more than 1000 days. I think these numbers are remarkable, especially when I convert from days to years and round up (2 to 6 years is a long time). But MySQL is the only RDBMS I have deployed in a production environment, so I can't compare it to anything. Thus, my question.
Tuesday, January 22, 2008


it would be cool to see the script that you used.
Or maybe a little bit sad. It is just Bash and Awk. But I will try to share it.
Hi,
interesting posting (it is incredibly hard to find mttf-,mtbf- or mttr-data on the web). I'd be interested in the rest of your setup (OS, Hardware, Network, etc.).
keep it up
holger
Alas, I don't want to use brand names. But the servers are commodity SMP boxes. They are nothing special, although what isn't special today was very expensive a few years back.