Packt Publishing gave me a copy of MySQL 5.1 Plugins Development to review. The book is written by two people who are MySQL experts -- Sergei Golubchik and Andrew Hutchings. I know Sergei because he answers a lot of my questions on the MySQL Internals mail list. I wasn't in the best mood when I opened the e-book as I worked late last night. My mood was much better after reading a few pages. The book is amazing. The content, presentation and editing are excellent. The book is full of relevant examples that show you how to build plugins including source code and tips on compiling. The text carefully explains all of the steps required to build each of the plugins.
I wish I had this book several years ago. It would have saved me a lot of trouble. But from the quality of the book I suspect that they have been working on it for several years.
Thank you Sergei and Andrew. This is a great addition to the MySQL community.
The target audience for this book is someone who wants to write code that will run within the mysqld process. This includes user defined functions (UDF) to be called by SQL statements, daemon plugins that can run code using a dedicated thread in the server process, information schema plugins to expose data in INFORMATION_SCHEMA, full-text parser plugins and storage engine plugins. There are examples for three storage engines. The first engine is simple and only supports read-only tables. The second example supports read-write tables without indexes and uses HTML as the file format. The final example builds a storage engine using Tokyo Cabinet.
My team has put a lot of things into the mysqld process. Some of those changes were even useful. This book would have prevented us from making a few mistakes. It even has an example to export the output from getrusage via SHOW STATUS. I added a similar patch to MySQL, but I don't think my change was as nice.
Thursday, September 30, 2010
Wednesday, September 29, 2010
Drizzle beta is here!
Congratulations!
Can we please tone down the marketing? It is great to see a community project grow, especially in the MySQL family. But you are judged by successful deployments, not by the class libraries used by your project.
Drizzle is only more reliable than MySQL when it keeps the replication log and InnoDB in sync during crash recovery. But it does not do that today. Official MySQL supports this on a master via sync_binlog. MySQL slaves do this today via rpl_transaction_enabled which is available in the Facebook and Google patches. I think it is also available in Percona Server, MariaDB and XtraDB.
I have been told that Drizzle was designed for multi-core scalability. I am not sure what that means on real workloads or benchmarks. I know that there are some bottlenecks in InnoDB that have nothing to do with the code above it (MySQL, Drizzle). I also know that MySQL has made huge strides this year and MySQL 5.5 far exceeds my expectations. Alas I cannot validate whether "designed for scalability" results in performance that is better than MySQL 5.5 as I cannot build Drizzle on my servers due to the dependency on gcc 4.2 or greater.
Can we please tone down the marketing? It is great to see a community project grow, especially in the MySQL family. But you are judged by successful deployments, not by the class libraries used by your project.
Drizzle is only more reliable than MySQL when it keeps the replication log and InnoDB in sync during crash recovery. But it does not do that today. Official MySQL supports this on a master via sync_binlog. MySQL slaves do this today via rpl_transaction_enabled which is available in the Facebook and Google patches. I think it is also available in Percona Server, MariaDB and XtraDB.
I have been told that Drizzle was designed for multi-core scalability. I am not sure what that means on real workloads or benchmarks. I know that there are some bottlenecks in InnoDB that have nothing to do with the code above it (MySQL, Drizzle). I also know that MySQL has made huge strides this year and MySQL 5.5 far exceeds my expectations. Alas I cannot validate whether "designed for scalability" results in performance that is better than MySQL 5.5 as I cannot build Drizzle on my servers due to the dependency on gcc 4.2 or greater.
Tuesday, September 14, 2010
Oh, no - MongoDB can be fast for key-value stores
I have broken my promise to stop writing about this. Sorry, but I had to correct my mistake. I ran three micro-benchmarks: get by primary key, get by secondary key and update by primary key. MySQL had a higher peak QPS for all of them. Alas, the results for get by primary key were skewed because pymongo, the Python driver for MongoDB, uses more CPU than MySQLdb, the Python driver for MySQL. The client host was saturated during the test and this limited peak QPS to 80,000 for MongoDB versus 110,000 for MySQL.
I repeated one test using two 16-core client hosts with 40 processes per host. For that test the peak QPS on MongoDB improved to 155,000 while the peak for MySQL remained at 110,000. That is an impressive result. The results for get by secondary key and update by primary key are still valid as the server host saturated on those tests.
Now I must consider rewriting the test harness in Java, C or C++ or I could add MongoDB support to sysbench. I prefer Python. In addition to under-reporting MongoDB peak performance it also under-reports MySQL performance. I am able to get 180,000 peak QPS using sysbench on one 16-core client host and mysqld on another versus 110,000 using the Python equivalent.
I repeated one test using two 16-core client hosts with 40 processes per host. For that test the peak QPS on MongoDB improved to 155,000 while the peak for MySQL remained at 110,000. That is an impressive result. The results for get by secondary key and update by primary key are still valid as the server host saturated on those tests.
Now I must consider rewriting the test harness in Java, C or C++ or I could add MongoDB support to sysbench. I prefer Python. In addition to under-reporting MongoDB peak performance it also under-reports MySQL performance. I am able to get 180,000 peak QPS using sysbench on one 16-core client host and mysqld on another versus 110,000 using the Python equivalent.
Monday, September 13, 2010
MySQL versus MongoDB - update performance
This is the end of my public performance comparison of MongoDB versus MySQL, at least for the next few weeks. For these tests I used a 16-core x86_64 client host and an 8-core x86_64 server host, MySQL 5.1.50 with the Facebook patch, MongoDB 1.7.0 and Python clients. Ping between the client and server hosts was ~200us.
I tested 5 configurations:
InnoDB has much better throughput when the binlog is disabled. The clients update rows selected at random from 2M rows. InnoDB is able to handle more of that load concurrently than MongoDB as a reader-writer lock is used that prevents concurrency within the database. Note that while InnoDB allows more of the update to be done concurrently, it isn't perfect as there are several global mutexes in MySQL/InnoDB including LOCK_open, kernel_mutex and the InnoDB buffer pool mutex. When the binlog is enabled by sync_binlog is disabled even more global mutexes are used. Finally, when the binlog is enabled and sync_binlog=1 then group commit is not enabled and updates are rate limited by the performance of fsync. In this case fsync is fast as a HW RAID card with battery backed cache was used.
This is output from PMP that demonstrates one source of mutex contention in mongod. The pileup occurs at mongo::MongoMutex which apprently implements the reader-writer lock.
def query_mongo(host, port, pipe_to_parent, requests_per, dbname, rows, check, testname, worst_n, id):
conn = pymongo.Connection(host, port)
db = conn[dbname]
signal.signal(signal.SIGTERM, sigterm_handler)
gets = 0
stats = SummaryStats(worst_n)
while True:
for loop in xrange(0, requests_per):
target = random.randrange(0, rows)
s = time.time()
try:
r = db.c.update({'_id': target}, {'$inc': {'k': 1 }}, safe=True)
assert r['updatedExisting'] == True
assert r['ok'] == 1
stats.update(s)
gets += 1
except:
assert got_sigterm
I tested 5 configurations:
- inno.5150fb.b0.s0 - MySQL 5.1.50, the Facebook patch, binlog disabled
- inno.5150fb.b1.s0 - MySQL 5.1.50, the Facebook patch, binlog enabled, sync_binlog=0
- inno.5150fb.b1.s1 - MySQL 5.1.50, the Facebook patch, binlog enabled, sync_binlog=1
- mongo.170.safe - MongoDB 1.7.0 and safe updates
- mongo.170.unsafe - MongoDB 1.7.0 and unsafe updates
Note that with unsafe updates the client does not wait for the server to respond. It sends requests as fast as it can or until the buffer between the client and server is full. When a sufficient number of concurrent clients are used and the clients run for enough time the buffer becomes full and unsafe updates do not improve performance.
InnoDB has much better throughput when the binlog is disabled. The clients update rows selected at random from 2M rows. InnoDB is able to handle more of that load concurrently than MongoDB as a reader-writer lock is used that prevents concurrency within the database. Note that while InnoDB allows more of the update to be done concurrently, it isn't perfect as there are several global mutexes in MySQL/InnoDB including LOCK_open, kernel_mutex and the InnoDB buffer pool mutex. When the binlog is enabled by sync_binlog is disabled even more global mutexes are used. Finally, when the binlog is enabled and sync_binlog=1 then group commit is not enabled and updates are rate limited by the performance of fsync. In this case fsync is fast as a HW RAID card with battery backed cache was used.
This is output from PMP that demonstrates one source of mutex contention in mongod. The pileup occurs at mongo::MongoMutex which apprently implements the reader-writer lock.
48 mongo::connThread,thread_proxy,start_thread,clone
37 pthread_cond_wait@@GLIBC_2.3.2,boost::condition_variable::wait,mongo::MongoMutex::lock,mongo::receivedUpdate,mongo::assembleResponse
10 recv,mongo::MessagingPort::recv,mongo::MessagingPort::recv
2
1 select,mongo::Listener::initAndListen,mongo::listen,mongo::_initAndListen,mongo::initAndListen,main,select
1 select,mongo::Listener::initAndListen
1 pthread_cond_wait@@GLIBC_2.3.2,mongo::FileAllocator::Runner::operator(),thread_proxy,start_thread,clone
1 nanosleep,mongo::DataFileSync::run,mongo::BackgroundJob::thr,thread_proxy,start_thread,clone
1 nanosleep,mongo::ClientCursorMonitor::run,mongo::BackgroundJob::thr,thread_proxy,start_thread,clone
1 nanosleep
1 mongo::webServerThread,thread_proxy,start_thread,clone
1 mongo::SnapshotThread::run,mongo::BackgroundJob::thr,thread_proxy,start_thread,clone
1 mongo::interruptThread,thread_proxy,start_thread,clone
1 mongo::BtreeBucket::findSingle,mongo::ModSetState::createNewFromMods,mongo::_updateObjects,mongo::updateObjects,mongo::receivedUpdate,mongo::assembleResponse
Source code for MongoDB queries is listed below. The code to setup MongoDB and MySQL is described in the previous posts.
def query_mongo(host, port, pipe_to_parent, requests_per, dbname, rows, check, testname, worst_n, id):
conn = pymongo.Connection(host, port)
db = conn[dbname]
signal.signal(signal.SIGTERM, sigterm_handler)
gets = 0
stats = SummaryStats(worst_n)
while True:
for loop in xrange(0, requests_per):
target = random.randrange(0, rows)
s = time.time()
try:
r = db.c.update({'_id': target}, {'$inc': {'k': 1 }}, safe=True)
assert r['updatedExisting'] == True
assert r['ok'] == 1
stats.update(s)
gets += 1
except:
assert got_sigterm
The my.cnf settings for MySQL except for the values of log_bin and sync_binlog:
innodb_buffer_pool_size=2000M
innodb_log_file_size=100M
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=1
innodb_flush_method=O_DIRECT
innodb_thread_concurrency=0
innodb_max_dirty_pages_pct=80
innodb_file_format=barracuda
innodb_file_per_table
innodb_deadlock_detect=0
max_connections=2000
table_cache=2000
key_buffer_size=2000M
innodb_doublewrite=0
MySQL versus MongoDB - fetch by secondary index
This continues the silly benchmark series and compares performance from concurrent clients that fetch by secondary key. The previous post compared fetch by primary key. The test setup was the same as before. Clients were run on a 16-core x86 server. The servers (mongod, mysqld) were run alone on 8-core and 16-core x86 servers. The tests were run for servers that were 1ms and 200us apart according to ping. The database server saturates earlier when the client is only 200us away. That is to be expected.
InnoDB tables are clusters on the primary key and a query that fetches all columns by PK only has to read data from one leaf block of the PK index. When all columns are fetched by secondary key then the secondary index leaf node and PK index leaf node must be read. As all data was cached for this test that does not make a big difference. Were data not cached the extra IO used to read the PK index leaf node would be significant.
This displays throughput on the 16-core server. MongoDB saturates earlier on the 16-core server than on the 8-core server. From vmstat output this appears to be mutex contention on the server but I cannot provide more details on where that occurs as the mongod binary I downloaded has been stripped.
This displays throughput on the 8-core server. Peak QPS for MongoDB is much better than on the 16-core server.
This displays response time for the 16 core server.
The test was repeated using a hosts that were 200us apart according to ping. The database host was an 8-core server in this test. The peak QPS is similar to the previous tests but the servers saturate with fewer concurrent clients. The results are here and have been updated to include results for MongoDB 1.6.2 and 1.7.0.
Source code to setup the MySQL table:
def setup_mysql(host, db, user, password, engine, rows):
filterwarnings( 'ignore', category = MySQLdb.Warning )
conn = connect_mysql(host, db, user, password)
conn.autocommit(True)
cursor = conn.cursor()
cursor.execute('drop table if exists bm')
cursor.execute('create table bm (id int primary key, sid int, k int, c char(120), pad char(60), key sidx(sid)) engine=%s' % engine)
vals = []
for x in xrange(0, rows):
sx = str(x)
lsx = len(sx)
row = '(%d, %d, %d, "%s", "%s")' % (x, x, x, sx+'x'*(120-lsx), sx+'y'*(60-lsx))
vals.append(row)
if len(vals) == 1000:
r = cursor.execute('insert into bm values %s' % ','.join(vals))
vals = []
print '... row %d, result %s' % (x, r)
if vals:
r = cursor.execute('insert into bm values %s' % ','.join(vals))
vals = []
print '... row %d, result %s' % (x, r)
Source code to query the MySQL table:
def query_mysql(host, db, user, password, pipe_to_parent, requests_per, rows, check, testname, worst_n, id):
conn = connect_mysql(host, db, user, password)
conn.autocommit(True)
cursor = conn.cursor()
gets = 0
stats = SummaryStats(worst_n)
while True:
for loop in xrange(0, requests_per):
target = random.randrange(0, rows)
s = time.time()
cursor.execute('select id, k, c, pad from bm where sid = %d' % target)
sel_rows = cursor.fetchall()
stats.update(s)
if len(sel_rows) != 1:
print 'No rows for %d' % target
assert False
if sel_rows[0][0] != target:
print 'id is %s and should be %s' % (sel_rows[0][0], target)
assert False
gets += 1
Source code to setup the MongoDB collection:
def setup_mongo(host, port, dbname, rows):
conn = pymongo.Connection(host, port)
conn.drop_database(dbname)
db = conn[dbname]
for x in xrange(0, rows):
sx = str(x)
lsx = len(sx)
db.c.save({'_id':x, 'sid':x, 'k':x, 'c':sx+'x'*(120 - lsx), 'pad':sx+'y'*(60 - lsx)})
if x % 1000 == 0:
print '... row %d' % x
db.c.create_index('sid')
InnoDB tables are clusters on the primary key and a query that fetches all columns by PK only has to read data from one leaf block of the PK index. When all columns are fetched by secondary key then the secondary index leaf node and PK index leaf node must be read. As all data was cached for this test that does not make a big difference. Were data not cached the extra IO used to read the PK index leaf node would be significant.
This displays throughput on the 16-core server. MongoDB saturates earlier on the 16-core server than on the 8-core server. From vmstat output this appears to be mutex contention on the server but I cannot provide more details on where that occurs as the mongod binary I downloaded has been stripped.
This displays throughput on the 8-core server. Peak QPS for MongoDB is much better than on the 16-core server.
This displays response time for the 16 core server.
The test was repeated using a hosts that were 200us apart according to ping. The database host was an 8-core server in this test. The peak QPS is similar to the previous tests but the servers saturate with fewer concurrent clients. The results are here and have been updated to include results for MongoDB 1.6.2 and 1.7.0.
Source code to setup the MySQL table:
def setup_mysql(host, db, user, password, engine, rows):
filterwarnings( 'ignore', category = MySQLdb.Warning )
conn = connect_mysql(host, db, user, password)
conn.autocommit(True)
cursor = conn.cursor()
cursor.execute('drop table if exists bm')
cursor.execute('create table bm (id int primary key, sid int, k int, c char(120), pad char(60), key sidx(sid)) engine=%s' % engine)
vals = []
for x in xrange(0, rows):
sx = str(x)
lsx = len(sx)
row = '(%d, %d, %d, "%s", "%s")' % (x, x, x, sx+'x'*(120-lsx), sx+'y'*(60-lsx))
vals.append(row)
if len(vals) == 1000:
r = cursor.execute('insert into bm values %s' % ','.join(vals))
vals = []
print '... row %d, result %s' % (x, r)
if vals:
r = cursor.execute('insert into bm values %s' % ','.join(vals))
vals = []
print '... row %d, result %s' % (x, r)
Source code to query the MySQL table:
def query_mysql(host, db, user, password, pipe_to_parent, requests_per, rows, check, testname, worst_n, id):
conn = connect_mysql(host, db, user, password)
conn.autocommit(True)
cursor = conn.cursor()
gets = 0
stats = SummaryStats(worst_n)
while True:
for loop in xrange(0, requests_per):
target = random.randrange(0, rows)
s = time.time()
cursor.execute('select id, k, c, pad from bm where sid = %d' % target)
sel_rows = cursor.fetchall()
stats.update(s)
if len(sel_rows) != 1:
print 'No rows for %d' % target
assert False
if sel_rows[0][0] != target:
print 'id is %s and should be %s' % (sel_rows[0][0], target)
assert False
gets += 1
Source code to setup the MongoDB collection:
def setup_mongo(host, port, dbname, rows):
conn = pymongo.Connection(host, port)
conn.drop_database(dbname)
db = conn[dbname]
for x in xrange(0, rows):
sx = str(x)
lsx = len(sx)
db.c.save({'_id':x, 'sid':x, 'k':x, 'c':sx+'x'*(120 - lsx), 'pad':sx+'y'*(60 - lsx)})
if x % 1000 == 0:
print '... row %d' % x
db.c.create_index('sid')
Source code to query the MongoDB table:
def query_mongo(host, port, pipe_to_parent, requests_per, dbname, rows, check, testname, worst_n, id):
conn = pymongo.Connection(host, port)
db = conn[dbname]
gets = 0
stats = SummaryStats(worst_n)
while True:
for loop in xrange(0, requests_per):
target = random.randrange(0, rows)
s = time.time()
o = db.c.find_one({'sid': target})
stats.update(s)
assert o['_id'] == target
if check:
assert o['k'] == target
sx = str(o['_id'])
lsx = len(sx)
assert o['c'] == sx+'x'*(120-lsx)
assert o['pad'] == sx+'y'*(60-lsx)
gets += 1
Saturday, September 11, 2010
MySQL versus MongoDB - yet another silly benchmark
This is yet another silly benchmark because the results are likely to be misused. The results probably do not matter to you. I like MongoDB. It has many things in common with MySQL (ease of use, rapid iteration, give customers what they want). It would be more interesting to me were it to use embedded InnoDB as the backend.
Please read the update as I was able to get much better throughput from MongoDB once I realized that the client host was saturated on the CPU.
I want to evaluate the performance claims. Several of the MongoDB versus the world benchmarks were run on Mac laptops. I love my Mac laptop, but I don't use it to confirm database server throughput.
My first test uses concurrent clients to determine the load at which the server saturates. I run this test in two modes: cached and not cached. No disk IO should be done during the cached test. Most requests should require disk IO during the uncached test. I am not sure that I will run the uncached test for MongoDB as it memory maps the database file and there is no way to limit the amount of memory it will use so the MongoDB uncached test requires a database much larger than RAM. That will take a long time to setup and I don't know whether many MongoDB deployments run with databases much larger than RAM.
The test is interesting to me because it has found problems that limit throughput on workloads that I care about. Other tests should be done after understand the results from this test. I suspect that for many people peak QPS on highly-concurrent workloads is not a priority.
Each client does a sequence of calls (SELECT or HANDLER for MySQL, find_one for MongoDB) where each calls fetches one row by specifying the value of an indexed column. The column has a primary key index for MySQL and used the Collections create_index method for MongoDB. The id value to fetch is randomly selected.
The code for the benchmark is a work in progress, so I have yet to publish it. I used Python, the MySQLdb driver for MySQL and the pymongo driver for MongoDB. The clients are separate processes forked via the Python multiprocessing module to avoid the penalty from the Python GIL. Some of the code used for MongoDB is listed at the end of this post.
I only report peak QPS at this time. The results ignore average and worst-case response time. It is not good to ignore them which is another reason why this might be a silly benchmark. Eventually the code will be updated to measure that.
The clients and DBMS (mysqld, mongod) ran on separate 16-core servers. The test was run for 1, 2, 4, 8, 16, 24, ..., 152, 160 concurrent clients. Tests were run for MyISAM and InnoDB for MySQL. The tests used MySQL 5.1.50 unmodified (5150orig) and 5.1.50 with the Facebook patch (5150fb). There are a few changes in the Facebook patch that make a huge difference for peak QPS. I expect official MySQL to have these changes soon. Peak QPS was much higher for MySQL than for MongoDB. In order of peak QPS:
This displays throughput MySQL 5.1.50 with the Facebook patch, MySQL 5.1.50 unmodified and MongoDB 1.7.0. All ran on a 16-core Nehalem server -- 16 cores with hyperthreading enabled.
This displays average and 98th percentile response time for MySQL 5.1.50 with the Facebook patch and MongoDB 1.7.0. Both ran on a 16-core Nehalem server -- 16 cores with hyperthreading enabled. Results are from the same test as used for the previous graph.
I repeated the tests using an 8-core Nehalem server. It is the same as the previous server except that hyperthreading was disabled. The graphs below display throughput. MySQL is able to handle more concurrent clients before saturating on the server.
This is the response time graph for the 8-core server.
The servers for the tests above were 1ms apart according to ping. I repeated the test for servers that have a 200us ping time but I don't want to inline another graph. It is linked in the next sentence. The QPS for get by primary key with an interesting and unexplained spike in QPS for MongoDB near 152 concurrent clients. Despite having a similar QPS at that point, MongoDB average response time was about 1.5X that for MySQL. This includes results for MongoDB versions 1.6.2 and 1.7.0.
The code to setup the collection for MongoDB is:
Please read the update as I was able to get much better throughput from MongoDB once I realized that the client host was saturated on the CPU.
I want to evaluate the performance claims. Several of the MongoDB versus the world benchmarks were run on Mac laptops. I love my Mac laptop, but I don't use it to confirm database server throughput.
My first test uses concurrent clients to determine the load at which the server saturates. I run this test in two modes: cached and not cached. No disk IO should be done during the cached test. Most requests should require disk IO during the uncached test. I am not sure that I will run the uncached test for MongoDB as it memory maps the database file and there is no way to limit the amount of memory it will use so the MongoDB uncached test requires a database much larger than RAM. That will take a long time to setup and I don't know whether many MongoDB deployments run with databases much larger than RAM.
The test is interesting to me because it has found problems that limit throughput on workloads that I care about. Other tests should be done after understand the results from this test. I suspect that for many people peak QPS on highly-concurrent workloads is not a priority.
Each client does a sequence of calls (SELECT or HANDLER for MySQL, find_one for MongoDB) where each calls fetches one row by specifying the value of an indexed column. The column has a primary key index for MySQL and used the Collections create_index method for MongoDB. The id value to fetch is randomly selected.
The code for the benchmark is a work in progress, so I have yet to publish it. I used Python, the MySQLdb driver for MySQL and the pymongo driver for MongoDB. The clients are separate processes forked via the Python multiprocessing module to avoid the penalty from the Python GIL. Some of the code used for MongoDB is listed at the end of this post.
I only report peak QPS at this time. The results ignore average and worst-case response time. It is not good to ignore them which is another reason why this might be a silly benchmark. Eventually the code will be updated to measure that.
The clients and DBMS (mysqld, mongod) ran on separate 16-core servers. The test was run for 1, 2, 4, 8, 16, 24, ..., 152, 160 concurrent clients. Tests were run for MyISAM and InnoDB for MySQL. The tests used MySQL 5.1.50 unmodified (5150orig) and 5.1.50 with the Facebook patch (5150fb). There are a few changes in the Facebook patch that make a huge difference for peak QPS. I expect official MySQL to have these changes soon. Peak QPS was much higher for MySQL than for MongoDB. In order of peak QPS:
- Peak QPS on MySQL 5.1.50 with the Facebook patch exceeds 100,000 and QPS increases to 128 concurrent clients.
- Peak QPS on MySQL 5.1.50 unmodified is between 70,000 and 80,000 and QPS increases to 80 concurrent clients.
- Peak QPS on MongoDB 1.7.0 is 40,000 and QPS increases to 64 concurrent clients.
MongoDB appears to saturate on mutex contention on the server at 64 concurrent clients. Alas, the binary is stripped and I cannot run PMP to determine where the problem occurs. My diagnosis is based on vmstat output.
This displays throughput MySQL 5.1.50 with the Facebook patch, MySQL 5.1.50 unmodified and MongoDB 1.7.0. All ran on a 16-core Nehalem server -- 16 cores with hyperthreading enabled.
This displays average and 98th percentile response time for MySQL 5.1.50 with the Facebook patch and MongoDB 1.7.0. Both ran on a 16-core Nehalem server -- 16 cores with hyperthreading enabled. Results are from the same test as used for the previous graph.
I repeated the tests using an 8-core Nehalem server. It is the same as the previous server except that hyperthreading was disabled. The graphs below display throughput. MySQL is able to handle more concurrent clients before saturating on the server.
The servers for the tests above were 1ms apart according to ping. I repeated the test for servers that have a 200us ping time but I don't want to inline another graph. It is linked in the next sentence. The QPS for get by primary key with an interesting and unexplained spike in QPS for MongoDB near 152 concurrent clients. Despite having a similar QPS at that point, MongoDB average response time was about 1.5X that for MySQL. This includes results for MongoDB versions 1.6.2 and 1.7.0.
The code to setup the collection for MongoDB is:
def setup_db(host, port, dbname, rows):
conn = pymongo.Connection(host, port)
conn.drop_database(dbname)
db = conn[dbname]
for x in xrange(0, rows):
sx = str(x)
lsx = len(sx)
db.c.save({'_id':x, 'k':x, 'c':sx+'x'*(120 - lsx), 'pad':sx+'y'*(120 - lsx)})
if x % 1000 == 0:
print '... row %d' % xAnd the code to query MongoDB:
def query_process(host, port, pipe_to_parent, requests_per, dbname, rows, check, id):
conn = pymongo.Connection(host, port)
db = conn[dbname]
gets = 0
while True:
for loop in xrange(0, requests_per):
target = random.randrange(0, rows)
o = db.c.find_one({'_id': target})
assert o['_id'] == target
if check:
assert o['k'] == target
sx = str(o['id'])
lsx = len(sx)
assert o['c'] == sx+'x'*(120-lsx)
assert o['pad'] == sx+'y'*(120-lsx)
gets += 1
if pipe_to_parent.poll():
msg = pipe_to_parent.recv()
print 'Received: %s' % msg
pipe_to_parent.send({'gets' : gets})
pipe_to_parent.close()
returnThe my.cnf settings for MySQL 5.1:
plugin-load=innodb=ha_innodb_plugin.so;innodb_trx=ha_innodb_plugin.so;innodb_locks=ha_innodb_plugin.so;innodb_lock_waits=ha_innodb_plugin.so;innodb_cmp=ha_innodb_plugin.so;innodb_cmp_reset=ha_innodb_plugin.so;innodb_cmpmem=ha_innodb_plugin.so;innodb_cmpmem_reset=ha_innodb_plugin.so
innodb_buffer_pool_size=2000M
innodb_log_file_size=100M
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=1
innodb_flush_method=O_DIRECT
innodb_thread_concurrency=0
innodb_file_format=barracuda
innodb_file_per_table
max_connections=2000
table_cache=2000
key_buffer_size=2000M
Wednesday, September 8, 2010
SBR and innodb_autoinc_lock_mode
The 5.1 manual states:
Therefore, if you are using statement-based replication, you must either avoid INSERT ... ON DUPLICATE KEY UPDATE or use innodb_autoinc_lock_mode = 0And earlier in the same page is the description of innodb_autoinc_lock_mode=0
This lock mode is provided only for backward compatibility and performance testing. There is little reason to use this lock mode unless you use “mixed-mode inserts” and care about the important difference in semantics described later.I don't think these statements agree. I am confused even more because upgrading a master-slave pair of servers to 5.1 with innodb_autoinc_lock_mode=1 and statement-based replication fixed a logical corruption problem that occurs when the wrong value was written to the binlog for SET INSERT_ID=... as part of a transaction that does INSERT ... ON DUPLICATE KEY UPDATE.There are a few details on that at bug 50413.
Thursday, September 2, 2010
Speaking at MySQL Sunday
I am speaking at MySQL Sunday. The title for my talk is Success with MySQL and I will focus on things that operations and users can do to make a MySQL deployment succeed. There are many interesting talks scheduled for Sunday, including several at the same time as mine. I hope to see you there.
Subscribe to:
Posts (Atom)









