I am running a benchmark with MySQL on a Linux server with 16GB of RAM. I set innodb_buffer_pool_size=12G, and the process size is 13.6GB. What else uses all of the memory? Domas investigated the problem, and one of the causes is that the per-page overhead in the InnoDB buffer pool is high.
But a 13.6GB process shouldn't be a problem on a 16GB box, should it? Well, the Linux VM tends to generate too many page-ins and page-outs in this case. Don, Kevin, Dathan, Peter, Peter, and Peter have written about this. In my case, the sum of page-in and page-out activity was more than 500kb per second, as reported in the si and so columns (449 and 413 below). The sum of si and so is ~862kb, and assuming 4kb pages, that is more than 200 page reads and writes per second. I hope that does not translate to 200 IOPs of disk activity.
The first problem is to understand the numbers. I wanted to understand the output of vmstat, vmstat -a and vmstat -sa. The Linux vmstat man page is not exactly clear.

procs -----------memory---------- ---swap-- -----io----
r b swpd free buff cache si so bi bo
1 3 530064 128388 13024 2434336 449 413 636 17316
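The arithmetic above can be checked by parsing the sample line; in this vmstat layout, si and so are columns 7 and 8, reported in kb per second, and a Linux page is 4kb:

```shell
# Sum si and so (columns 7 and 8, in KB/s) from the vmstat sample above,
# then divide by the 4 KB page size to estimate page operations per second.
sample='1 3 530064 128388 13024 2434336 449 413 636 17316'
echo "$sample" | awk '{ kb = $7 + $8; printf "%d KB/s, ~%d pages/s\n", kb, kb / 4 }'
# 862 KB/s, ~215 pages/s
```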
Memory

The man page definitions are:

swpd: the amount of virtual memory used.
free: the amount of idle memory.
buff: the amount of memory used as buffers.
cache: the amount of memory used as cache.
inact: the amount of inactive memory. (-a option)
active: the amount of active memory. (-a option)

Better and possibly accurate definitions for these are (sizes are in 1kb blocks):

- swpd - amount of swap space in use
- free - unallocated physical memory
- buff - size of the cache used for filesystem metadata blocks
- cache - size of the cache for filesystem data blocks
- inact - amount of memory the kernel considers to be inactive and eligible to be written to swap space or to the filesystem. Inactive blocks may be dirty.
- active - see inact
What do these mean? All numbers are for 1kb blocks by default.

# vmstat -sa
16226644 total memory
16100852 used memory
13576424 active memory
2340580 inactive memory
125792 free memory
36956 buffer memory
2417564 swap cache
530104 total swap
420700 used swap
109404 free swap
- total memory - physical memory
- used memory - allocated physical memory, the free column in vmstat output is total memory - used memory
- active memory - amount of memory the kernel considers to be in active use
- inactive memory - see active memory, may include dirty pages
- buffer memory - size of the cache for file-system metadata
- swap cache - RedHat describes this as the read-write cache for swap, but the value matches the cache column in vmstat output, so I think this is the size of the page cache used for files.
- total swap - total swap space
- used swap - used swap space, this is the swpd column in vmstat output
- free swap - unused swap space
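The identities above can be confirmed against the sample output: free memory is total memory minus used memory, and free swap is total swap minus used swap.

```shell
# Check the identities using the vmstat -sa numbers above (1 KB blocks).
total=16226644; used=16100852
swap_total=530104; swap_used=420700
echo $(( total - used ))            # matches the "free memory" line: 125792
echo $(( swap_total - swap_used ))  # matches the "free swap" line: 109404
```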
What can be done about this?

- reduce the size of the InnoDB buffer cache. I know that there was no page-in and page-out activity when I ran the benchmark with innodb_buffer_pool_size=8G
- tune Linux: echo 0 > /proc/sys/vm/swappiness. This eventually removed all page-in and page-out activity and increased performance by 50%.
- use Solaris as some people have reported that this problem doesn't occur there
- use O_DIRECT, but Kevin reported that writes still go through the page cache and this will eventually lead to page-out activity. For unmodified InnoDB there is a good reason not to use O_DIRECT: a single thread using synchronous IO writes the dirty pages. That thread becomes a bottleneck on a busy server and cannot keep many disks and many CPU cores busy. This is not a performance problem when buffered IO is used, as writes usually complete quickly into the OS buffer cache.
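For reference, O_DIRECT is selected in my.cnf via innodb_flush_method. This is a sketch of the relevant fragment, using the 12G buffer pool size from this benchmark:

```
[mysqld]
# bypass the OS page cache for InnoDB data file reads and writes
innodb_flush_method     = O_DIRECT
innodb_buffer_pool_size = 12G
```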