As background, the MySQL optimizer uses two types of table statistics
- predicate selectivity - computed per-query by handler::records_in_range. Determines the number of rows that will be fetched from a table. Evaluated for the single table predicates in the WHERE clause and only for columns that are indexed.
- index cardinality - uses the data displayed by SHOW INDEXES. Influences join order. Used with the join conditions to determine join selectivity.
- predicate selectivity is computed by probing the index and accesses one or two leaf nodes per probe.
- index cardinality is computed by sampling 8 leaf nodes per index. This is done the first time a table is opened, and after many rows have been changed for a table. Because this is very fast, index cardinality stats are always available whether or not ANALYZE TABLE has been run.
- index cardinality stats are not available until ANALYZE TABLE has been run. If you forget to run it for transient tables, then you are likely to get bad query plans.
- predicate selectivity is accurate before ANALYZE TABLE has been run.
Examples below show that index cardinality stats are available for InnoDB but not for MyISAM and Heap tables. First, create the tables for the experiment.
drop table if exists n, st, st2, it, it2;Display index cardinality for MyISAM and Heap tables. Note that the value for Cardinality is NULL except for the primary key columns.
create table n(i int);
insert into n values (0), (1), (2), (3), (4), (5), (6), (7), (8);
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;
insert into n select * from n;
create temporary table tt2 (i int, p int not null auto_increment, index(i,p), primary key (p));
create table st2 (i int, p int not null auto_increment, index(i,p), primary key (p));
create table it2 (i int, p int not null auto_increment, index(i,p), primary key (p)) engine=innodb;
insert into tt2 select 0, null from n;
insert into st2 select 0, null from n;
insert into it2 select 0, null from n;
show indexes from tt2;
--------------
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment
tt2 0 PRIMARY 1 p A 144 NULL NULL BTREE
tt2 1 i 1 i A NULL NULL NULL YES BTREE
tt2 1 i 2 p A NULL NULL NULL BTREE
show indexes from st2;
--------------
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment
st2 0 PRIMARY 1 p A 144 NULL NULL BTREE
st2 1 i 1 i A NULL NULL NULL YES BTREE
st2 1 i 2 p A NULL NULL NULL BTREE
Display index cardinality for the InnoDB table. Note that the values for Cardinality are valid.
show indexes from it2;MyISAM, Heap and InnoDB are able to determine predicate selectivity without having run analyze table. The where clause selects all of the rows in the table.
--------------
Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_part Packed Null Index_type Comment
it2 0 PRIMARY 1 p A 144 NULL NULL BTREE
it2 1 i 1 i A 144 NULL NULL YES BTREE
it2 1 i 2 p A 144 NULL NULL BTREE
explain select * from tt2 where i = 0;MyISAM, Heap and InnoDB are able to determine predicate selectivity without having run analyze table. The where clause selects none of the rows in the table.
--------------
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tt2 index i i 9 NULL 108 Using where; Using index
explain select * from st2 where i = 0;
--------------
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE st2 index i i 9 NULL 108 Using where; Using index
explain select * from it2 where i = 0;
--------------
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE it2 ref i i 5 const 72 Using where; Using index
explain select * from tt2 where i = 1;
--------------
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE tt2 ref i i 5 const 1 Using where; Using index
explain select * from st2 where i = 1;
--------------
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE st2 ref i i 5 const 1 Using where; Using index
explain select * from it2 where i = 1;
--------------
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE it2 ref i i 5 const 1 Using where; Using index

4 comments:
Hi Mark,
Indeed - I've always found this slightly curious as well.
I will note one thing - if you are using HEAP tables with HASH based indexes then the cardinality *is* automatically updated - although it's not a 'true' cardinality, it's just an estimation based on total values and hash buckets - see ha_heap::update_key_stats()..
Cheers,
And another quick update.. I found something else curious relating to this which I just reported as a bug:
http://bugs.mysql.com/bug.php?id=34488
It seems that if you do any form of DML before the first ANALYZE, then the cardinality stats can get way out of line..
Great tip, thank you very much!
Also, I notice that running ANALYZE doesn't affect the cardinality until the table has data. I suppose that the caridnality isn't useful until there is data, but this means you can't just run 'myisamchk --analyze' on mysql startup and expect your tables to have valid indexes, even after data is entered into the tables.
Post a Comment