0

I'm new with mysql and am a little confused about what cardinality means, I read that it means the number or unique rows but I'd like to know what it does mean in this case, this is my table definition

+-------------------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------------+--------------+------+-----+---------+----------------+ | id | int(11) | NO | PRI | NULL | auto_increment | | revisado | varchar(10) | YES | MUL | NULL | | | total | int(11) | NO | MUL | NULL | | | busqueda | varchar(300) | NO | MUL | NULL | | | clave | bigint(15) | NO | | NULL | | | producto_servicio | varchar(300) | NO | | NULL | | +-------------------+--------------+------+-----+---------+----------------+ 

the total of records right now is 13621

I have this query

SELECT clave, producto_servicio FROM buscador_claves2 WHERE busqueda = 'FERRETERIA' AND total = 2 AND revisado = 'APROBADO'

And this the index definition of the table

+------------------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +------------------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | buscador_claves2 | 0 | PRIMARY | 1 | id | A | 14309 | NULL | NULL | | BTREE | | | buscador_claves2 | 1 | idx_busqueda | 1 | busqueda | A | 14309 | 255 | NULL | | BTREE | | | buscador_claves2 | 1 | idx_total | 1 | total | A | 3 | NULL | NULL | | BTREE | | | buscador_claves2 | 1 | idx_revisado | 1 | revisado | A | 1 | NULL | NULL | YES | BTREE | | | buscador_claves2 | 1 | idx_compuesto1 | 1 | revisado | A | 1 | NULL | NULL | YES | BTREE | | | buscador_claves2 | 1 | idx_compuesto1 | 2 | total | A | 105 | NULL | NULL | | BTREE | | | buscador_claves2 | 1 | idx_compuesto1 | 3 | busqueda | A | 14309 | 255 | NULL | | BTREE | | | buscador_claves2 | 1 | idx_compuesto2 | 1 | busqueda | A | 14309 | 255 | NULL | | BTREE | | | buscador_claves2 | 1 | idx_compuesto2 | 2 | total | A | 14309 | NULL | NULL | | BTREE | | | buscador_claves2 | 1 | idx_compuesto2 | 3 | revisado | A | 14309 | NULL | NULL | YES | BTREE | | +------------------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 

the query is taking idx_compuesto1 as the index to find the data, what means the cardinality in this case for the revisado, total and busqueda columns as part of the index idx_compuesto1? and why it takes idx_compuesto1 instead of idx_compuesto2, I can see the cardinality is different in both indexes

This is the output of the query explain

*************************** 1. row *************************** id: 1 select_type: SIMPLE table: buscador_claves2 type: ref possible_keys: idx_busqueda,idx_total,idx_revisado,idx_compuesto1,idx_compuesto2 key: idx_compuesto1 key_len: 804 ref: const,const,const rows: 1 Extra: Using where 

I hope you can help me to understand better this info, thank you.

2
  • Maybe start with this Wikipedia article. Cardinality refers to uniqueness of the data in a column. Low cardinality, e.g. in a bit column, means that an index won't be able to distinguish well between different rows, and vice-versa for high cardinality. Commented Mar 30, 2018 at 17:31
  • I find "cardinality" to be almost never of any use in optimizing queries. Commented Apr 5, 2018 at 2:56

1 Answer 1

1

In MySQL, the value of the index cardinality column is the storage engine estimate for the number of unique values in that index. It is used to determine how well this index can be used during joins. Generally MySQL optimizer prefers the index with a higher cardinality, because it usually means it is able to filter down to fewer rows. The ideal scenario is for the value of cardinality to be always equal to SELECT COUNT(DISTINCT the_key)..., but in practice it is usually off by some relatively small margin due to the difficulty of accurately computing this during normal database operations in an efficient manner that does not disrupt database performance. The value will be more accurate immediately after ANALYZE TABLE. Being off on cardinality begins to matter when the optimizer can choose more than one key for a particular join, it makes a huge difference in performance which one gets chosen, and the cardinality estimates for those keys are sufficiently off to cause the optimizer to choose the wrong key. Those situations are relatively rare, but do happen. In that case, the problem can be solved either with ANALYZE TABLE or - if you are always 100% sure which key is better for the join - by explicitly making the optimizer use it with FORCE KEY in the query.

Sign up to request clarification or add additional context in comments.

1 Comment

If your values are skewed significantly, the "cardinality" can be significantly far off.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.