3

I have a MySQL table with ~17M rows where I end up doing a lot of aggregation queries.

For this example, let's say I have four indexes: index_on_b, index_on_c, compound_index_on_a_b, and compound_index_on_a_c.

I run EXPLAIN on a query:

EXPLAIN SELECT SUM(revenue) FROM table WHERE a = some_value AND b = other_value 

I find that the selected index is index_on_b, but when I add an index hint:

SELECT SUM(revenue) FROM table USE INDEX(compound_index_on_a_b) 

The query runs much faster. Is there anything I can change in the MySQL configuration to make MySQL prefer the compound indexes?

1
  • Please provide the actual SHOW CREATE TABLE and the SELECT. There could be subtle things such as datatype inconsistencies getting in the way. Also EXPLAIN SELECT ... Commented Jan 28, 2016 at 1:31

2 Answers

1

There are 2 possible routes you can take:

A) When the optimizer considers the candidate indexes to be equal, it resolves the tie based on the order in which the indexes were created. You could drop index_on_b, recreate it, and check whether the optimizer had simply treated the two indexes as equivalent.

Or

B) Use optimizer_search_depth (see https://mariadb.com/blog/setting-optimizer-search-depth-mysql). This parameter determines how much effort the optimizer is allowed to spend on a query plan, and with a different setting it might arrive at the much better plan of using the compound index.
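
A minimal sketch of both routes, for trying them on a test copy first. The table name my_table and the literal values 1 and 2 are placeholders, not taken from the question, and the column list (b) for index_on_b is inferred from the index name. Rebuilding an index on a ~17M-row table can take a while, and nothing here guarantees the plan will change.

```sql
-- Route A: drop and recreate index_on_b so it becomes the most recently
-- created index. my_table is a placeholder for the real table name.
ALTER TABLE my_table DROP INDEX index_on_b;
ALTER TABLE my_table ADD INDEX index_on_b (b);

-- Route B: inspect and change optimizer_search_depth for this session only.
SHOW VARIABLES LIKE 'optimizer_search_depth';
SET SESSION optimizer_search_depth = 0;  -- 0 lets the server pick a depth itself

-- After either change, check which index the optimizer now selects.
-- The values 1 and 2 stand in for some_value and other_value.
EXPLAIN SELECT SUM(revenue) FROM my_table WHERE a = 1 AND b = 2;
```

Using SET SESSION rather than SET GLOBAL keeps the experiment local to one connection, so it can be abandoned without touching the server configuration.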


8 Comments

What about index cardinality, and updating it with ANALYZE TABLE?
@Shadow: How the cardinality is used is part of the optimizer process and, as far as I know, cannot be influenced. ANALYZE TABLE only keeps it up to date (which is not a bad idea: it is best to keep this information current).
@NorbertvanNobelen Thanks for the reply. I was able to find a bunch of options for tweaking the query plan selection algorithm, but I still can't figure out what is causing MySQL to choose index merge or larger indexes over the compound index.
Despite being open source, which gives 100% insight into the optimizer algorithm, it is documented poorly, so that is a really hard question to answer. I have also noticed this tendency to use non-optimal indexes in different scenarios. This is a continuous battle in pretty much all relational DBMSs.
"Index merge" is almost always slower than the appropriate compound index.
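
The ANALYZE TABLE suggestion from the comments above can be tried roughly like this. The table name my_table is a placeholder; whether refreshed statistics change the chosen index in this case is not something the thread confirms.

```sql
-- Refresh the index statistics that the optimizer reads.
ANALYZE TABLE my_table;

-- The Cardinality column shows the estimated number of distinct values
-- per index part, which can then be compared across the four indexes.
SHOW INDEX FROM my_table;
```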
0

A possible explanation:

If a has the same value throughout the table, then INDEX(b) is actually better than INDEX(a,b). This is because the former is smaller, hence faster to work with. Note that both will return the same number of rows, even without further checking of a.
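
One way to test this explanation is to count distinct values directly instead of relying on the estimated statistics. This is only a sketch: my_table is a placeholder name, and on ~17M rows the query may be slow.

```sql
-- If distinct_a comes back as 1 (or very small), INDEX(b) filters about as
-- well as INDEX(a, b), which would make the optimizer's choice reasonable.
SELECT COUNT(*)          AS total_rows,
       COUNT(DISTINCT a) AS distinct_a,
       COUNT(DISTINCT b) AS distinct_b
FROM my_table;
```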

Please provide:

SHOW CREATE TABLE
SHOW INDEXES -- to see cardinality
EXPLAIN SELECT

1 Comment

Perhaps I didn't set the question up correctly, but in this case INDEX(a, b) is certainly 100x faster (found through experimentation).
