2

I've seen a few questions dabbling with the inefficiency of "NOT IN" in MySQL queries, but I didn't manage to reproduce the proposed solutions.

So I've got some sort of search engine. It starts with very simple queries, and then tries more complicated ones if it doesn't find enough results. Here's how it works in pseudocode

list_of_ids = do_simple_search() nb_results = size_of(list_of_ids) if nb_results < max_nb_results : list_of_ids .= do_search_where_id_not_in(list_of_ids) if nb_results < max_nb_results : list_of_ids .= do_complicated_search_where_id_not_in(list_of_ids) 

Hope I'm clear. Anyway here's the slow query, as shown by MySQL-slow :

SELECT DISTINCT c.id FROM clients c LEFT JOIN communications co ON c.id = co.client_id WHERE (co.titre LIKE 'S' OR co.contenu LIKE 'S') AND c.id NOT IN(N) LIMIT N, N 

And here's an EXPLAIN on that query :

id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE c index PRIMARY PRIMARY 2 NULL 25250 Using where; Using index; Using temporary 1 SIMPLE co ref qui_com,id_client,titre id_client 2 klients.c.id 8 Using where; Distinct 

MySQL version is 5.1.63-0ubuntu0.11.04.1-log

Maybe my approach is wrong here ? How would you do it ? thanks.

3
  • are you having performance problems? Commented Jun 5, 2014 at 14:24
  • 1
    If you want us to help optimize a query, you need to show us the table and index definitions, as well as row counts for each of the tables. Maybe your tables are defined poorly. Maybe the indexes aren't created correctly. Maybe you don't have an index on that column you thought you did. Without seeing the table and index definitions, we can't tell. We also need row counts because that can affect query optimization greatly. If you know how to do an EXPLAIN or get an execution plan, put the results in the question as well. If you have no indexes, visit use-the-index-luke.com ASAP. Commented Jun 5, 2014 at 15:26
  • Thanks for the heads up, I'll try to figure out a bit by myself and then come back Commented Jun 6, 2014 at 12:49

1 Answer 1

1

A couple of remarks:

1) Why do you do LEFT JOIN i/o (INNER) JOIN? LEFT JOIN means that you want to also get the records which are not matched agains clients, is that the intention? If not, then JOIN i/o LEFT JOIN is quicker.

2) Why do you need JOIN at all if you can simply do:

SELECT DISTINCT co.client_id from communications co WHERE (co.titre LIKE 'S' OR co.contenu LIKE 'S') AND co.id!=N LIMIT N,N; 

Also, if you do a JOIN, both joined fields must be indexes, otherwise it's slow too.

More importantly, you condition both client_id and id from communications table, but there is no common index for these both, which means more work to execute your query (hence using temporary which is in general not a good sign).

3) You do a complex condition on both co.titre and co.contenu, you seem to have indexes but they are not used. That means this part might be potentially quite slow.

Sign up to request clarification or add additional context in comments.

3 Comments

Oh wow, I can't believe I overlooked that I didn't need that join. Thanks a lot, that's indeed simpler.
Damn, I just realized why I had to do the JOIN. In some cases, I'm checking for the clients' status too. So you suggest I replace my LEFT JOINs with JOINs. SQL has never been my strongest point :/
I thought something like that. You can still JOIN with the clients table and use fields from there, of course.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.