0

I ran into a problem today that I couldn't quite understand, so I was hoping for some outside knowledge. I was trying to count the rows in one table whose ID isn't referenced in another table. I ran two different queries and got conflicting results.

select count(*) from TableA where ID not in (select aID from TableB) 

returns 0

select count(*) from TableA a left join TableB b on b.aID = a.ID where b.aID is null 

returns a few thousand.

All IDs in both TableA and TableB are unique. An ID from TableA never shows up in the aID column from TableB more than once. To me, it seems like I am querying the same thing but receiving different results. Where am I going wrong?

  • Does this answer your question? What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL? Commented Nov 17, 2023 at 0:55
  • Debug questions require a minimal reproducible example--cut & paste & runnable code including initialization; desired & actual output (including verbatim error messages); tags & versions; clear specification & explanation. For SQL include DDL & tabular initialization code. For debug that includes the least code you can give that is code that you show is OK extended by code that you show is not OK. See How to Ask and the Help center. When you get a result you don't expect, pause your overall goal, chop to the 1st subexpression with unexpected result & say what you expected & why, justified by documentation. (Debugging fundamental.) Commented Nov 17, 2023 at 5:24
  • Clearly this can be expected to be a faq. Please before considering posting: Pin down code issues via minimal reproducible example. Read manuals/references & google error messages & many clear, concise & precise phrasings of your question/problem/goal, with & without your particular names/strings/numbers, 'site:stackoverflow.com' & tags; read many answers. SO/SE search is unusual, read the help. Google re googling/searching, including Q&A at Meta Stack Overflow & Meta Stack Exchange. How much research effort is expected of Stack Overflow users? Reflect research in posts. Commented Nov 18, 2023 at 0:53

2 Answers

3

Do not use NOT IN with a subquery. If any value returned by the subquery is NULL, then all rows are filtered out. These are the rules of how NULL is defined in SQL. The LEFT JOIN is correct.

The reason is that NULL means an unknown value. Almost any comparison with NULL returns NULL, which a WHERE clause treats as false. ID NOT IN (x, y, NULL) is equivalent to ID <> x AND ID <> y AND ID <> NULL, so there are only two possibilities: either some element matches the value you are looking for, and the expression returns false, or no element matches, and the comparison with NULL makes the whole expression NULL, which is treated as false. Either way, the row is filtered out.
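To make the effect concrete, here is a minimal sketch that reproduces the discrepancy. The table definitions and sample rows are assumptions made up for illustration (the question does not show its DDL or data); the only important detail is that TableB.aID contains one NULL.

```sql
-- Hypothetical schema and data, not taken from the question.
CREATE TABLE TableA (ID INT PRIMARY KEY);
CREATE TABLE TableB (aID INT);            -- nullable reference to TableA.ID

INSERT INTO TableA (ID) VALUES (1), (2), (3);
INSERT INTO TableB (aID) VALUES (1), (NULL);

-- For ID = 2 the predicate is (2 <> 1) AND (2 <> NULL),
-- i.e. TRUE AND UNKNOWN, which is UNKNOWN, so the row is rejected.
-- For ID = 1 it is FALSE AND UNKNOWN, which is FALSE.
-- No row can ever pass, so the count is 0.
SELECT COUNT(*) AS not_in_count
FROM TableA
WHERE ID NOT IN (SELECT aID FROM TableB);

-- The join condition b.aID = a.ID never matches the NULL row,
-- so IDs 2 and 3 stay unmatched and the count is 2.
SELECT COUNT(*) AS left_join_count
FROM TableA a
LEFT JOIN TableB b ON b.aID = a.ID
WHERE b.aID IS NULL;
```

The multi-row VALUES syntax is accepted by most current engines (PostgreSQL, MySQL, SQLite, SQL Server); on others, the inserts may need to be split into one statement per row.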

I usually advise replacing the NOT IN with NOT EXISTS:

select count(*) from TableA a where not exists (select 1 from TableB b where b.aID = a.ID); 

The LEFT JOIN also returns the correct result and usually performs well.
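As a rough check, the sketch below runs NOT EXISTS against a small data set in which TableB.aID contains a NULL, next to a NOT IN whose subquery filters the NULLs out. The schema and rows are hypothetical, chosen only for illustration, and DROP TABLE IF EXISTS is assumed to be available in your database.

```sql
-- Hypothetical schema and data, not taken from the question.
DROP TABLE IF EXISTS TableB;
DROP TABLE IF EXISTS TableA;

CREATE TABLE TableA (ID INT PRIMARY KEY);
CREATE TABLE TableB (aID INT);            -- nullable reference to TableA.ID

INSERT INTO TableA (ID) VALUES (1), (2), (3);
INSERT INTO TableB (aID) VALUES (1), (NULL);

-- NOT EXISTS only asks whether a matching row exists. The NULL row
-- never satisfies b.aID = a.ID, so it cannot affect the outcome.
-- Counts IDs 2 and 3, giving 2.
SELECT COUNT(*) AS not_exists_count
FROM TableA a
WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.aID = a.ID);

-- NOT IN gives the same answer once NULLs are removed from the subquery.
-- Also 2.
SELECT COUNT(*) AS filtered_not_in_count
FROM TableA
WHERE ID NOT IN (SELECT aID FROM TableB WHERE aID IS NOT NULL);
```

Both queries agree with the LEFT JOIN ... IS NULL form on this data; which one the optimizer handles best depends on the database.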


2 Comments

Ah, that makes perfect sense. When I changed it to ID not in (select aID from TableB where aID is not null), it worked too. But using exists is probably a good habit for me to get into.
@bds212 . . . The WHERE clause definitely fixes the problem. I prefer NOT EXISTS, though: if you get in the habit of using it, you just don't have to worry about NULL values.
0

We should always use the EXISTS operator if the columns involved are nullable. EXISTS can also be faster than IN, although that depends on the database and its optimizer.

Using IN / NOT IN might produce an inferior plan, and NOT IN can return misleading results if a NULL value is present in the subquery's column, just like in your case.
