Representing 'not in' subquery as join

Question

I am trying to convert the following query:

select * from employees where emp_id not in (select distinct emp_id from managers);

into a form where I represent the subquery as a join. I tried doing:

select * from employees a, (select distinct emp_id from managers) b where a.emp_id!=b.emp_id;

I also tried:

select * from employees a, (select distinct emp_id from managers) b where a.emp_id not in b.emp_id;

But it does not give the same result. I have tried the 'INNER JOIN' syntax as well, but to no avail. I have become frustrated with this seemingly simple problem. Any help would be appreciated.

blog.codinghorror.com/a-visual-explanation-of-sql-joins This has helped me, and others, more than I care to mention.
– xQbert, Commented Jul 30, 2014 at 20:35
@xQbert Thank you! That does help. However, can somebody explain to me why my approach didn't work?
– Moon_Watcher, Commented Jul 30, 2014 at 20:38

xQbert · Accepted Answer · 2014-07-30 20:56:39Z

Assume an employee data set of

Emp_ID: 1, 2, 3, 4, 5, 6, 7

Assume a manager data set of

Emp_ID: 1, 2, 3, 4, 5, 8, 9

select * from employees where emp_id not in (select distinct emp_id from managers);

The above isn't joining tables, so no Cartesian product is generated. You just have the 7 employee records you're looking at.

The above would return 6 and 7. Why? Only 6 and 7 from the employee data aren't in the managers table. 8 and 9 in managers are ignored because you're only returning data from employees.
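To check this against the sample data, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The single-column table layout is an assumption made for illustration; the real tables presumably have more columns.

```python
import sqlite3

# In-memory database loaded with the sample data from this answer.
# Assumption: each table has only an emp_id column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER);
    CREATE TABLE managers  (emp_id INTEGER);
    INSERT INTO employees VALUES (1),(2),(3),(4),(5),(6),(7);
    INSERT INTO managers  VALUES (1),(2),(3),(4),(5),(8),(9);
""")

# The original NOT IN query: only the employees table is scanned,
# so at most 7 rows can come back.
rows = conn.execute(
    "SELECT emp_id FROM employees "
    "WHERE emp_id NOT IN (SELECT DISTINCT emp_id FROM managers) "
    "ORDER BY emp_id"
).fetchall()

not_in_ids = [r[0] for r in rows]
print(not_in_ids)
```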

select * from employees a, (select distinct emp_id from managers) b where a.emp_id!=b.emp_id;

The above didn't work because a Cartesian product is generated: every employee is paired with every manager (with 7 records in each table, 7*7=49 rows). So instead of evaluating only the employee data, as in the first query, you now compare every manager against every employee.

so Select * results in

1,1 1,2 1,3 1,4 1,5 1,8 1,9 2,1 2,2...

Less the rows where the where clause finds a match (1,1 through 5,5 in the sample data), so 7*7-5, or 44 rows. Every employee still shows up, paired with each manager it doesn't equal, so it's not what you wanted.
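The effect of the != filter on the Cartesian product can be counted directly. This is a sketch under the same assumptions as the sample data (single-column tables, SQLite via Python's sqlite3 module); the variable names are illustrative only.

```python
import sqlite3

# Sample data from this answer; single-column tables are an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER);
    CREATE TABLE managers  (emp_id INTEGER);
    INSERT INTO employees VALUES (1),(2),(3),(4),(5),(6),(7);
    INSERT INTO managers  VALUES (1),(2),(3),(4),(5),(8),(9);
""")

# The comma-separated FROM list from the question: a Cartesian product.
cross = "FROM employees a, (SELECT DISTINCT emp_id FROM managers) b"

# Every (employee, manager) pair before any filtering.
total_pairs = conn.execute(f"SELECT COUNT(*) {cross}").fetchone()[0]

# Pairs left once the equal pairs are filtered out.
surviving_pairs = conn.execute(
    f"SELECT COUNT(*) {cross} WHERE a.emp_id != b.emp_id"
).fetchone()[0]

# Distinct employees that still appear in the filtered result.
surviving_employees = conn.execute(
    f"SELECT COUNT(DISTINCT a.emp_id) {cross} WHERE a.emp_id != b.emp_id"
).fetchone()[0]

print(total_pairs, surviving_pairs, surviving_employees)
```

The last number is the telling one: all seven employees survive, because != only removes individual pairs and never removes an employee outright.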

I also tried:

select * from employees a, (select distinct emp_id from managers) b where a.emp_id not in b.emp_id;

Again a Cartesian product: all of employees paired with all of managers.

So this is why a left join works

SELECT e.* FROM employees e LEFT OUTER JOIN managers m on e.emp_id = m.emp_id WHERE m.emp_id is null

This says join on ID first, so it doesn't generate a Cartesian product but actually joins on a value to limit the results. Because it's a LEFT join, it returns EVERYTHING from the left table (employees) and only the matching rows from managers.

So in our example, joining on e.emp_id = m.emp_id returns (e.emp_id, m.emp_id):

1,1 2,2 3,3 4,4 5,5 6,NULL 7,NULL

Now the where clause (m.emp_id is null) is applied, so only

6,NULL 7,NULL

are retained.
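The two stages described above (join first, then filter on NULL) can be reproduced separately. This is a minimal sketch with Python's sqlite3 module, again assuming single-column tables that hold the sample data.

```python
import sqlite3

# Sample data from this answer; single-column tables are an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER);
    CREATE TABLE managers  (emp_id INTEGER);
    INSERT INTO employees VALUES (1),(2),(3),(4),(5),(6),(7);
    INSERT INTO managers  VALUES (1),(2),(3),(4),(5),(8),(9);
""")

# Stage 1: the LEFT OUTER JOIN alone. Every employee appears once;
# employees without a manager row get NULL (None in Python) on the right.
joined = conn.execute(
    "SELECT e.emp_id, m.emp_id FROM employees e "
    "LEFT OUTER JOIN managers m ON e.emp_id = m.emp_id "
    "ORDER BY e.emp_id"
).fetchall()

# Stage 2: keep only the rows where no manager matched.
non_managers = conn.execute(
    "SELECT e.* FROM employees e "
    "LEFT OUTER JOIN managers m ON e.emp_id = m.emp_id "
    "WHERE m.emp_id IS NULL "
    "ORDER BY e.emp_id"
).fetchall()

print(joined)
print(non_managers)
```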

Older, pre-ANSI join syntax in some databases (Sybase and older SQL Server, for example) wrote a left join as *= in the where clause:

select * from employees a, managers b where a.emp_id *= b.emp_id and b.emp_id is null; -- I never remember if the * marks the LEFT side, so it may be =*

But I find this notation harder to read as the join can get mixed in with the other limiting criteria...

Vulcronos · Accepted Answer · 2014-07-30 20:29:33Z

Try this:

select e.* from employees e left join managers m on e.emp_id = m.emp_id where m.emp_id is null

This will join the two tables. Then we discard all rows where we found a matching manager and are left with employees who aren't managers.

I'd like to choose this as the answer, but I don't have enough reputation to do so. Sorry!

Maurice Reeves · Accepted Answer · 2014-07-30 20:30:48Z

Your best bet would probably be a left join:

select e.* from employees e left join managers m on e.emp_id = m.emp_id where m.emp_id is null;

The idea here is that you select everything from employees, including anything that matches in the managers table based on emp_id, and then filter out the rows that actually have something in the managers table.

Milen · Accepted Answer · 2014-07-30 20:47:02Z

Use Left Outer Join instead

select e.* from employees e left outer join managers m on e.emp_id = m.emp_id where m.emp_id is null

left outer join will preserve the rows from the e table even if they do not have a match in the m table, based on the emp_id field. Then we filter on where m.emp_id is null: give me all the rows from e where there's no matching record in the m table. A bit more on the subject can be found here:

Visual representation of joins

from employees a, (select distinct emp_id from managers) b implies a cross join: all possible combinations between the two tables (and you needed a left outer join instead)
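To see that the comma-separated FROM list really is a cross join, the sketch below runs both spellings and compares the rows. It assumes SQLite (through Python's sqlite3 module) and single-column tables holding the sample data used elsewhere on this page.

```python
import sqlite3

# Sample data; single-column tables are an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER);
    CREATE TABLE managers  (emp_id INTEGER);
    INSERT INTO employees VALUES (1),(2),(3),(4),(5),(6),(7);
    INSERT INTO managers  VALUES (1),(2),(3),(4),(5),(8),(9);
""")

# Implicit form from the question: tables separated by a comma.
comma_rows = conn.execute(
    "SELECT a.emp_id, b.emp_id "
    "FROM employees a, (SELECT DISTINCT emp_id FROM managers) b "
    "ORDER BY 1, 2"
).fetchall()

# Explicit form: the same pairing spelled out with CROSS JOIN.
cross_rows = conn.execute(
    "SELECT a.emp_id, b.emp_id "
    "FROM employees a CROSS JOIN (SELECT DISTINCT emp_id FROM managers) b "
    "ORDER BY 1, 2"
).fetchall()

same_result = comma_rows == cross_rows
print(same_result, len(cross_rows))
```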

Thank you! That works. But can you explain to me why my approach wasn't working? The logic seemed alright to me.

hd1 · Accepted Answer · 2014-07-30 20:31:28Z

The MINUS keyword should do the trick:

SELECT e.* FROM employees e MINUS Select m.* FROM managers m

Hope that helps...
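A note on portability: MINUS is Oracle's spelling of the set-difference operator, while the SQL standard and engines such as PostgreSQL, SQL Server and SQLite call it EXCEPT. Set difference compares whole rows, so subtracting m.* from e.* only behaves like the NOT IN query when both tables have the same column layout; selecting just emp_id on both sides avoids that dependency. Here is a sketch using EXCEPT in SQLite through Python's sqlite3 module, with single-column sample tables assumed for illustration.

```python
import sqlite3

# Sample data; single-column tables are an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER);
    CREATE TABLE managers  (emp_id INTEGER);
    INSERT INTO employees VALUES (1),(2),(3),(4),(5),(6),(7);
    INSERT INTO managers  VALUES (1),(2),(3),(4),(5),(8),(9);
""")

# EXCEPT returns rows of the first query that are absent from the second.
# On Oracle the keyword would be MINUS instead.
rows = conn.execute(
    "SELECT emp_id FROM employees "
    "EXCEPT "
    "SELECT emp_id FROM managers "
    "ORDER BY emp_id"
).fetchall()

except_ids = [r[0] for r in rows]
print(except_ids)
```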

Milen · Accepted Answer · 2014-07-30 20:37:59Z


select * from employees where Not (emp_id in (select distinct emp_id from managers));

answered Jul 30, 2014 at 20:34 by L.Pursell; edited Jul 30, 2014 at 20:37 by Milen
