2

Today I got a problem from a friend.

Problem - Write a SQL query using UNION ALL (not UNION) that uses the WHERE clause to eliminate duplicates.

  • I cannot use a GROUP BY expression.
  • I cannot use the UNIQUE or DISTINCT keywords.

Input -

    id (Table 1)
    1
    2

    fk_id (Table 2)
    1
    1
    2

I gave him the query below as a solution:

    select id
    from (
        select id,
               row_number() over (partition by id order by id) rn
        from (
            select id from T1
            union all
            select fk_ID id from T2
        )
    )
    where rn = 1;

Output -

    id
    1
    2

which generates unique ids.

Now the twist from him: I also cannot use row_number(). I have to use only a WHERE condition. I am writing the query on an Oracle database.

Please suggest.

Thanks in advance.

5 Comments
  • Interesting academic question with possibly no practical application at all. Commented Feb 14, 2017 at 12:05
  • Why can we not use row_number()? Commented Feb 14, 2017 at 12:11
  • @BobC According to the question, I have to use only the WHERE clause to get the desired results. Please suggest. Commented Feb 14, 2017 at 12:13
  • My question is why? Is this some sort of science experiment, or is there a real business case here? Commented Feb 14, 2017 at 12:16
  • No, this is not a real business case. He just got the question in an SQL exam. Commented Feb 14, 2017 at 12:17

3 Answers

4

From its name and the data shown, we can assume that id in table t1 is unique.

From its name and the data shown, we can assume that fk_id in table t2 is a foreign key to t1.id.

So the union of the IDs in the two tables is simply the set of IDs that we find in table t1.

As we are forced to use UNION ALL on the two tables, though, we can use a pseudo UNION ALL not adding anything:

    select id from t1
    union all
    select fk_id from t2 where 1 = 2;
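To try the pseudo UNION ALL without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (an assumption; the question targets Oracle). The table definitions, including the primary key on t1.id and the foreign key on t2.fk_id, are assumed from the reasoning above, and the rows are the sample data from the question.

```python
import sqlite3

# SQLite stands in for Oracle here; the schema is an assumption
# derived from the answer's premise (t1.id unique, t2.fk_id -> t1.id).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1 (id integer primary key);
    create table t2 (fk_id integer references t1 (id));
    insert into t1 (id) values (1), (2);
    insert into t2 (fk_id) values (1), (1), (2);
""")

# The second branch contributes no rows because 1 = 2 is never true,
# so the result is exactly the ids of t1.
rows = conn.execute("""
    select id from t1
    union all
    select fk_id from t2 where 1 = 2
""").fetchall()

ids = sorted(r[0] for r in rows)
print(ids)  # [1, 2]
```

The result is sorted in Python only to make the output deterministic; the query itself has no ORDER BY.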

If t2.fk_id were not a foreign key referencing t1.id, we would use NOT EXISTS or NOT IN in the WHERE clause instead. For that to give a result without duplicates, however, t2 itself must not contain duplicates to begin with. (As your sample data shows that duplicate values do exist in t2, this approach would not work in that case.) Here is a query returning the unique values from t1 plus those unique values from t2 that do not appear in t1:

    select id from t1
    union all
    select fk_id from t2
    where fk_id not in (select id from t1);
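The caveat about duplicates inside t2 can be checked with a small experiment. This is a sketch only, again assuming SQLite (via Python's sqlite3) as a stand-in for Oracle; the helper `union_all_not_in` and the extra value 3 are hypothetical and not part of the question's data.

```python
import sqlite3

# The NOT IN query from above, unchanged apart from layout.
QUERY = """
    select id from t1
    union all
    select fk_id from t2
    where fk_id not in (select id from t1)
"""

def union_all_not_in(t1_ids, t2_ids):
    """Load the given ids into fresh t1/t2 tables and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t1 (id integer)")
    conn.execute("create table t2 (fk_id integer)")
    conn.executemany("insert into t1 values (?)", [(i,) for i in t1_ids])
    conn.executemany("insert into t2 values (?)", [(i,) for i in t2_ids])
    result = sorted(row[0] for row in conn.execute(QUERY))
    conn.close()
    return result

# t2 values that also exist in t1 are filtered out, duplicates included.
print(union_all_not_in([1, 2], [1, 1, 2, 3]))     # [1, 2, 3]

# A value that is duplicated in t2 but missing from t1 survives twice.
print(union_all_not_in([1, 2], [1, 1, 2, 3, 3]))  # [1, 2, 3, 3]
```

The second call shows the limitation: the WHERE clause can only remove t2 rows that match t1, not rows that duplicate each other within t2.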

6 Comments

Can you please write the query for when there is no primary key/foreign key relationship between the tables?
This query will work in your case too, even if there is no primary key/foreign key relationship.
Sorry, I thought the query was so simple that the hint on using NOT IN / NOT EXISTS would suffice. I've updated my answer.
Sorry again. I was confused. When there are duplicate values in t2 (as your sample data is showing) and they are not referencing the t1 values (so you could have a duplicate value that is not in t1), then you cannot get rid of them without UNION / DISTINCT / GROUP BY or any such approach. You'd have to make them distinct somehow, which the WHERE clause alone cannot provide.
If this is Oracle only and Oracle's proprietary ROWNUM or ROWID are allowed, then see Aleksej's answer. (Using ROWID instead of generating a unique ID with ROWNUM would be the much simpler approach, however.)
2

In a more generic case, where you can have duplicates in both tables, this could be a way.

test data:

    create table table1(id) as (
        select 1 from dual union all
        select 1 from dual union all
        select 2 from dual union all
        select 2 from dual union all
        select 1 from dual
    );

    create table table2(fk_id) as (
        select 1 from dual union all
        select 1 from dual union all
        select 1 from dual union all
        select 3 from dual union all
        select 4 from dual union all
        select 1 from dual union all
        select 4 from dual union all
        select 2 from dual
    );

query:

    with tab1_union_all_tab2 as (
        select 'tab1'||rownum as uniqueId, id from table1
        UNION ALL
        select 'tab2'||rownum, fk_id from table2
    )
    select id
    from tab1_union_all_tab2 u1
    where not exists (
        select 1
        from tab1_union_all_tab2 u2
        where u1.id = u2.id
          and u1.uniqueId < u2.uniqueId
    )

result:

            ID
    ----------
             3
             4
             1
             2

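This should clarify the idea behind it. For each row, the query lists the rows with the same id and a greater uniqueId; only rows for which no such row exists are kept: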

    with tab1_union_all_tab2 as (
        select 'tab1'||rownum as uniqueId, id from table1
        UNION ALL
        select 'tab2'||rownum, fk_id from table2
    )
    select uniqueId,
           id,
           (
               select nvl(listagg(uniqueId, ', ') within group (order by uniqueId), 'NO DUPLICATES')
               from tab1_union_all_tab2 u2
               where u1.id = u2.id
                 and u1.uniqueId < u2.uniqueId
           ) duplicates
    from tab1_union_all_tab2 u1

    UNIQUEID           ID DUPLICATES
    ---------- ---------- --------------------------------------------------
    tab11               1 tab12, tab15, tab21, tab22, tab23, tab26
    tab12               1 tab15, tab21, tab22, tab23, tab26
    tab13               2 tab14, tab28
    tab14               2 tab28
    tab15               1 tab21, tab22, tab23, tab26
    tab21               1 tab22, tab23, tab26
    tab22               1 tab23, tab26
    tab23               1 tab26
    tab24               3 NO DUPLICATES
    tab25               4 tab27
    tab26               1 NO DUPLICATES
    tab27               4 NO DUPLICATES
    tab28               2 NO DUPLICATES

As rightly observed by Thorsten Kettner, you can easily edit this to use rowid instead of building a unique id by concatenating a string and the rownum:

    with tab1_union_all_tab2 as (
        select rowid uniqueId, id from table1
        UNION ALL
        select rowid, fk_id from table2
    )
    select id
    from tab1_union_all_tab2 u1
    where not exists (
        select 1
        from tab1_union_all_tab2 u2
        where u1.id = u2.id
          and u1.uniqueId < u2.uniqueId
    )
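For readers without an Oracle instance, the NOT EXISTS idea can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions, not the Oracle solution itself: SQLite's rowid is only unique within a single table, so a 'tab1'/'tab2' prefix is concatenated to it, much like the ROWNUM variant earlier in this answer. The rows match the test data above.

```python
import sqlite3

# SQLite stands in for Oracle (assumption); same test data as above.
conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id integer)")
conn.execute("create table table2 (fk_id integer)")
conn.executemany("insert into table1 values (?)",
                 [(i,) for i in (1, 1, 2, 2, 1)])
conn.executemany("insert into table2 values (?)",
                 [(i,) for i in (1, 1, 1, 3, 4, 1, 4, 2)])

# Each row gets a unique text key. A row is kept only if no other row
# has the same id and a greater key, so exactly one row per id survives.
rows = conn.execute("""
    with tab1_union_all_tab2 as (
        select 'tab1' || rowid as uniqueId, id from table1
        union all
        select 'tab2' || rowid, fk_id from table2
    )
    select id
    from tab1_union_all_tab2 u1
    where not exists (
        select 1
        from tab1_union_all_tab2 u2
        where u1.id = u2.id
          and u1.uniqueId < u2.uniqueId
    )
""").fetchall()

deduplicated = sorted(r[0] for r in rows)
print(deduplicated)  # [1, 2, 3, 4]
```

The keys are compared as strings, which is enough here because the comparison only needs some total order over unique values, not a numeric one.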

2 Comments

For Oracle this is a great solution, working even with possible duplicate values in both tables. (The same can be done by simply using Oracle's ROWID, by the way, so you don't have to generate a unique row ID yourself.) The answer would have been even better with a brief explanation of what the query is doing and why you decided on this approach. +1 from me.
@Aleksej Thanks for your effort
1

Write a WHERE clause for the second SELECT in the UNION ALL, such as where id != fk_id.
