SELECT UNION as DISTINCT

Question

How do I perform a DISTINCT operation on a single column after a UNION is performed?

T1
--
ID  Value
1   1
2   2
3   3

T2
--
ID  Value
1   2
4   4
5   5

I am trying to return the table:

ID  Value
1   1
2   2
3   3
4   4
5   5

I tried:

SELECT DISTINCT ID, Value FROM (SELECT * FROM T1 UNION SELECT * FROM T2) AS T3

This does not seem to work.

You are not giving us all the details. Will the value always have to be the same as field 1, or the min value, the max value, a random value...? In any case, DISTINCT applies to all the fields, not just one field.
– Itay Moav -Malimovka, Commented Jan 9, 2012 at 0:50

Bohemian · Accepted Answer · 2012-01-09 01:13:08Z

50

Why are you using a sub-query? This will work:

SELECT * FROM T1 UNION SELECT * FROM T2

UNION removes duplicates. (UNION ALL does not)
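To see the difference between the two operators, here is a minimal sketch using Python's built-in sqlite3 module with the sample rows from the question. The in-memory database and the run helper are assumptions made for illustration, and other engines may order rows differently without an ORDER BY. Note that UNION only drops rows that match in every column, so for the question's data both (1, 1) and (1, 2) survive.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, Value INTEGER);
    CREATE TABLE T2 (ID INTEGER, Value INTEGER);
    INSERT INTO T1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO T2 VALUES (1, 2), (4, 4), (5, 5);
""")

def run(sql):
    """Hypothetical helper: execute a query and return all rows."""
    return conn.execute(sql).fetchall()

# UNION drops rows that are identical in every column.
deduplicated = run("SELECT * FROM T1 UNION SELECT * FROM T1 ORDER BY 1, 2")
# UNION ALL keeps every row, duplicates included.
kept_all = run("SELECT * FROM T1 UNION ALL SELECT * FROM T1 ORDER BY 1, 2")
# Rows (1, 1) and (1, 2) differ in Value, so UNION keeps both of them.
combined = run("SELECT * FROM T1 UNION SELECT * FROM T2 ORDER BY 1, 2")

print(len(deduplicated), len(kept_all))  # 3 6
print(combined)
```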

answered Jan 9, 2012 at 1:13

Bohemian♦

3 Comments

alf Over a year ago

Point was, OP wanted something called "one-field DISTINCT", and there's no such concept.

user3750325 Over a year ago

If you UNION records [1, 1] and [1, 2], you will get both in the result set. OP wanted no repeats from the first column. Obviously this answer was helpful to a lot of people, but I don't think it answers what was asked.

Bohemian Over a year ago

@user7733611 Actually, you're right now that I examine OP's example data. This query is the refactored equivalent of OP's query.

alf · Accepted Answer · 2016-06-08 06:32:04Z

As far as I can tell, there's no "one-column distinct": DISTINCT is always applied to a whole record (unless used within an aggregate like count(distinct name)). The reason is that SQL cannot guess which values of Value to keep for you and which to drop. That's something you need to define yourself.
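A quick way to check this is to run the question's query against the sample data. The sketch below assumes Python's built-in sqlite3 module and an in-memory database, neither of which comes from the original post. DISTINCT compares whole (ID, Value) rows, so ID 1 still appears twice, while count(DISTINCT ID) looks at the single column only.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, Value INTEGER);
    CREATE TABLE T2 (ID INTEGER, Value INTEGER);
    INSERT INTO T1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO T2 VALUES (1, 2), (4, 4), (5, 5);
""")

# The query from the question: DISTINCT is evaluated on the (ID, Value) pair.
rows = conn.execute(
    "SELECT DISTINCT ID, Value "
    "FROM (SELECT * FROM T1 UNION SELECT * FROM T2) AS T3 "
    "ORDER BY ID, Value"
).fetchall()

# Inside an aggregate, DISTINCT does apply to one column.
distinct_ids = conn.execute(
    "SELECT count(DISTINCT ID) "
    "FROM (SELECT * FROM T1 UNION ALL SELECT * FROM T2) AS T3"
).fetchone()[0]

print(rows)          # ID 1 shows up with Value 1 and with Value 2
print(distinct_ids)  # 5
```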

Try using GROUP BY to ensure ID is not repeated, and any aggregate (here MIN, as in your example it was the minimum that survived) to select a particular value of Value:

SELECT ID, min(Value) FROM (SELECT * FROM T1 UNION ALL SELECT * FROM T2) AS T3 GROUP BY ID

This should be exactly what you need. It's not the same query, and there's no DISTINCT, but it returns what's shown in your example.
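For anyone who wants to try it, here is a small sketch of the GROUP BY query run through Python's built-in sqlite3 module (the in-memory setup and the AS Value alias are my additions for illustration). Swapping MIN for MAX shows that the aggregate is what decides which Value survives for ID 1.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, Value INTEGER);
    CREATE TABLE T2 (ID INTEGER, Value INTEGER);
    INSERT INTO T1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO T2 VALUES (1, 2), (4, 4), (5, 5);
""")

# One row per ID; MIN picks the smallest Value among the rows sharing that ID.
min_rows = conn.execute(
    "SELECT ID, min(Value) AS Value "
    "FROM (SELECT * FROM T1 UNION ALL SELECT * FROM T2) AS T3 "
    "GROUP BY ID ORDER BY ID"
).fetchall()

# Same shape, but MAX keeps the largest Value instead.
max_rows = conn.execute(
    "SELECT ID, max(Value) AS Value "
    "FROM (SELECT * FROM T1 UNION ALL SELECT * FROM T2) AS T3 "
    "GROUP BY ID ORDER BY ID"
).fetchall()

print(min_rows)  # [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
print(max_rows)  # [(1, 2), (2, 2), (3, 3), (4, 4), (5, 5)]
```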

I'd suggest using UNION ALL in the subquery as there is no point in doing a DISTINCT twice.
@MitchWheat I'm sure it's not—but it's a query which would return what's shown in the example.
@MitchWheat: It isn't, but it'll do what the OP specifically said he wanted in his "I'm trying to return the table" table.

jherran · Accepted Answer · 2016-04-19 08:48:03Z

6

I think that's what you meant:

SELECT * FROM T1 UNION SELECT * FROM T2 WHERE (ID NOT IN (SELECT ID FROM T1));
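As a rough check, the sketch below runs this query with Python's built-in sqlite3 module on the sample data (the in-memory database is an assumption for illustration). The WHERE clause belongs to the second SELECT only, so every T1 row is kept and T2 only contributes IDs that T1 does not have. Swapping the table names gives priority to T2 instead.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, Value INTEGER);
    CREATE TABLE T2 (ID INTEGER, Value INTEGER);
    INSERT INTO T1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO T2 VALUES (1, 2), (4, 4), (5, 5);
""")

# T1 wins: T2 rows are added only when their ID is absent from T1.
t1_first = conn.execute(
    "SELECT * FROM T1 "
    "UNION "
    "SELECT * FROM T2 WHERE ID NOT IN (SELECT ID FROM T1) "
    "ORDER BY 1, 2"
).fetchall()

# T2 wins: the same pattern with the two tables swapped.
t2_first = conn.execute(
    "SELECT * FROM T2 "
    "UNION "
    "SELECT * FROM T1 WHERE ID NOT IN (SELECT ID FROM T2) "
    "ORDER BY 1, 2"
).fetchall()

print(t1_first)  # [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
print(t2_first)  # [(1, 2), (2, 2), (3, 3), (4, 4), (5, 5)]
```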

edited Apr 19, 2016 at 8:48

jherran

answered Apr 19, 2016 at 7:53

KT8

1 Comment

user3750325 Over a year ago

I really think this should be the accepted answer to the question. It lets you prioritize which table gets values chosen from instead of doing a MIN() with a GROUP BY. Depends on how OP wanted to choose the Value.

liselorev · Accepted Answer · 2014-07-31 10:31:40Z

Even though this thread is quite old, this might be a working solution to the OP's question, although it might be considered dirty.

We select all tuples from the first table, then union them with the tuples from the second table, limited to those whose value in the specific field has no match in the first table.

SELECT * FROM T1 UNION SELECT * FROM T2 WHERE ( Value NOT IN (SELECT Value FROM T1) );
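A hedged sketch of how this behaves, using Python's built-in sqlite3 module and an in-memory database (both assumptions for illustration). On the sample data the Value filter returns the expected table. The extra T2 row (2, 9) is hypothetical and not part of the question; it is added only to show that filtering on Value and filtering on ID can give different results once a repeated ID carries a Value that T1 does not contain.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (ID INTEGER, Value INTEGER);
    CREATE TABLE T2 (ID INTEGER, Value INTEGER);
    INSERT INTO T1 VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO T2 VALUES (1, 2), (4, 4), (5, 5);
""")

BY_VALUE = (
    "SELECT * FROM T1 UNION "
    "SELECT * FROM T2 WHERE Value NOT IN (SELECT Value FROM T1) "
    "ORDER BY 1, 2"
)
BY_ID = (
    "SELECT * FROM T1 UNION "
    "SELECT * FROM T2 WHERE ID NOT IN (SELECT ID FROM T1) "
    "ORDER BY 1, 2"
)

# On the question's data, the Value filter yields one row per ID.
sample_result = conn.execute(BY_VALUE).fetchall()

# Hypothetical extra row: ID 2 already exists in T1, but Value 9 does not.
conn.execute("INSERT INTO T2 VALUES (2, 9)")
by_value = conn.execute(BY_VALUE).fetchall()  # keeps (2, 9), so ID 2 repeats
by_id = conn.execute(BY_ID).fetchall()        # drops (2, 9)

print(sample_result)
print(by_value)
print(by_id)
```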
