delete duplicate records, keep one

Question

I have a temp table populated by a COPY from a CSV file, and the result includes some duplicate ids. I need to delete the duplicates. I have tried the following:

delete from my_table where id in (select id from (select count(*) as count, id from my_table group by id) as counts where count>1);

However, this deletes both of the duplicate records, and I must keep one.

How can I delete only the second record with a duplicated id?

Thanks.

Is this temp table just a single column? DISTINCT will keep them of course.... – Ray Toal, Commented Aug 23, 2011 at 14:21

Xophmeister · Accepted Answer · 2011-08-23 15:42:39Z

Your query deletes every row whose ID occurs more than once, so it removes all copies of each duplicate, including the one you want to keep. What you need to do is isolate one record from each set of duplicates and preserve that:

delete from my_table
where id in (
    select id from my_table
    where some_field in (
        select some_field from my_table
        group by some_field having count(id) > 1
    )
)
and id not in (
    select min(id) from my_table
    where some_field in (
        select some_field from my_table
        group by some_field having count(id) > 1
    )
    group by some_field
);

EDIT Fixed :P

@Irit: Point taken! Fixed :) My example is for multicolumn data; if there's only one column (ID), then the OP should put a PK constraint on the ID column and run the import again!
Dammit: Made another mistake... OK: It's fixed now and it works :)
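
To see what the statement above does, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows are made up, SQLite stands in for whatever database the question actually uses, and the sketch assumes `id` is unique while `some_field` carries the duplicates, as in this answer.

```python
import sqlite3

# Hypothetical sample data: id is unique, some_field holds the duplicates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, some_field TEXT)")
conn.executemany(
    "INSERT INTO my_table (id, some_field) VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "c"), (5, "c"), (6, "c")],
)

# The DELETE from the answer: target rows in duplicated groups,
# but spare the row with the lowest id in each of those groups.
conn.execute(
    """
    DELETE FROM my_table
    WHERE id IN (
        SELECT id FROM my_table
        WHERE some_field IN (
            SELECT some_field FROM my_table
            GROUP BY some_field HAVING COUNT(id) > 1
        )
    )
    AND id NOT IN (
        SELECT MIN(id) FROM my_table
        WHERE some_field IN (
            SELECT some_field FROM my_table
            GROUP BY some_field HAVING COUNT(id) > 1
        )
        GROUP BY some_field
    )
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT id, some_field FROM my_table ORDER BY id"
).fetchall()
print(remaining)  # [(1, 'a'), (3, 'b'), (4, 'c')]
```

Rows 2, 5 and 6 are removed; the lowest id in each group survives, and the non-duplicated row 3 is untouched. Note that this relies on `id` differing between the copies. If the ids themselves are duplicated, as in the question, `min(id)` cannot tell the copies apart.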

J0HN · Accepted Answer · 2011-08-23 14:57:10Z

Assuming you don't have foreign key relations...

CREATE TABLE "temp"(*column definitions*);
insert into "temp" (*column names*)
    select *column names*
    from ( select *, row_number() over(PARTITION BY id) as rn from "yourtable" ) tm
    where rn=1;
drop table "yourtable";
alter table "temp" rename to "yourtable";
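
As a rough illustration of the rebuild approach, the sketch below runs the same sequence against an in-memory SQLite database from Python. The two-column layout (`id`, `name`) and the sample rows are assumptions for the demo, and SQLite is only a stand-in here; it needs version 3.25 or later for `row_number()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table where the id itself is duplicated, as in the question.
conn.execute("CREATE TABLE yourtable (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO yourtable (id, name) VALUES (?, ?)",
    [(1, "alice"), (1, "alice"), (2, "bob"), (3, "carol"), (3, "carol")],
)

# Number the rows within each id, copy only the first of each group
# into a new table, then swap the new table in for the old one.
conn.executescript(
    """
    CREATE TABLE "temp" (id INTEGER, name TEXT);
    INSERT INTO "temp" (id, name)
        SELECT id, name
        FROM (
            SELECT *, row_number() OVER (PARTITION BY id) AS rn
            FROM yourtable
        ) AS tm
        WHERE rn = 1;
    DROP TABLE yourtable;
    ALTER TABLE "temp" RENAME TO yourtable;
    """
)

deduped = conn.execute("SELECT id, name FROM yourtable ORDER BY id").fetchall()
print(deduped)  # [(1, 'alice'), (2, 'bob'), (3, 'carol')]
```

Because the window has no ORDER BY, which copy receives `rn = 1` is arbitrary. That is harmless when the duplicated rows are identical, but an ORDER BY inside `OVER (...)` would be needed to choose a specific row when they differ.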
