I have a temp table loaded by a copy from a CSV file, and it contains some duplicate ids. I need to delete the duplicates. I have tried the following:

delete from my_table where id in (select id from (select count(*) as count, id from my_table group by id) as counts where count>1); 

However, this deletes both of the duplicate records, and I must keep one.

How can I delete only the second record with a duplicated id?

Thanks.

  • Is this temp table just a single column? DISTINCT will keep them of course.... Commented Aug 23, 2011 at 14:21

2 Answers

Your query deletes every row whose ID has a count greater than 1, so it removes all copies of each duplicate, not just the extras. What you need to do is isolate one record from each group of duplicates and preserve it:

delete from my_table
where id in (
        select id from my_table
        where some_field in (
            select some_field from my_table
            group by some_field having count(id) > 1))
  and id not in (
        select min(id) from my_table
        where some_field in (
            select some_field from my_table
            group by some_field having count(id) > 1)
        group by some_field);

EDIT Fixed :P
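As a rough check of the query above, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows and the use of SQLite are assumptions made for illustration only (the thread does not say which database is in use), and `some_field` stands in for whatever column carries the duplicated values, as in the answer.

```python
import sqlite3

# In-memory SQLite database; the table layout and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, some_field TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "c"), (5, "c"), (6, "c")],
)

# The answer's query: target rows whose some_field is duplicated,
# but spare the row with the lowest id in each duplicated group.
conn.execute(
    """
    DELETE FROM my_table
    WHERE id IN (
            SELECT id FROM my_table
            WHERE some_field IN (
                SELECT some_field FROM my_table
                GROUP BY some_field HAVING COUNT(id) > 1))
      AND id NOT IN (
            SELECT MIN(id) FROM my_table
            WHERE some_field IN (
                SELECT some_field FROM my_table
                GROUP BY some_field HAVING COUNT(id) > 1)
            GROUP BY some_field)
    """
)

remaining = conn.execute(
    "SELECT id, some_field FROM my_table ORDER BY id"
).fetchall()
print(remaining)
```

Note that this approach relies on `id` being unique per row; if the duplicated value is the id itself, there is nothing for `min(id)` to tell the copies apart.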

2 Comments

@Irit: Point taken! Fixed :) My example is for multicolumn data; if there's only one column (ID), then the OP should put a PK constraint on the ID column and run the import again!
Dammit: Made another mistake... OK: It's fixed now and it works :)

Assuming you don't have foreign key relations...

CREATE TABLE "temp" (*column definitions*);

insert into "temp" (*column names*)
select *column names*
from (
    select *, row_number() over (PARTITION BY id) as rn
    from "yourtable"
) tm
where rn = 1;

drop table "yourtable";
alter table "temp" rename to "yourtable";
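The rebuild steps above can be sketched end to end as follows. This is only an illustration under assumptions: it uses an in-memory SQLite database (window functions need SQLite 3.25 or later), a made-up two-column layout, and the hypothetical name `temp_copy` for the scratch table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Illustrative source table in which the id itself is duplicated.
    CREATE TABLE yourtable (id INTEGER, name TEXT);
    INSERT INTO yourtable VALUES
        (1, 'a'), (1, 'a2'), (2, 'b'), (3, 'c'), (3, 'c2');

    -- Scratch table with the same columns (temp_copy is a hypothetical name).
    CREATE TABLE temp_copy (id INTEGER, name TEXT);

    -- Number the rows within each id and copy only the first of each group.
    INSERT INTO temp_copy (id, name)
    SELECT id, name
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY id) AS rn
        FROM yourtable
    ) AS tm
    WHERE rn = 1;

    -- Swap the deduplicated copy into place.
    DROP TABLE yourtable;
    ALTER TABLE temp_copy RENAME TO yourtable;
    """
)

ids = [row[0] for row in conn.execute("SELECT id FROM yourtable ORDER BY id")]
print(ids)
```

Because the window has no ORDER BY, which row of each duplicate group survives is not specified; adding an ORDER BY inside the OVER clause makes the choice deterministic.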
