1

I have this table in my database:

tblAgencies
----------------------
AgencyID (PK)
VendorID
RegionID
Name
Zip

Long story short, I accidentally copied my entire table into itself - so every row in my table has a duplicate.

But since my AgencyID field is the identity and increments automatically, it is unique in every row, so I need to find duplicates based on all the other fields.

Does anyone know how I can do this?

2
  • If your PK is auto-incremented, can't you just run a DELETE where the AgencyID > [the last good record]? Commented Nov 14, 2013 at 22:22
  • Use ROW_NUMBER over whatever columns make a duplicate, and delete where it = 2? Commented Nov 14, 2013 at 22:22

8 Answers

3

This keeps the row with the lowest (oldest) AgencyID in each group of identical rows and deletes the other copies.

;WITH x AS
(
  SELECT *, rn = ROW_NUMBER() OVER
    (PARTITION BY VendorID, RegionID, Name, Zip ORDER BY AgencyID)
  FROM dbo.tblAgencies
)
DELETE x WHERE rn > 1;

Be careful, though; this may not work if other tables reference AgencyID and they've obtained any of your newer, erroneous values.
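
Before running a delete like this, it can help to list the rows it would remove. The following is a minimal sketch of that check, not the answer's own code: it uses Python's built-in sqlite3 module instead of SQL Server, the sample rows are hypothetical, and it assumes an SQLite build with window functions (3.25 or later). SQLite cannot delete through a CTE, so the sketch deletes by the AgencyID values the ranking selects.

```python
import sqlite3

# Hypothetical sample data: three agencies, each copied once by accident.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblAgencies ("
    " AgencyID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " VendorID INTEGER, RegionID INTEGER, Name TEXT, Zip TEXT)"
)
rows = [(1, 10, "North", "10001"), (2, 20, "South", "20002"), (3, 30, "West", "30003")]
insert = "INSERT INTO tblAgencies (VendorID, RegionID, Name, Zip) VALUES (?, ?, ?, ?)"
conn.executemany(insert, rows)
conn.executemany(insert, rows)  # the accidental copy

# Rank rows inside each group of identical non-key columns; rn > 1 marks the copies.
ranked = """
    SELECT AgencyID,
           ROW_NUMBER() OVER (
               PARTITION BY VendorID, RegionID, Name, Zip
               ORDER BY AgencyID
           ) AS rn
    FROM tblAgencies
"""

# Preview first: these are the AgencyID values the delete would remove.
to_delete = [r[0] for r in conn.execute(
    f"SELECT AgencyID FROM ({ranked}) WHERE rn > 1 ORDER BY AgencyID")]

# Then delete by id, inside a transaction that commits on success.
with conn:
    conn.execute(
        f"DELETE FROM tblAgencies WHERE AgencyID IN "
        f"(SELECT AgencyID FROM ({ranked}) WHERE rn > 1)")

remaining = [r[0] for r in conn.execute(
    "SELECT AgencyID FROM tblAgencies ORDER BY AgencyID")]
print(to_delete, remaining)
```

On SQL Server the same preview is just the CTE with SELECT * FROM x WHERE rn > 1 in place of the DELETE.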

1

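The simplest solution: SELECT DISTINCT the non-identity columns into a temp table, then reload the original table from it. (Leave AgencyID out of the SELECT DISTINCT, since it makes every row distinct.)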

1 Comment

This will only work if there are no foreign keys referring to the original table.
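
A rough sketch of the temp-table reload, to show why the foreign-key caveat matters. It is an illustration only: it runs on Python's built-in sqlite3 rather than SQL Server, and the sample rows and the tmpAgencies name are hypothetical. The reloaded rows get fresh AgencyID values, so anything that pointed at the old ids would no longer match.

```python
import sqlite3

# Hypothetical sample data: three agencies, each copied once by accident.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblAgencies ("
    " AgencyID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " VendorID INTEGER, RegionID INTEGER, Name TEXT, Zip TEXT)"
)
rows = [(1, 10, "North", "10001"), (2, 20, "South", "20002"), (3, 30, "West", "30003")]
insert = "INSERT INTO tblAgencies (VendorID, RegionID, Name, Zip) VALUES (?, ?, ?, ?)"
conn.executemany(insert, rows)
conn.executemany(insert, rows)  # the accidental copy
conn.commit()

# AgencyID is deliberately excluded: it differs on every row, so including it
# would make SELECT DISTINCT keep all six rows.
cols = "VendorID, RegionID, Name, Zip"

with conn:  # commits on success, rolls back if any statement raises
    conn.execute(
        f"CREATE TEMP TABLE tmpAgencies AS SELECT DISTINCT {cols} FROM tblAgencies")
    conn.execute("DELETE FROM tblAgencies")
    conn.execute(f"INSERT INTO tblAgencies ({cols}) SELECT {cols} FROM tmpAgencies")
conn.execute("DROP TABLE tmpAgencies")

reloaded = conn.execute(
    f"SELECT AgencyID, {cols} FROM tblAgencies ORDER BY AgencyID").fetchall()
new_ids = [r[0] for r in reloaded]  # newly generated ids, not the original ones
print(reloaded)
```

If the original ids have to survive, one of the delete-in-place answers is the safer route.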
0

This query returns the duplicates, meaning every row for which a matching row with a lower AgencyID exists. It assumes the combination of all the other columns is supposed to be unique:

select *
from mytable t1
where exists (
  select *
  from mytable t2
  where t1.VendorID = t2.VendorID
    and t1.RegionID = t2.RegionID
    and t1.Name = t2.Name
    and t1.Zip = t2.Zip
    and t1.AgencyID > t2.AgencyID
)

0

This should give you all the rows that have duplicate values except for the minimum agencyid row.

select *
from tblAgencies
where AgencyID not in (
  select min(AgencyID)
  from tblAgencies
  group by VendorID, RegionID, Name, Zip
)

edit: adding SQLFiddle
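
Once the select returns the rows you expect, the same condition can drive the delete. This is only a sketch under assumptions: it uses Python's built-in sqlite3 rather than SQL Server, and the sample rows are hypothetical (one agency is copied twice to show that keeping MIN(AgencyID) handles more than one extra copy).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblAgencies ("
    " AgencyID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " VendorID INTEGER, RegionID INTEGER, Name TEXT, Zip TEXT)"
)
rows = [(1, 10, "North", "10001"), (2, 20, "South", "20002"), (3, 30, "West", "30003")]
insert = "INSERT INTO tblAgencies (VendorID, RegionID, Name, Zip) VALUES (?, ?, ?, ?)"
conn.executemany(insert, rows)
conn.executemany(insert, rows)  # the accidental copy
conn.execute(insert, rows[0])   # hypothetical: one row ended up copied twice
conn.commit()

# One AgencyID to keep per group of identical non-key columns.
keep = "SELECT MIN(AgencyID) FROM tblAgencies GROUP BY VendorID, RegionID, Name, Zip"

# Step 1: look at what would go.
dupes = [r[0] for r in conn.execute(
    f"SELECT AgencyID FROM tblAgencies WHERE AgencyID NOT IN ({keep}) ORDER BY AgencyID")]

# Step 2: delete with the identical condition, inside a transaction.
with conn:
    deleted = conn.execute(
        f"DELETE FROM tblAgencies WHERE AgencyID NOT IN ({keep})").rowcount

remaining = [r[0] for r in conn.execute(
    "SELECT AgencyID FROM tblAgencies ORDER BY AgencyID")]
print(dupes, deleted, remaining)
```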

0
;with CTE AS
(
  SELECT ID_Column,
         rn = ROW_NUMBER() OVER (PARTITION BY Column1, Column2, Column3... ORDER BY ID_Column ASC)
  FROM T
)
DELETE FROM CTE WHERE rn >= 2

0
-- Deletes only the highest AgencyID in each duplicated group,
-- so it assumes every row was copied exactly once.
;with CTE AS
(
  SELECT MAX(AgencyID) AgentID, VendorID, RegionID, Name, Zip
  FROM tblAgencies
  GROUP BY VendorID, RegionID, Name, Zip
  HAVING COUNT(*) > 1
)
DELETE FROM tblAgencies
WHERE EXISTS (SELECT 1 FROM CTE WHERE AgentID = tblAgencies.AgencyID)

0

Lots of answers here will give you what you want, but there's no need to use a CTE or do any grouping. The simplest way is just:

delete t1
from tblAgencies t1
join tblAgencies t2
  on t1.VendorId = t2.VendorId
 and t1.RegionId = t2.RegionId
 and t1.Name = t2.Name
 and t1.Zip = t2.Zip
 and t1.AgencyId > t2.AgencyId
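
One thing worth checking before relying on an equality join: under standard SQL comparison rules, NULL = NULL is not true, so copies of a row with a NULL in any compared column will not match each other, while GROUP BY treats NULLs as one group. The sketch below illustrates the difference with Python's built-in sqlite3 rather than SQL Server. The nullable Zip and the sample rows are assumptions, since the question does not say whether any column allows NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblAgencies ("
    " AgencyID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " VendorID INTEGER, RegionID INTEGER, Name TEXT, Zip TEXT)"
)
# Hypothetical data: the second agency has no Zip (NULL).
rows = [(1, 10, "North", "10001"), (2, 20, "South", None)]
insert = "INSERT INTO tblAgencies (VendorID, RegionID, Name, Zip) VALUES (?, ?, ?, ?)"
conn.executemany(insert, rows)
conn.executemany(insert, rows)  # the accidental copy
conn.commit()

# Duplicates found by the equality self-join.
join_hits = [r[0] for r in conn.execute("""
    SELECT t1.AgencyID
    FROM tblAgencies t1
    JOIN tblAgencies t2
      ON t1.VendorID = t2.VendorID
     AND t1.RegionID = t2.RegionID
     AND t1.Name = t2.Name
     AND t1.Zip = t2.Zip
     AND t1.AgencyID > t2.AgencyID
    ORDER BY t1.AgencyID
""")]

# Duplicates found by grouping on the same columns.
group_hits = [r[0] for r in conn.execute("""
    SELECT AgencyID
    FROM tblAgencies
    WHERE AgencyID NOT IN (
        SELECT MIN(AgencyID)
        FROM tblAgencies
        GROUP BY VendorID, RegionID, Name, Zip
    )
    ORDER BY AgencyID
""")]

# The join misses the copy whose Zip is NULL; the grouping finds both copies.
print(join_hits, group_hits)
```

If none of the compared columns can be NULL, the join and the grouped versions find the same rows.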

0

Maybe this will help: How to delete duplicates in the presence of a primary key?
