I have a table similar to this:

    ID   Value1  Value2
    122  800     1600
    122  800     1800
    133  700     1500
    154  800     1800
    133  700     1500
    188  700     1400
    176  900     1500

From this table I want to delete the duplicate rows (IDs 122 and 133) that have the greater difference between Value2 and Value1.

This means that where ID is 122 I want to keep the first row, because 1800 - 800 > 1600 - 800. Where ID is 133 I can keep either row, because both have the same difference.

    ID   Value1  Value2
    122  800     1600
    122  800     1800   <------ delete this row
    133  700     1500   <------ delete either this row or the other identical row
    154  800     1800
    133  700     1500   <------ delete either this row or the other identical row
    188  700     1400
    176  900     1500

The real table is on a much larger scale than this, so I can't just delete records individually.

Is there a way to write a statement that will delete all duplicates from my table where Value2 - Value1 is greater than Value2 - Value1 for its duplicate?

  • What database are you using? SQL Server or MySQL. Please tag questions appropriately. Commented Feb 28, 2014 at 17:44
  • You can do something like that with a stored procedure... I think it's the best way, IMHO. Commented Feb 28, 2014 at 17:45
  • If your database is MS SQL Server, you could use a CTE. Help us help you by tagging your question with the correct database. Commented Feb 28, 2014 at 17:48

1 Answer

SQL Server has this great feature of updatable CTEs and subqueries. So, you can do this as:

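    with todelete as (
        select t.*,
               -- number the rows of each id, smallest difference first
               row_number() over (partition by id order by value2 - value1) as diff_seqnum
        from your_table t  -- placeholder: substitute your actual table name
    )
    delete from todelete
    where diff_seqnum > 1;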

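That is, enumerate the rows for each id in order of increasing difference between the two values, then keep only the row whose sequence number is 1. Exact duplicates, such as the two rows for ID 133, tie on the difference, so row_number() numbers them arbitrarily and one of them is kept.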
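
Before running a delete like this on a large table, it may help to try it on a small copy first. The following is only a sketch, assuming SQL Server 2008 or later; the temp table `#sample` and the CTE name `numbered` are hypothetical names made up for this illustration, and the rows are the sample data from the question.

```sql
-- Hypothetical temp table holding the sample rows from the question.
create table #sample (ID int, Value1 int, Value2 int);

insert into #sample (ID, Value1, Value2) values
    (122, 800, 1600),
    (122, 800, 1800),
    (133, 700, 1500),
    (154, 800, 1800),
    (133, 700, 1500),
    (188, 700, 1400),
    (176, 900, 1500);

-- Preview: rows numbered above 1 are the ones the delete would remove.
with numbered as (
    select s.*,
           row_number() over (partition by ID order by Value2 - Value1) as diff_seqnum
    from #sample s
)
select ID, Value1, Value2, diff_seqnum
from numbered
where diff_seqnum > 1;

-- The delete itself, using the same numbering.
with todelete as (
    select s.*,
           row_number() over (partition by ID order by Value2 - Value1) as diff_seqnum
    from #sample s
)
delete from todelete
where diff_seqnum > 1;

-- Check what is left, then clean up.
select ID, Value1, Value2 from #sample order by ID;

drop table #sample;
```

With this sample data the preview should list the (122, 800, 1800) row and one of the two identical (133, 700, 1500) rows, and five rows should remain after the delete, one per ID. Running the preview select against the real table first is a cheap way to confirm that the numbering picks the rows you expect.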
