We want to delete duplicate rows from a table in our MySQL database. We have tried a lot of queries, but unfortunately none has succeeded yet. We found this query in several posts, but it didn't work either:

    DELETE t1
    FROM Raw_Validated_backup AS t1
    INNER JOIN Raw_Validated_backup AS t2
        ON  t1.time_start = t2.time_start
        AND t1.time_end = t2.time_end
        AND t1.first_temp_lpn = t2.first_temp_lpn
        AND t1.first_WL = t2.first_WL
        AND t1.first_temp_lpn_validated = t2.first_temp_lpn_validated
        AND t1.second_temp_lpn = t2.second_temp_lpn
        AND t1.second_WL = t2.second_WL
        AND t1.second_temp_lpn_validated = t2.second_temp_lpn_validated
        AND t1.third_temp_lpn = t2.third_temp_lpn
        AND t1.third_WL = t2.third_WL
        AND t1.third_temp_lpn_validated = t2.third_temp_lpn_validated
        AND t1.first_temp_rising = t2.first_temp_rising
        AND t1.first_WR = t2.first_WR
        AND t1.first_temp_rising_validated = t2.first_temp_rising_validated
        AND t1.second_temp_rising = t2.second_temp_rising
        AND t1.second_WR = t2.second_WR
        AND t1.second_temp_rising_validated = t2.second_temp_rising_validated
        AND t1.third_temp_rising = t2.third_temp_rising
        AND t1.third_WR = t2.third_WR
        AND t1.third_temp_rising_validated = t2.third_temp_rising_validated
        AND t1.id < t2.id;

The message we receive after running the query: no errors, 0 rows affected, taking 40.4 s.

  • Incidentally, if operationally possible, it's often far quicker to create a new table that holds just the rows you want to keep, then drop or archive the old table and rename the new one. Commented Jan 6, 2020 at 11:49
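
The approach from this comment could look roughly like the sketch below. It uses a small hypothetical table, dedup_demo, with only three of the data columns. The table names and column types are assumptions, since the question does not show the schema. For the real table, the GROUP BY list would name every column except id.

```sql
-- Hypothetical demo table; names and column types are assumptions.
DROP TABLE IF EXISTS dedup_demo, dedup_demo_new, dedup_demo_old;
CREATE TABLE dedup_demo (
    id         INT PRIMARY KEY,
    time_start DATETIME,
    time_end   DATETIME,
    first_WL   DECIMAL(6,2)
);
INSERT INTO dedup_demo (id, time_start, time_end, first_WL) VALUES
    (1, '2020-01-01 00:00:00', '2020-01-01 01:00:00', 10.50),
    (2, '2020-01-01 00:00:00', '2020-01-01 01:00:00', 10.50),  -- duplicate of 1
    (3, '2020-01-01 01:00:00', '2020-01-01 02:00:00', 11.00),
    (4, '2020-01-01 02:00:00', '2020-01-01 03:00:00', 12.00),
    (5, '2020-01-01 02:00:00', '2020-01-01 03:00:00', 12.00);  -- duplicate of 4

-- 1. Create an empty copy with the same structure.
CREATE TABLE dedup_demo_new LIKE dedup_demo;

-- 2. Copy only one row (the one with the highest id) per group of duplicates.
INSERT INTO dedup_demo_new
SELECT d.*
FROM dedup_demo AS d
INNER JOIN (
    SELECT MAX(id) AS id
    FROM dedup_demo
    GROUP BY time_start, time_end, first_WL
) AS keep_ids ON d.id = keep_ids.id;

-- 3. Swap the tables in one statement; the old table is kept as an archive.
RENAME TABLE dedup_demo TO dedup_demo_old,
             dedup_demo_new TO dedup_demo;

SELECT id FROM dedup_demo ORDER BY id;  -- ids 2, 3, 5
```

Keeping dedup_demo_old around until the result has been checked makes the swap easy to undo.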

1 Answer

This query:

    select max(id) id
    from Raw_Validated_backup
    group by <list of all the columns except id>

returns all the ids for the rows that you want to keep.
So delete the rest:

    delete from Raw_Validated_backup
    where id not in (
        select t.id
        from (
            select max(id) id
            from Raw_Validated_backup
            group by <list of all the columns except id>
        ) t
    )

See the demo.
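
As a self-contained illustration of the query above, here is a minimal sketch on a hypothetical table, dedup_demo, with only three of the data columns (the table name and column types are assumptions). The extra derived table t matters: as far as I know, MySQL refuses a subquery that reads directly from the table being deleted from (error 1093), and wrapping it in a derived table avoids that.

```sql
-- Hypothetical demo table; names and column types are assumptions.
DROP TABLE IF EXISTS dedup_demo;
CREATE TABLE dedup_demo (
    id         INT PRIMARY KEY,
    time_start DATETIME,
    time_end   DATETIME,
    first_WL   DECIMAL(6,2)
);
INSERT INTO dedup_demo (id, time_start, time_end, first_WL) VALUES
    (1, '2020-01-01 00:00:00', '2020-01-01 01:00:00', 10.50),
    (2, '2020-01-01 00:00:00', '2020-01-01 01:00:00', 10.50),  -- duplicate of 1
    (3, '2020-01-01 01:00:00', '2020-01-01 02:00:00', 11.00),
    (4, '2020-01-01 01:00:00', '2020-01-01 02:00:00', NULL),
    (5, '2020-01-01 01:00:00', '2020-01-01 02:00:00', NULL);   -- duplicate of 4

-- GROUP BY puts NULLs in the same group, so rows 4 and 5 count as duplicates.
DELETE FROM dedup_demo
WHERE id NOT IN (
    SELECT t.id
    FROM (
        SELECT MAX(id) AS id
        FROM dedup_demo
        GROUP BY time_start, time_end, first_WL
    ) AS t
);

SELECT id FROM dedup_demo ORDER BY id;  -- ids 2, 3, 5
```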
Another option with a self join:

    delete v1
    from Raw_Validated_backup v1
    inner join Raw_Validated_backup v2
        on  v1.time_start = v2.time_start
        and v1.time_end = v2.time_end
        and .......
        and v1.id < v2.id;

See a simplified demo.
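
If a self join like this reports 0 rows affected even though duplicates exist, one possible cause is NULL values in the compared columns. This is only a guess, since the question does not show the data. With =, NULL never matches NULL, whereas GROUP BY treats NULLs as equal. MySQL's null-safe equality operator <=> closes that gap. Below is a minimal sketch on a hypothetical table, dedup_demo (the table name and column types are assumptions):

```sql
-- Hypothetical demo table; names and column types are assumptions.
DROP TABLE IF EXISTS dedup_demo;
CREATE TABLE dedup_demo (
    id         INT PRIMARY KEY,
    time_start DATETIME,
    time_end   DATETIME,
    first_WL   DECIMAL(6,2)
);
INSERT INTO dedup_demo (id, time_start, time_end, first_WL) VALUES
    (1, '2020-01-01 00:00:00', '2020-01-01 01:00:00', 10.50),
    (2, '2020-01-01 00:00:00', '2020-01-01 01:00:00', 10.50),  -- duplicate of 1
    (3, '2020-01-01 01:00:00', '2020-01-01 02:00:00', 11.00),
    (4, '2020-01-01 01:00:00', '2020-01-01 02:00:00', NULL),
    (5, '2020-01-01 01:00:00', '2020-01-01 02:00:00', NULL);   -- duplicate of 4

-- <=> is true when both sides are NULL, so rows 4 and 5 match each other.
-- With plain = only row 1 would be deleted.
DELETE v1
FROM dedup_demo AS v1
INNER JOIN dedup_demo AS v2
    ON  v1.time_start <=> v2.time_start
    AND v1.time_end   <=> v2.time_end
    AND v1.first_WL   <=> v2.first_WL
    AND v1.id < v2.id;

SELECT id FROM dedup_demo ORDER BY id;  -- ids 2, 3, 5
```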


4 Comments

Use USING (field list) instead of ON; it is shorter and clearer.
Just tested and worked as wished. Thank you very much!
@Akina I agree USING would be handy in this case, but then the last condition, v1.id < v2.id, should be moved to a WHERE clause.
"the last condition v1.id < v2.id should be moved to a WHERE clause": Of course.
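
The variant discussed in these comments might look like the sketch below, shown on a hypothetical table, dedup_demo, with only three of the data columns (the table name and column types are assumptions). USING lists the columns that must be equal on both sides, and the id comparison goes into WHERE because USING accepts only column names.

```sql
-- Hypothetical demo table; names and column types are assumptions.
DROP TABLE IF EXISTS dedup_demo;
CREATE TABLE dedup_demo (
    id         INT PRIMARY KEY,
    time_start DATETIME,
    time_end   DATETIME,
    first_WL   DECIMAL(6,2)
);
INSERT INTO dedup_demo (id, time_start, time_end, first_WL) VALUES
    (1, '2020-01-01 00:00:00', '2020-01-01 01:00:00', 10.50),
    (2, '2020-01-01 00:00:00', '2020-01-01 01:00:00', 10.50),  -- duplicate of 1
    (3, '2020-01-01 01:00:00', '2020-01-01 02:00:00', 11.00),
    (4, '2020-01-01 02:00:00', '2020-01-01 03:00:00', 12.00),
    (5, '2020-01-01 02:00:00', '2020-01-01 03:00:00', 12.00);  -- duplicate of 4

-- Delete every row that has an identical row with a higher id.
DELETE v1
FROM dedup_demo AS v1
INNER JOIN dedup_demo AS v2
    USING (time_start, time_end, first_WL)
WHERE v1.id < v2.id;

SELECT id FROM dedup_demo ORDER BY id;  -- ids 2, 3, 5
```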
