-2

How would I delete yesterday's duplicate data from a MySQL table and keep latest record?

For example, with the following data:

SELECT * FROM data; | pid | serviceid | id | created | ----| ----------|----------------|--------- | 10 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34 | 11 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34 | 12 | KSPSLS | QLSPDF0S0SDFKK | 2022-06-15 11:44:21 | 13 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42 | 14 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42 

I used

DELETE n1 FROM data n1, data n2 WHERE n1.pid < n2.pid AND n1.id = n2.id 

How could I DELETE to only remove duplicates and keep latest pid for yesterday's data?

Expected output would be

SELECT * FROM data; | pid | serviceid | id | created | ----| ----------|----------------|--------- | 10 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34 | 11 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34 | 12 | KSPSLS | QLSPDF0S0SDFKK | 2022-06-15 11:44:21 | 14 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42 

remove | 13 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42

5
  • Does this answer your question? Find and remove duplicate rows by two columns Commented Jun 16, 2022 at 8:43
  • there are many good answers here and other documentation if you just google with the tilte of your question Commented Jun 16, 2022 at 8:44
  • 1
    Why row with pid 10 is not removed as a duplicate record? Commented Jun 16, 2022 at 8:56
  • @NishantGupta i want to delete only yesterday's data, i.e., 15th June Data Commented Jun 17, 2022 at 3:44
  • @SRash have you tried the answer posted by Himanshu? Commented Jun 17, 2022 at 4:44

3 Answers 3

1
DELETE * FROM data WHERE id IN (SELECT id FROM data WHERE name="ABNCKN" LIMIT((SELECT COUNT(name) FROM data WHERE name="ABNCKN") - 1))); 

SELECT id FROM data WHERE name="ABNCKN" >> this query will get all ids with specific name, and limit will limit the the number of rows returned. The number of rows will be dynamic based on the duplicated rows except one row. and you can add ORDER BY to the query to arrange them.

and finally delete all duplicated rows except one row.

Sign up to request clarification or add additional context in comments.

Comments

0

Didn't understand very well why you keep some duplicated data in the expected output but what you can try is get the last Id with a JOIN and filter it from the WHERE so is not deleted:

DELETE d1 FROM data d1, data d2 JOIN (SELECT MAX(pid) pid FROM data) d3 WHERE d1.pid != d3.pid AND d1.pid < d2.pid AND d1.id = d2.id; 

1 Comment

im running some batch jobs to upload data into mysql for every hour, sometimes it has duplicate entries, my requirement is to delete only last date duplicate entries.
0

You can achieve this using self join and add condition to only delete yesterday's data:

DELETE t1 FROM data t1 INNER JOIN data t2 ON t1.pid < t2.pid AND t1.serviceid = t2.serviceid AND t1.id = t2.id AND t1.created = t2.created WHERE CAST(t1.created AS DATE) = DATE_ADD(CURRENT_DATE(), INTERVAL -1 DAY); SELECT * FROM data; 

See this db<>fiddle

For MySQL 8+ you can also achieve this using ROW_NUMBER():

WITH CTE AS ( SELECT *, ROW_NUMBER() OVER(PARTITION BY serviceid,id,created ORDER BY pid DESC) AS RN FROM data WHERE CAST(created AS DATE) = DATE_ADD(CURRENT_DATE(), INTERVAL -1 DAY) ) DELETE data FROM data JOIN CTE ON data.pid=CTE.pid WHERE CTE.RN>1; SELECT * FROM data; 

Check this db<>fiddle

Output after delete:

pid serviceid id created
10 ABNCKN Q0CHZ2GI1VKOPL 2022-06-18 07:27:35
11 ABNCKN Q0CHZ2GI1VKOPL 2022-06-18 07:27:35
12 KSPSLS QLSPDF0S0SDFKK 2022-06-19 07:27:35
14 AKNKSL QLSPDF0S0SDFKK 2022-06-19 07:27:35

4 Comments

``` DELETE t1 FROM data t1 INNER JOIN data t2 WHERE t1.pid < t2.pid AND t1.serviceid = t2.serviceid AND t1.id = t2.id AND t1.created = t2.created; SELECT * FROM data; ``` this one i already tried, this delete all duplicates entries ref to newly one. i required only to delete yesterday's duplicate data, not all duplicate record.
By 'yesterday's duplicate date' do you mean to perform these actions (to find and remove duplicate data) only on yesterday's created date? If yes, then simply add that condition in WHERE clause. Like this: WHERE CAST(created AS DATE) = DATE_ADD(CURRENT_DATE(), INTERVAL -1 DAY). See this fiddle.
@SRash: Check the updated answer now.
@SRash: Did you see the updated answer?

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.