SQL Delete specific rows based on date and criteria [duplicate]

Question

I've got a table that has duplicate data that needs to be cleaned up. Consider the following example:

    CREATE TABLE #StackOverFlow
    (
        [ctrc_num]  int,
        [Ctrc_name] varchar(6),
        [docu]      bit,
        [adj]       bit,
        new         bit,
        [some_date] datetime
    );

    INSERT INTO #StackOverFlow ([ctrc_num], [Ctrc_name], [docu], [adj], [new], [some_date])
    VALUES
        (12345, 'John R', null, null, 1, '2023-12-11 09:05:13.003'),
        (12345, 'John R', 1,    null, 0, '2023-12-11 09:05:12.987'),
        (12345, 'John R', null, null, 1, '2023-12-11 09:05:12.947'),
        (56789, 'Sam S',  null, null, 1, '2023-12-11 09:05:13.003'),
        (56789, 'Sam S',  null, null, 1, '2023-12-11 09:05:12.987'),
        (56789, 'Sam S',  1,    null, 0, '2023-12-11 09:05:12.947'),
        (78945, 'Pat P',  null, null, 1, '2023-12-11 09:05:13.003'),
        (78945, 'Pat P',  null, null, 1, '2023-12-11 09:05:12.987'),
        (78945, 'Pat P',  null, null, 1, '2023-12-11 09:05:12.947');

This gives me:

    [ctrc_num]  [Ctrc_name]  [docu]  [adj]  [new]  [some_date]
    -----------------------------------------------------------------------
    12345       John R       NULL    NULL   1      2023-12-11 09:05:13.003
    12345       John R       1       NULL   0      2023-12-11 09:05:12.987
    12345       John R       NULL    NULL   1      2023-12-11 09:05:12.947
    56789       Sam S        NULL    NULL   1      2023-12-11 09:05:13.003
    56789       Sam S        NULL    NULL   1      2023-12-11 09:05:12.987
    56789       Sam S        1       NULL   0      2023-12-11 09:05:12.947
    78945       Pat P        NULL    NULL   1      2023-12-11 09:05:13.003
    78945       Pat P        NULL    NULL   1      2023-12-11 09:05:12.987
    78945       Pat P        NULL    NULL   1      2023-12-11 09:05:12.947

What I need to do is delete the duplicates from the table. If new is 0, delete the records where new is 1. If all records have new = 1, keep the newest record and delete the older ones.

The result should look like this:

    [ctrc_num]  [Ctrc_name]  [docu]  [adj]  [new]  [some_date]
    -----------------------------------------------------------------------
    12345       John R       1       NULL   0      2023-12-11 09:05:12.987
    56789       Sam S        1       NULL   0      2023-12-11 09:05:12.947
    78945       Pat P        NULL    NULL   1      2023-12-11 09:05:13.003

I've tried ROW_NUMBER:

    ;WITH RankedByDate AS
    (
        SELECT ctrc_num, Ctrc_name, docu, adj, new, some_date,
               ROW_NUMBER() OVER (PARTITION BY Ctrc_num, Ctrc_name, [docu], [adj], [new]
                                  ORDER BY some_date DESC) AS rNum
        FROM #StackOverFlow
    )
    SELECT * FROM RankedByDate

This separates the ones with new = 0, but I still have the ones with new = 1 that are ordered.

Grouping gives me the records that are duplicated, but no way to delete the ones that need to be deleted:

    SELECT [ctrc_num]
          ,[Ctrc_name]
          ,[docu]
          ,[adj]
          ,[new]
    FROM #StackOverFlow
    GROUP BY [ctrc_num]
            ,[Ctrc_name]
            ,[docu]
            ,[adj]
            ,[new]
    HAVING COUNT(*) > 1

What constitutes a duplicate? Same [ctrc_num] and [Ctrc_name]?
– PM 77-1, Commented Dec 15, 2023 at 16:09
There are no duplicate rows since no two rows are equal. Therefore you must specify what you mean by duplicate. Also, which values do you want to keep for the columns that differ between the duplicates?
– Olivier Jacot-Descombes, Commented Dec 15, 2023 at 16:18
Unless there can be more than one new = 0, your logic can be summarized as: remove all rows, partitioned by ctrc_num and ordered by new, some_date desc, where row_number > 1. It shouldn't be very hard to come up with SQL corresponding to the above.
– siggemannen, Commented Dec 15, 2023 at 16:21

Thom A · Accepted Answer · 2023-12-15 17:01:25Z

Break the problem down into its parts.

"If new is 0, delete the records where new is 1"

    delete from #StackOverFlow
    where [new] = 1
      and [ctrc_num] in (select [ctrc_num] from #StackOverFlow where [new] = 0);

"If all records have new = 1 keep the newest record and delete the older ones"

Use a CTE to add a row number, ordered by date and partitioned by [ctrc_num], so that the "first" record in each group is the one you want to keep. If there is only one row in a group, that's the one you want to keep anyway. Then delete everything else:
```
;with cte as
(
    select [ctrc_num]
          -- newest row in each [ctrc_num] group gets rw = 1
          ,ROW_NUMBER() OVER (PARTITION BY [ctrc_num] ORDER BY [ctrc_num], [some_date] DESC) as rw
    from #StackOverFlow
)
DELETE FROM cte where rw <> 1;
```
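
One way to sanity-check the result, as a rough sketch rather than part of the original answer: once both deletes have run, no [ctrc_num] should have more than one row left. This assumes the temp table from the question is still in scope and that [ctrc_num] alone defines a group; the remaining_rows alias is made up for illustration.

```sql
-- Assumes both delete statements above have already run against #StackOverFlow.
-- Any row returned here is a [ctrc_num] that still has more than one record.
select [ctrc_num]
      ,count(*) as remaining_rows   -- illustrative alias
from #StackOverFlow
group by [ctrc_num]
having count(*) > 1;
```

With the sample data from the question, this should come back empty, since each of the three [ctrc_num] values ends up with exactly one row.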

edited Dec 15, 2023 at 17:01

Thom A♦

answered Dec 15, 2023 at 16:20

CHill60

kool_kris Over a year ago

This is exactly what I was looking for. I was hoping I would be able to eliminate the duplicate without having to break it into more than one part, but this works.

siggemannen Over a year ago

you can write this as subquery too, no need for CTE.
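
A sketch of what that comment seems to suggest, assuming SQL Server, where a DELETE can target a derived table built on a single base table. The alias d is arbitrary, and the redundant ORDER BY [ctrc_num] from the answer is left out here because the partition already fixes that column.

```sql
-- Same logic as the CTE-based delete in the answer, written with a derived table.
-- Assumes step 1 (removing new = 1 rows where a new = 0 row exists) has already run.
delete d
from (
    select ROW_NUMBER() OVER (PARTITION BY [ctrc_num] ORDER BY [some_date] DESC) as rw
    from #StackOverFlow
) as d
where d.rw <> 1;
```

Whether this reads better than the CTE is a matter of taste; both forms delete through to the same underlying rows.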

CHill60 Over a year ago

@TN - Why? In step 1 I deleted any records where new = 1 if there was a subsequent new = 0. So either there is only a single record per [ctrc_num] and new = 1 OR there is/are 1+ records for a [ctrc_num] where new = 0. Sorting by new only becomes relevant if trying to do both steps at once.

CHill60 Over a year ago

@kool_kris - as you will see from siggemannen's solution, it is possible to do what you want in a single query. But when you are trying to figure out how to do something it is good practice to break it down first. See also "SQL Antipatterns" by Bill Karwin - Chapter 18 "Spaghetti Query" - "Solve a Complex Problem in One Step". You can always merge the "bits" together afterwards - once you have something working. Personally, I'd rather have three simple queries I can follow than one complex one that has me puzzled :-)

CHill60 Over a year ago

@TN - Phew. I did scratch my head for a while though - at least you made me think :-)

T N · Accepted Answer · 2023-12-18 16:39:37Z

It is possible to do what you want in a single query.

    ;with cte as (
        select [ctrc_num], [Ctrc_name], [docu], [adj], [new], [some_date]
              ,ROW_NUMBER() over (
                   partition by [ctrc_num]    -- group by [ctrc_num]
                   order by [new],            -- 0 then 1
                            [some_date] desc  -- newest first
               ) rn
        from #StackOverFlow
    )
    delete cte where rn > 1;

    select * from #StackOverFlow
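
Because the delete is destructive, it can help to preview the ranking first. The following is a minimal sketch, not part of the original answer, that applies the same partition and ordering but only labels each row; the planned_action column name is hypothetical. It is meant to be run before the delete.

```sql
-- Non-destructive preview: same ranking as the delete, but rows are only labelled.
select [ctrc_num], [Ctrc_name], [docu], [adj], [new], [some_date]
      ,case when ROW_NUMBER() over (
                     partition by [ctrc_num]
                     order by [new], [some_date] desc
                 ) = 1
            then 'keep'
            else 'delete'
       end as planned_action   -- hypothetical column name
from #StackOverFlow
order by [ctrc_num], [new], [some_date] desc;
```

If the rows marked 'keep' match the expected result in the question, the delete with the same partition by and order by should remove exactly the rows marked 'delete'.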
