
I have the following query:

Original query:

    SELECT cd1.cust_number_id, cd1.cust_number_id, cd1.First_Name, cd1.Last_Name
    FROM @Customer_Data cd1
    INNER JOIN @Customer_Data cd2
        ON cd1.Cd_Id <> cd2.Cd_Id
       AND cd2.cust_number_id <> cd1.cust_number_id
       AND cd2.First_Name = cd1.First_Name
       AND cd2.Last_Name = cd1.Last_Name
    INNER JOIN @Customer c1 ON c1.Cust_id = cd1.cust_number_id
    INNER JOIN @Customer c2 ON c2.cust_id = cd2.cust_number_id
    WHERE c1.cust_number <> c2.cust_number


I optimized it as follows, but there is an error in my optimization and I can't find it:

Optimized query:

    SELECT cd1.cust_number_id, cd1.cust_number_id, cd1.First_Name, cd1.Last_Name
    FROM (
        SELECT cdResult.cust_number_id, cdResult.First_Name, cdResult.Last_Name,
               COUNT(*) OVER (PARTITION BY cdResult.First_Name, cdResult.Last_Name) AS cnt_name_bday
        FROM @Customer_Data cdResult
        WHERE cdResult.First_Name IS NOT NULL
          AND cdResult.Last_Name IS NOT NULL
    ) AS cd1
    WHERE cd1.cnt_name_bday > 1;


Test data:

    DECLARE @Customer_Data TABLE
    (
        Cd_Id INT,
        cust_number_id INT,
        First_Name NVARCHAR(30),
        Last_Name NVARCHAR(30)
    )

    INSERT @Customer_Data (Cd_Id, cust_number_id, First_Name, Last_Name)
    VALUES
        (1, 22, N'Alex', N'Bor'),
        (2, 22, N'Alex', N'Bor'),
        (3, 23, N'Alex', N'Bor'),
        (4, 24, N'Tom', N'Cruse'),
        (5, 25, N'Tom', N'Cruse')

    DECLARE @Customer TABLE
    (
        Cust_id INT,
        Cust_number INT
    )

    INSERT @Customer (Cust_id, Cust_number)
    VALUES
        (22, 022),
        (23, 023),
        (24, 024),
        (25, 025)

The problem is that the original query returns 6 rows (it duplicates a row), whereas the optimized query returns only the duplicates themselves. How can I make the optimized query duplicate the row as well?

2 Answers


I would suggest just using window functions:

    SELECT CD.cud_customer_id
    FROM (SELECT cd.*,
                 COUNT(*) OVER (PARTITION BY cud_name, cud_birthday) AS cnt_name_bday
          FROM dbo.customer_data cd
         ) cd
    WHERE cnt_name_bday > 1;

Your query is finding duplicates for either name or birthday. You want duplicates with both at the same time.
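
For the updated test data in the question, here is a minimal sketch of how the window-function approach could reproduce the row counts of the self-join. It rests on assumptions not stated in the question: `Cd_Id` is unique, `cust_number_id` is never NULL, and every `cust_number_id` has exactly one row in `@Customer` with a distinct `Cust_number`, so the two joins to `@Customer` filter nothing. The names `cnt_name`, `cnt_name_cust`, `d` and `r` are made up for illustration. The idea is that each row matches every row with the same name minus those that share its own `cust_number_id`, and the `CROSS APPLY` repeats the row that many times.

```sql
-- Test data copied from the question.
DECLARE @Customer_Data TABLE
(
    Cd_Id INT,
    cust_number_id INT,
    First_Name NVARCHAR(30),
    Last_Name NVARCHAR(30)
);

INSERT @Customer_Data (Cd_Id, cust_number_id, First_Name, Last_Name)
VALUES
    (1, 22, N'Alex', N'Bor'),
    (2, 22, N'Alex', N'Bor'),
    (3, 23, N'Alex', N'Bor'),
    (4, 24, N'Tom', N'Cruse'),
    (5, 25, N'Tom', N'Cruse');

SELECT d.cust_number_id, d.First_Name, d.Last_Name
FROM (
    SELECT cd.cust_number_id, cd.First_Name, cd.Last_Name,
           -- rows sharing this first and last name (including the row itself)
           COUNT(*) OVER (PARTITION BY cd.First_Name, cd.Last_Name) AS cnt_name,
           -- rows sharing the name AND the same cust_number_id
           COUNT(*) OVER (PARTITION BY cd.First_Name, cd.Last_Name,
                                       cd.cust_number_id) AS cnt_name_cust
    FROM @Customer_Data AS cd
    WHERE cd.First_Name IS NOT NULL
      AND cd.Last_Name IS NOT NULL
) AS d
-- Repeat each row once per matching row with a different cust_number_id.
-- TOP (0) yields no rows, so rows without such a match drop out.
-- Assumption: the table itself is used only as a source of "enough" rows.
CROSS APPLY (
    SELECT TOP (d.cnt_name - d.cnt_name_cust) 1 AS n
    FROM @Customer_Data
) AS r;
```

On the test data the difference `cnt_name - cnt_name_cust` is 1 for both rows of customer 22, 2 for customer 23, and 1 each for customers 24 and 25, which adds up to the 6 rows of the original query. If only the number of matches is needed, selecting that difference as a column and dropping the `CROSS APPLY` may be simpler.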


3 Comments

Thank you for your help, I have updated the information, could you look it up and give me another tip?
How can I get the duplicated rows? And did I add the NULL check on the records correctly?
@Tibomso . . . I don't understand your comment. This returns duplicated rows.

You can use a single EXISTS:

    SELECT cd.cud_customer_id
    FROM dbo.customer_data AS cd
    WHERE EXISTS (SELECT 1
                  FROM dbo.customer_data AS c
                  WHERE c.cud_name = cd.cud_name
                    AND c.cud_birthday = cd.cud_birthday
                    AND c.cust_id <> cd.cud_customer_id
                 );
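
As a rough sketch, the same idea adapted to the table and column names of the updated question might look like the following. The mapping of `cud_name`/`cud_birthday` to `First_Name`/`Last_Name` and of the id comparison to `cust_number_id` is an assumption based on the test data, not something the question confirms. Note that `EXISTS` only checks whether at least one matching row exists, so each row of `cd` is returned at most once, however many matches it has.

```sql
-- Test data copied from the question.
DECLARE @Customer_Data TABLE
(
    Cd_Id INT,
    cust_number_id INT,
    First_Name NVARCHAR(30),
    Last_Name NVARCHAR(30)
);

INSERT @Customer_Data (Cd_Id, cust_number_id, First_Name, Last_Name)
VALUES
    (1, 22, N'Alex', N'Bor'),
    (2, 22, N'Alex', N'Bor'),
    (3, 23, N'Alex', N'Bor'),
    (4, 24, N'Tom', N'Cruse'),
    (5, 25, N'Tom', N'Cruse');

-- Keep a row if some other row has the same name but a different customer.
SELECT cd.cust_number_id, cd.First_Name, cd.Last_Name
FROM @Customer_Data AS cd
WHERE EXISTS (SELECT 1
              FROM @Customer_Data AS c
              WHERE c.First_Name = cd.First_Name
                AND c.Last_Name = cd.Last_Name
                AND c.cust_number_id <> cd.cust_number_id
             );
```

On the test data this returns five rows, one per row of `@Customer_Data`, rather than the six rows of the self-join, because the row for customer 23 is not repeated for each of its two matches.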

1 Comment

Thank you for your help, I have updated the information, could you look it up and give me another tip?
