Find duplicate records based on two columns

Question

I have a table named employee, there are many records in this table. Here is some sample data :

fullname | address | city ----------------------------- AA address1 City1 AA address3 City1 AA address8 City2 BB address5 City2 BB address2 City1 CC address6 City1 CC address7 City2 DD address4 City1

I want to have a SELECT query in sql server which will show only the duplicate records based on the columns fullname and city. For the given data and considering the condition, only the first two records is duplicate. So my expected output should be like below :

fullname | address | city ----------------------------- AA address1 City1 AA address3 City1

To get this output, I have this query :

select fullname, city from employee group by fullname, city having count(*)>1

As you can see, it selects two columns only and thus it is giving the following output :

fullname | city ------------------ AA City1

If I re-write the query like below :

select fullname, city, address from employee group by fullname, city, address having count(*)>1

Unfortunately it is showing no records! Can anybody please help me to write the correct query to get the expected result?

Martin Smith · Accepted Answer · 2017-10-15 18:48:02Z

Instead of a grouped COUNT you can use it as a windowed aggregate to access the other columns

SELECT fullname, address, city FROM (SELECT *, COUNT(*) OVER (PARTITION BY fullname, city) AS cnt FROM employee) e WHERE cnt > 1

Md. Suman Kabir · Accepted Answer · 2017-10-17 08:05:50Z

Agree with above answer. But If you don't want to use Windows functions which might not work properly on all DBs you can join to itself on city and full name after the group by and having and then get the addresses

 Select employee.* from employee join (select fullname, city from employee group by fullname, city having count(*)>1) q1 on q1.fullname = employee.fullname and q1.city = employee.city

Prathamesh shinde · Accepted Answer · 2017-10-15 19:12:20Z

Try the Following Code:

 create table ##Employee (Fullname varchar(25), Address varchar(25), City varchar(25)) insert into ##Employee values ( 'AA', 'address1', 'City1') ,( 'AA', 'address3', 'City1') ,( 'AA', 'address8', 'City2') ,( 'BB', 'address5', 'City2') ,( 'BB', 'address2', 'City1') ,( 'CC', 'address6', 'City1') ,( 'CC', 'address7', 'City2') select E.* from ##Employee E cross apply( select Fullname,City,count(Fullname) cnt from ##Employee group by Fullname,City having Count(Fullname)>1)x where E.Fullname=x.Fullname and E.City=x.City

Gordon Linoff · Accepted Answer · 2017-10-15 19:12:54Z

If you have a unique id or the address is always different, you can try:

select e.* from employee e where exists (select 1 from employee e2 where e2.fullname = e.fullname and e2.city = e.city and e2.address <> e.address -- or id or some other unique column );

Although I would probably go with the window function approach, you might find that under some circumstances, this is faster (especially if you have an index on employee(fullname, city, address)).

Khorshed Alam · Accepted Answer · 2017-10-16 02:49:39Z

Here you go with the solution:

DECLARE @Employee TABLE ( Fullname VARCHAR(25), [Address] VARCHAR(25), City VARCHAR(25) ) INSERT INTO @Employee VALUES ('AA', 'address1', 'City1') ,('AA', 'address1', 'City1') ,('AA', 'address3', 'City1') ,('AA', 'address8', 'City2') ,('BB', 'address5', 'City2') ,('BB', 'address2', 'City1') ,('CC', 'address6', 'City1') ,('CC', 'address7', 'City2') ;WITH cte AS ( SELECT *, ROW_NUMBER() OVER(PARTITION BY FullName, [Address], [City] ORDER BY Fullname) AS sl, HashBytes('MD5', FullName + [Address] + [City]) AS RecordId FROM @Employee AS e ) SELECT c.FullName, c.[Address], c.City FROM cte AS c INNER JOIN cte AS c1 ON c.RecordId = c1.RecordId WHERE c.sl = 2

Result :

FullName Address City AA address1 City1 AA address1 City1

Md. Suman Kabir · Accepted Answer · 2023-10-31 07:16:52Z

-1

 SELECT Feild1, Feild2, COUNT() FROM table name GROUP BY Feild1, Feild2 HAVING COUNT()>1

This will give you all yours answer

edited Oct 31, 2023 at 7:16

Md. Suman Kabir

5,4515 gold badges29 silver badges45 bronze badges

answered Apr 30, 2021 at 9:35

Arunav dutta gupta

1091 silver badge5 bronze badges

Collectives™ on Stack Overflow

Find duplicate records based on two columns

6 Answers 6

Comments

Comments

Comments

Comments

Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

6 Answers 6

Comments

Comments

Comments

Comments

Comments

Comments

Linked

Related