Get list of duplicate rows in MySQL

Question

I have a table like this:

ID   nachname   vorname
1    john       doe
2    john       doe
3    jim        doe
4    Michael    Knight

I need a query that will return all the fields (select *) from the records that have the same nachname and vorname (in this case, records 1 and 2). Can anyone help me with this? Thanks
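To try the queries from the answers, the sample data can be loaded with a short setup script. This is only a sketch: the table name `people` and the column types are assumptions, since the question shows the data but not the schema.

```sql
-- Hypothetical table name and column types; the question only shows the data.
CREATE TABLE people (
  id       INT PRIMARY KEY,
  nachname VARCHAR(50),
  vorname  VARCHAR(50)
);

-- The four rows from the question; rows 1 and 2 are the duplicates.
INSERT INTO people (id, nachname, vorname) VALUES
  (1, 'john',    'doe'),
  (2, 'john',    'doe'),
  (3, 'jim',     'doe'),
  (4, 'Michael', 'Knight');
```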

wimvds · Accepted Answer · 2010-05-21 12:36:06Z

17

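The following query will give the list of duplicates:

-- `table` stands in for the real table name; TABLE is a reserved word in MySQL, hence the backticks
SELECT n1.* FROM `table` n1
INNER JOIN `table` n2 ON n2.vorname = n1.vorname AND n2.nachname = n1.nachname
WHERE n1.id <> n2.id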

BTW, the data you posted seems to be swapped: "Doe" and "Knight" are last names, not first names :p.

answered May 21, 2010 at 12:36


3 Comments

user347033 Over a year ago

I just needed to add select distinct (the query was returning the same row twice). Thank you for your help

domis86 Over a year ago

Note: if the fields "vorname" and "nachname" are nullable, then the "coalesce" function should be used for the comparison. See: stackoverflow.com/questions/9608639/…

Lucio Mollinedo Over a year ago

@domis86: or you could use the NULL-safe equal to operator (<=>), which compares nullable columns. Reference: dev.mysql.com/doc/refman/8.0/en/…
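The two adjustments from the comments above can be combined in one query. This is a sketch, assuming a hypothetical table named `people` with the columns id, nachname and vorname: DISTINCT collapses the repeated rows that show up once a group has three or more members, and the MySQL-specific <=> operator treats two NULLs as equal.

```sql
-- `people` is a hypothetical table name; substitute your own.
SELECT DISTINCT n1.*
FROM people n1
INNER JOIN people n2
  ON  n2.vorname  <=> n1.vorname    -- NULL-safe: NULL <=> NULL is true
  AND n2.nachname <=> n1.nachname
WHERE n1.id <> n2.id                -- do not match a row against itself
ORDER BY n1.id;
```

If the columns are declared NOT NULL, a plain = behaves the same and is the more portable choice.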

ewernli · Accepted Answer · 2010-05-21 12:14:18Z

13

The general solution to your problem is a query of the form

SELECT col1, col2, count(*)
FROM t1
GROUP BY col1, col2
HAVING count(*) > 1

This will return one row for each set of duplicate rows in the table. The last column in the result is the number of rows that share those particular values.

If you really want the ID, try something like this:

SELECT id FROM 
t1, 
( SELECT col1, col2, count(*)
  FROM t1
  GROUP BY col1, col2
  HAVING count(*) > 1 ) as t2
WHERE t1.col1 = t2.col1 AND t1.col2 = t2.col2

Haven't tested it though
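If the whole rows are wanted rather than only the id, the same derived-table idea can select t1.* and use an explicit join. A sketch, assuming a hypothetical table named `people` with the question's columns (id, nachname, vorname):

```sql
-- `people` and the alias `dup` are hypothetical names.
SELECT t1.*
FROM people t1
INNER JOIN (
  -- one row per (nachname, vorname) pair that occurs more than once
  SELECT nachname, vorname
  FROM people
  GROUP BY nachname, vorname
  HAVING COUNT(*) > 1
) AS dup
  ON t1.nachname = dup.nachname AND t1.vorname = dup.vorname
ORDER BY t1.nachname, t1.vorname, t1.id;
```

Because the derived table holds exactly one row per duplicated pair, each matching row of `people` appears once in the result, so no DISTINCT is needed.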

edited May 21, 2010 at 12:14

answered May 21, 2010 at 11:57


2 Comments

jle Over a year ago

This would not actually return all of the rows, it would just find the duplicate rows.

wimvds Over a year ago

This is way too expensive, you can solve it using a simple join (see my answer :p).

David Gelhar · Accepted Answer · 2010-05-21 12:56:10Z

2

You can do it with a self-join:

select distinct t1.id from t as t1 inner join t as t2 
on t1.col1=t2.col1 and t1.col2=t2.col2 and t1.id<>t2.id

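The t1.id<>t2.id condition is necessary to keep ids from matching against themselves. (If you want to leave one row out of each set of duplicates, you can use t1.id<t2.id, which returns every row except the one with the highest id in its set; for a pair of duplicates that is exactly 1 row.)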
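The difference between the two conditions is easiest to see by running both side by side. A sketch, assuming a hypothetical table named `people` with the question's columns (id, nachname, vorname):

```sql
-- All rows that have at least one duplicate.
SELECT DISTINCT t1.id
FROM people AS t1
INNER JOIN people AS t2
  ON t1.nachname = t2.nachname AND t1.vorname = t2.vorname AND t1.id <> t2.id;

-- Every such row except the one with the highest id in its group.
SELECT DISTINCT t1.id
FROM people AS t1
INNER JOIN people AS t2
  ON t1.nachname = t2.nachname AND t1.vorname = t2.vorname AND t1.id < t2.id;
```

With the sample data from the question, the first query returns ids 1 and 2, and the second returns only id 1.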

edited May 21, 2010 at 12:56

answered May 21, 2010 at 12:27


2 Comments

wimvds Over a year ago

Nope, that one will only return 1 row with the 2 matching records in it, not the 2 rows that it should return...

David Gelhar Over a year ago

@wimvds True, if you want all duplicate rows (instead of 1 row of each duplicate set), you should use <>.

jle · Accepted Answer · 2010-05-22 06:14:44Z

0

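-- `table` stands in for the real table name (TABLE is reserved in MySQL, hence the backticks).
-- The subquery reads from the table itself; the alias t1 is not visible inside it.
-- Note: this returns only the row with the highest id from each duplicate group.
SELECT * FROM `table` AS t1 INNER JOIN
(SELECT MAX(id) AS id, nachname, vorname, COUNT(*)
 FROM `table` GROUP BY nachname, vorname
 HAVING COUNT(*) > 1) AS t2 ON t1.id = t2.id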

This should return ALL of the columns from the table where nachname and vorname are duplicated. I recommend changing * to the exact columns that you need.

Edit: I added a max(id) so that the group by wouldn't be a problem. My query isn't as elegant as I would want though. There's probably an easier way to do it.

edited May 22, 2010 at 6:14

answered May 21, 2010 at 12:02


3 Comments

ewernli Over a year ago

Hmm... I see what you mean now. But I'm pretty sure your query is wrong. You can't return id if you're not using it to group by.

David Gelhar Over a year ago

That join doesn't work - there's no id column in the t2 query.

wimvds Over a year ago

This is just blatantly wrong... The group by will in fact eliminate any duplicates you have if you're using MySQL, since you only group on nachname and vorname, so it will return 1 row, with 1 ID, instead of all distinct rows as you probably expected (just try it, you'll see). Oh, and any other RDBMS would complain about your group by (which is imho the only correct way; I hate MySQL trying to guess what you want and executing these erroneous queries instead of throwing an error).
