Delete duplicate rows from table

Question

My table has unique id keys, but one column contains duplicate values. How do I get rid of the duplicates while keeping only one row for each value, like this:

Duplicate records :

id  | name   | surname |
1   | test   | one     |
2   | test   | two     |
3   | test3  | three   |
4   | test7  | four    |
5   | test   | five    |
6   | test11 | eleven  |

Without duplicates :

id  | name   | surname |
1   | test   | one     |
3   | test3  | three   |
4   | test7  | four    |
6   | test11 | eleven  |

I found this query by searching, but it doesn't work:

DELETE  ct1
FROM    mytable ct1
        , mytable ct2
WHERE   ct1.name = ct2.name 
        AND ct1.id < ct2.id 

ERROR:  syntax error at or near "ct1"
LINE 1: DELETE  ct1
                ^

********** Error **********

I'm using a PostgreSQL database.

After you get that data cleaned up, you probably need to put a UNIQUE constraint on "name".
– Mike Sherrill 'Cat Recall', Commented May 8, 2011 at 3:18
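
A minimal sketch of the constraint suggested in that comment, assuming the table is named mytable as in the question. The constraint name mytable_name_key is only an illustrative choice, and the statement will fail while duplicate names are still present, so it has to run after the cleanup:

```sql
-- Assumption: duplicates in "name" have already been removed.
-- "mytable_name_key" is a hypothetical constraint name.
ALTER TABLE mytable
    ADD CONSTRAINT mytable_name_key UNIQUE (name);
```

With this in place, PostgreSQL rejects any later INSERT or UPDATE that would reintroduce a duplicate name.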

Pablo Santa Cruz · Accepted Answer · 2011-05-07 12:56:32Z

3

You can try running this multiple times:

delete from mytable where id in (
    select max(id)
      from mytable
     group by name
    having count(1) > 1
);

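Here, "multiple times" means the maximum number of repetitions of any value in the name column, minus one: each run removes one row (the one with the highest id) from every group that still has duplicates.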

Otherwise, you can try this more complex query:

delete from mytable where id in (
    select id from mytable
    except 
    (
    select min(id)
      from mytable
     group by name
    having count(1) > 1
    union all
    select min(id)
      from mytable
     group by name
    having count(1) = 1
    )
);

Running this query once should delete everything you need. I haven't tried it, though.
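
Two shorter variants, offered as sketches rather than tested replacements. Both branches of the union in the query above select min(id) per name, so they can probably be collapsed into one subquery. The second form is the PostgreSQL spelling of the self-join from the question, using DELETE ... USING; note the comparison is a.id > b.id so that the lowest id in each group survives, matching the expected output in the question.

```sql
-- Variant 1: keep the lowest id for every name, delete the rest.
DELETE FROM mytable
WHERE id NOT IN (
    SELECT min(id)
      FROM mytable
     GROUP BY name
);

-- Variant 2: PostgreSQL self-join delete.
-- A row is deleted when another row has the same name and a lower id.
DELETE FROM mytable a
USING mytable b
WHERE a.name = b.name
  AND a.id > b.id;
```

Run only one of the two. They differ for rows where name is NULL: the first treats all NULL names as one group and keeps a single row, while the second never matches NULL names and leaves them alone.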

answered May 7, 2011 at 12:56

Pablo Santa Cruz


1 Comment

Pablo Santa Cruz Over a year ago

Glad it helped. For complex grouping like this, I recommend learning window functions such as RANK, which @Dalen suggests in the other answer. They're worth learning.

Dalen · Accepted Answer · 2011-06-27 12:28:10Z

3

Using RANK. I'm not totally sure about the syntax because I'm not that experienced with PostgreSQL, so treat this as a hint (corrections are appreciated):

DELETE FROM mytable
WHERE id NOT IN
(
   SELECT x.id FROM
   (
      SELECT id, RANK() OVER (PARTITION BY name ORDER BY id ASC) AS r
      FROM mytable
   ) x
   WHERE x.r = 1
)
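
If you want to check what would be removed before committing to it, something like the following might help. This is a sketch under a few assumptions: window functions need PostgreSQL 8.4 or later, ROW_NUMBER is swapped in for RANK (with a unique id in the ORDER BY they number rows identically), and the whole thing is wrapped in a transaction so it can be undone.

```sql
BEGIN;

-- Preview: every row that is not the first (lowest id) for its name.
SELECT id, name, surname
FROM (
    SELECT id, name, surname,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
    FROM mytable
) x
WHERE x.rn > 1;

-- Delete exactly the rows listed by the preview.
DELETE FROM mytable
WHERE id IN (
    SELECT x.id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
        FROM mytable
    ) x
    WHERE x.rn > 1
);

-- Inspect the table here; replace ROLLBACK with COMMIT once it looks right.
ROLLBACK;
```

For the sample data in the question, the preview should list ids 2 and 5.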

edited Jun 27, 2011 at 12:28

answered May 7, 2011 at 13:06

Dalen
