2

I have unique keys id keys in my table but I have a column with duplicate values? how do I get rid of those, while preserving only one of them like this :

Duplicate records :

id  | name   | surname |
1   | test   | one     |
2   | test   | two     |
3   | test3  | three   |
4   | test7  | four    |
5   | test   | five    |
6   | test11 | eleven  |

Without duplicates :

id  | name   | surname |
1   | test   | one     |
3   | test3  | three   |
4   | test7  | four    |
6   | test11 | eleven  |

I've googled this but it seems not to be working :

DELETE  ct1
FROM    mytable ct1
        , mytable ct2
WHERE   ct1.name = ct2.name 
        AND ct1.id < ct2.id 

ERROR:  syntax error at or near "ct1"
LINE 1: DELETE  ct1
                ^

********** Error **********

I'm using postgres database.

1
  • After you get that data cleaned up, you probably need to put a UNIQUE constraint on "name". Commented May 8, 2011 at 3:18

2 Answers 2

3

You can try this running multiple times:

delete from mytable where id in (
    select max(id)
      from mytable
     group by name
    having count(1) > 1
);

Where multiple times equals the maximum number of repetitions you have in name column.

Otherwise, you can try this more complex query:

delete from mytable where id in (
    select id from mytable
    except 
    (
    select min(id)
      from mytable
     group by name
    having count(1) > 1
    union all
    select min(id)
      from mytable
     group by name
    having count(1) = 1
    )
);

Running this query one time only should delete all you need. Haven't tried it though...

Sign up to request clarification or add additional context in comments.

1 Comment

Glad it helped. For complex grouping like this, I would recommend you to learn window functions such as Rank @Dalen is suggesting in other answer. They're worth learning.
3

Using Rank, actually I'm not totally sure about the syntax because I'm not that good at PostgreSQL, this is just a hint anyway (anybody's correction will be appreciated):

DELETE FROM mytable
WHERE id NOT IN
(
   SELECT x.id FROM
   (
      SELECT id, RANK() OVER (PARTITION BY name ORDER BY id ASC) AS r
      FROM mytable
   ) x
   WHERE x.r = 1
)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.