I am using postgres. I want to delete Duplicate rows. The condition is that , 1 copy from the set of duplicate rows would not be deleted.
i.e : if there are 5 duplicate records then 4 of them will be deleted.
I am using postgres. I want to delete Duplicate rows. The condition is that , 1 copy from the set of duplicate rows would not be deleted.
i.e : if there are 5 duplicate records then 4 of them will be deleted.
Try the steps described in this article: Removing duplicates from a PostgreSQL database.
It describes a situation when you have to deal with huge amount of data which isn't possible to group by.
A simple solution would be this:
DELETE FROM foo
WHERE id NOT IN (SELECT min(id) --or max(id)
FROM foo
GROUP BY hash)
Where hash is something that gets duplicated.
The fastest is is join to the same table. http://www.postgresql.org/docs/8.1/interactive/sql-delete.html
CREATE TABLE test(id INT,id2 INT);
CREATE TABLE
mapy=# INSERT INTO test VALUES(1,2);
INSERT 0 1
mapy=# INSERT INTO test VALUES(1,3);
INSERT 0 1
mapy=# INSERT INTO test VALUES(1,4);
INSERT 0 1
DELETE FROM test t1 USING test t2 WHERE t1.id=t2.id AND t1.id2<t2.id2;
DELETE 2
mapy=# SELECT * FROM test;
id | id2
----+-----
1 | 4
(1 row)