I have a medium-sized SQLite database with ~10.3 million rows. Some of the rows are duplicated, and I want to remove them.

The column names are:

  1. Keyword
  2. Rank
  3. URL

The duplicates I want to remove are rows where the keyword and rank are both the same, but the URL could be different. So I would only want the first instance of the keyword/rank pair to remain in the database and remove all subsequent matching rows.

What is the most efficient way to go through the entire DB and do this for all the rows?

2 Answers

3

You can try something like this:

sqlite> create table my_example (keyword, rank, url);
sqlite> insert into my_example values ('aaaa', 2, 'wwww...');
sqlite> insert into my_example values ('aaaa', 2, 'wwww2..');
sqlite> insert into my_example values ('aaaa', 3, 'www2..');
sqlite> DELETE FROM my_example
   ...> WHERE rowid not in
   ...> (SELECT MIN(rowid)
   ...> FROM my_example
   ...> GROUP BY keyword, rank);
sqlite> select * from my_example;
keyword     rank        url
----------  ----------  ----------
aaaa        2           wwww...
aaaa        3           www2..
sqlite>
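
For a table with millions of rows, the same DELETE can be driven from a script. The following is a minimal sketch using Python's standard `sqlite3` module, not a tuned solution. The function name `dedupe_keyword_rank` and the index name `idx_my_example_kw_rank` are made up for illustration, and the assumption that an index on (keyword, rank) helps the GROUP BY on a large table should be checked against your own data.

```python
import sqlite3


def dedupe_keyword_rank(conn):
    """Keep the lowest-rowid row per (keyword, rank) pair; return rows deleted."""
    with conn:  # commits on success, rolls back if an exception is raised
        # Hypothetical index name. The index may let SQLite group by
        # (keyword, rank) without sorting the whole table.
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_my_example_kw_rank "
            "ON my_example (keyword, rank)"
        )
        cur = conn.execute(
            "DELETE FROM my_example "
            "WHERE rowid NOT IN ("
            "  SELECT MIN(rowid) FROM my_example GROUP BY keyword, rank"
            ")"
        )
        return cur.rowcount


# Demo on an in-memory database with the same sample rows as above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_example (keyword, rank, url)")
conn.executemany(
    "INSERT INTO my_example VALUES (?, ?, ?)",
    [("aaaa", 2, "wwww..."), ("aaaa", 2, "wwww2.."), ("aaaa", 3, "www2..")],
)
deleted = dedupe_keyword_rank(conn)
rows = conn.execute(
    "SELECT keyword, rank, url FROM my_example ORDER BY rowid"
).fetchall()
print(deleted, rows)
```

On the real database you would pass a connection to the database file instead of `:memory:`, and it is worth working on a copy of the file first.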

1 Comment

+1. Your example looks perfect. I came up with a similar query but wanted to make a couple of points for the OP, so please don't mind another answer.
1

When you say "So I would only want the first instance of the keyword/rank pair to remain in the database and remove all subsequent matching rows.", you can't strictly guarantee that. The reason is that your table doesn't have an explicit unique key (like id or create_date), so there is no guarantee that the row which was entered first will be returned first when you select it again. Setting that aside, you can do something like this, which will keep the first instance most of the time.

delete from tbl
where rowid not in
(
    select min(rowid)
    from tbl
    group by Keyword, Rank
)

See sqlfiddle example here
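
Since which row counts as "first" is the uncertain part, it can help to list the rows the DELETE would remove before actually running it. Below is a rough sketch with Python's standard `sqlite3` module. The sample URLs are invented, and `tbl` is the placeholder table name from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Keyword, Rank, URL)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("aaaa", 2, "first"),   # invented sample data
        ("aaaa", 2, "second"),
        ("bbbb", 1, "only"),
        ("aaaa", 2, "third"),
    ],
)

# Subquery shared by the preview and the delete: one rowid kept per pair.
KEEP = "SELECT MIN(rowid) FROM tbl GROUP BY Keyword, Rank"

# Preview: rows that are NOT the minimum rowid of their (Keyword, Rank) group.
doomed = conn.execute(
    f"SELECT rowid, Keyword, Rank, URL FROM tbl "
    f"WHERE rowid NOT IN ({KEEP}) ORDER BY rowid"
).fetchall()
print("would delete:", doomed)

# Run the delete only after the preview looks right.
with conn:
    conn.execute(f"DELETE FROM tbl WHERE rowid NOT IN ({KEEP})")

survivors = conn.execute(
    "SELECT rowid, Keyword, Rank, URL FROM tbl ORDER BY rowid"
).fetchall()
print("kept:", survivors)
```

If the preview shows the wrong row surviving for some pair, the `MIN(rowid)` rule is not matching your idea of "first", and you would need another column to order by.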

1 Comment

Yep. Fiddle has min in it. Corrected. Thanks for pointing it out.
