0

I have a table, say, Product (Id, Name):

Id Name
1 'one'
2 'two'
3 'three'

I have a new version of data that is needed to update the table, with possible duplicate values.

For example:

Name
'one'
'two'
'two'
'two'
'four'

I need the result table to consist of as many rows of each "Name" as there are in the second table, i.e.

Id Name
1 'one'
2 'two'
3 'three'
4 'two'
5 'two'
6 'four'

So, if the value, like 'one' already exists, we don't need to insert a duplicate, but if there are more rows of the same value in the second table, like 'two' or 'four', we need to insert respectively 2 and 1 more values into the table.

How do I do it using SQL?

I've tried answers that suggest inserting into the table without duplicates, which is not my task.

2
  • delete "one" "two" and "four" in the table, then insert your new data. Commented Jul 3, 2024 at 14:13
  • @topsail that is not the preferable solution for my task, but thanks for your suggestion Commented Jul 3, 2024 at 15:18

1 Answer 1

1

First I set up som tables for the testing:

create temp table Product(id  integer primary key generated always as identity, name text);
insert into Product (name) values ('one'),('two'),('three');
select * from Product;

create  temp table newdata(name text);
insert into newdata values ('one'),('two'),('two'),('two'),('four');
select * from newdata;

Then I use the analytical function "row_number()" which creates row numbers on the result. That is just a way of counting how many instances of each name is in your list

select name,row_number() over (partition by name order by name) rn from newdata;


+------+----+
| name | rn |
+------+----+
| four |  1 |
| one  |  1 |
| two  |  1 |
| two  |  2 |
| two  |  3 |
+------+----+

For product:

select name,row_number() over (partition by name order by name) rn from product


+-------+----+
| name  | rn |
+-------+----+
| one   |  1 |
| three |  1 |
| two   |  1 |
+-------+----+

Finding the differences between the two tells me which names I am missing:

select name,row_number() over (partition by name order by name) rn from newdata
except
select name,row_number() over (partition by name order by name) rn from product;


+------+----+
| name | rn |
+------+----+
| two  |  3 |
| four |  1 |
| two  |  2 |
+------+----+

Then I just need to insert the missing names:

insert into product (name)
select name from (
select name,row_number() over (partition by name order by name) rn from newdata
except
select name,row_number() over (partition by name order by name) rn from product
) a;
select * from product


+----+-------+
| id | name  |
+----+-------+
|  1 | one   |
|  2 | two   |
|  3 | three |
|  4 | two   |
|  5 | four  |
|  6 | two   |
+----+-------+
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.