Avoid duplicates in INSERT INTO SELECT query in SQL Server

Question

I have the following two tables:

Table1
----------
ID   Name
1    A
2    B
3    C

Table2
----------
ID   Name
1    Z

I need to insert data from Table1 to Table2. I can use the following syntax:

INSERT INTO Table2(Id, Name) SELECT Id, Name FROM Table1

However, duplicate IDs might already exist in Table2 (in my case, it's just "1"), and I don't want to copy those rows again because that would throw an error.

I can write something like this:

IF NOT EXISTS(SELECT 1 FROM Table2 WHERE Id=1)
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1 
ELSE
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1 WHERE Table1.Id<>1

Is there a better way to do this without using IF - ELSE? I want to avoid two INSERT INTO-SELECT statements based on some condition.

nloewen · Accepted Answer · 2019-01-29 02:07:44Z

269

Using NOT EXISTS:

INSERT INTO TABLE_2
  (id, name)
SELECT t1.id,
       t1.name
  FROM TABLE_1 t1
 WHERE NOT EXISTS(SELECT id
                    FROM TABLE_2 t2
                   WHERE t2.id = t1.id)

Using NOT IN:

INSERT INTO TABLE_2
  (id, name)
SELECT t1.id,
       t1.name
  FROM TABLE_1 t1
 WHERE t1.id NOT IN (SELECT id
                       FROM TABLE_2)

Using LEFT JOIN/IS NULL:

INSERT INTO TABLE_2
  (id, name)
   SELECT t1.id,
          t1.name
     FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2 ON t2.id = t1.id
    WHERE t2.id IS NULL

Of the three options, LEFT JOIN/IS NULL is the least efficient. See this link for more details.

edited Jan 29, 2019 at 2:07

nloewen


answered Mar 25, 2010 at 5:07

OMG Ponies



9 Comments

IDisposable Over a year ago

Just a clarification on the NOT EXISTS version, you'll need a WITH(HOLDLOCK) hint or no locks will be taken (because there are no rows to lock!) so another thread could insert the row under you.
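A minimal sketch of what that hint might look like on the NOT EXISTS version. It assumes the Table1/Table2 schema from the question; the explicit transaction is an assumption added here so the range lock is held until the insert commits.

```sql
-- Sketch only: Table1/Table2 and their Id/Name columns come from the question.
BEGIN TRANSACTION;

INSERT INTO Table2 (Id, Name)
SELECT t1.Id, t1.Name
FROM Table1 AS t1
WHERE NOT EXISTS (
    -- HOLDLOCK applies serializable semantics to this read, so the key range
    -- stays locked even when no matching row exists yet.
    SELECT 1
    FROM Table2 AS t2 WITH (HOLDLOCK)
    WHERE t2.Id = t1.Id
);

COMMIT TRANSACTION;
```

Range locks like this can reduce concurrency on a busy table, so whether the hint is worth it depends on how likely concurrent inserts of the same key are.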

Duncan Over a year ago

Interesting, because I have always believed joining to be faster than sub-selects. Perhaps that is for straight joins only, and not applicable to left joins.

HLGEM Over a year ago

Duncan, joining is often faster than subselects when they are correlated subqueries. If you have the subquery up in the select list, a join will often be faster.

tomash Over a year ago

NOT EXISTS is especially useful with a composite primary key; NOT IN won't work then.
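As an illustration of the composite-key case, here is a sketch that assumes both tables share a two-column key. The column names KeyA and KeyB are hypothetical and do not appear in the question.

```sql
-- Sketch only: KeyA and KeyB are hypothetical columns forming a composite key.
INSERT INTO Table2 (KeyA, KeyB, Name)
SELECT t1.KeyA, t1.KeyB, t1.Name
FROM Table1 AS t1
WHERE NOT EXISTS (
    -- Every column of the key is compared in the correlated subquery.
    SELECT 1
    FROM Table2 AS t2
    WHERE t2.KeyA = t1.KeyA
      AND t2.KeyB = t1.KeyB
);
```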

Drew Chapin Over a year ago

Any ideas why I would still get "cannot insert duplicate key..." using any of the above methods?


Duncan · Accepted Answer · 2010-03-25 06:54:22Z

51

In MySQL you can do this:

INSERT IGNORE INTO Table2(Id, Name) SELECT Id, Name FROM Table1

Does SQL Server have anything similar?

edited Mar 25, 2010 at 6:54

answered Mar 25, 2010 at 6:24

Duncan


7 Comments

Ashish Gupta Over a year ago

+1 for educating me on this. Very nice syntax. Definitely shorter and better than the one I used. Unfortunately SQL Server does not have this.

IamIC Over a year ago

Not totally true. When you create a unique index, you can set it to "ignore duplicates", in which case SQL Server will ignore any attempts to add a duplicate.

Smack Jack Over a year ago

And SQL Server still can't... pathetic.

Ingus Over a year ago

So SQL Server still can't?

Pavan Kumar Aryasomayajulu Over a year ago

And still can't


timbre timbre · Accepted Answer · 2016-07-15 22:27:07Z

6

I just had a similar problem, the DISTINCT keyword works magic:

INSERT INTO Table2(Id, Name) SELECT DISTINCT Id, Name FROM Table1

edited Jul 15, 2016 at 22:27

timbre timbre


answered Jul 15, 2016 at 22:10

Hunter Bingham


1 Comment

FreeMan Over a year ago

Unless I totally misunderstand you, this will work if you have duplicates in the set you're inserting from. It won't, however, help if the set you're inserting from might be duplicates of data already in the insert into table.
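To cover both situations in one statement, DISTINCT can be combined with a NOT EXISTS filter. This is a sketch against the question's Table1/Table2 schema, not something proposed in the answer above.

```sql
-- Sketch only: DISTINCT collapses identical rows coming from the source,
-- while NOT EXISTS skips Ids that are already present in the target.
INSERT INTO Table2 (Id, Name)
SELECT DISTINCT t1.Id, t1.Name
FROM Table1 AS t1
WHERE NOT EXISTS (
    SELECT 1
    FROM Table2 AS t2
    WHERE t2.Id = t1.Id
);
```

Note that DISTINCT only removes rows that are identical in every selected column. If Table1 holds the same Id with two different Name values, both rows survive and a unique key on Table2.Id would still be violated.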

Community · Accepted Answer · 2017-05-23 10:31:26Z

5

Setting the unique index to ignore duplicates, as suggested by IanC here, was my solution for a similar issue: create the index with the option WITH IGNORE_DUP_KEY.

In backward-compatible syntax, WITH IGNORE_DUP_KEY is equivalent to WITH IGNORE_DUP_KEY = ON.

Ref.: index_option
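A sketch of how this option might be applied to the question's tables. The index name UX_Table2_Id is hypothetical, and the example assumes Table2.Id is not already covered by a primary key or another unique constraint without this option.

```sql
-- Sketch only: UX_Table2_Id is a hypothetical index name.
CREATE UNIQUE INDEX UX_Table2_Id
    ON Table2 (Id)
    WITH (IGNORE_DUP_KEY = ON);

-- With the option ON, rows whose Id already exists in the index are discarded
-- with a warning instead of failing the whole statement with an error.
INSERT INTO Table2 (Id, Name)
SELECT Id, Name
FROM Table1;
```

The option affects every insert into the table, not just this one, so it is a design decision for the table rather than a per-query switch.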

edited May 23, 2017 at 10:31

CommunityBot


answered Jan 14, 2015 at 16:41

Tazz602


Comments

zx485 · Accepted Answer · 2018-10-20 22:32:27Z

5

I was facing the same problem recently.
Here's what worked for me in MS SQL Server 2017.
The primary key should be set on ID in Table_2.
The columns and column properties should of course be the same in both tables. The script below works the first time you run it: the duplicate ID from Table_1 is not inserted.

If you run it the second time, you will get a

Violation of PRIMARY KEY constraint error

This is the code:

Insert into Table_2
Select distinct *
from Table_1
where table_1.ID >1

edited Oct 20, 2018 at 22:32

zx485


answered Oct 20, 2018 at 22:14

Zoloholic


Comments

M. Salah · Accepted Answer · 2016-07-31 08:09:14Z

4

In SQL Server you can set a unique key index on the table for the columns that need to be unique.

answered Jul 31, 2016 at 8:09

M. Salah


1 Comment

Cheung Over a year ago

This doesn't answer the question of an alternative to INSERT IGNORE INTO.

FullStackFool · Accepted Answer · 2018-01-30 15:43:17Z

2

A little off topic, but if you want to migrate the data to a new table, and the possible duplicates are in the original table, and the column possibly duplicated is not an id, a GROUP BY will do:

INSERT INTO TABLE_2
(name)
  SELECT t1.name
  FROM TABLE_1 t1
  GROUP BY t1.name

answered Jan 30, 2018 at 15:43

FullStackFool


Comments

user7334973 · Accepted Answer · 2020-10-14 19:59:21Z

In my case, I had duplicate IDs in the source table, so none of the proposals worked. I don't care about performance, it's just done once. To solve this I took the records one by one with a cursor to ignore the duplicates.

So here's the code example:

DECLARE @c1 AS VARCHAR(12);
DECLARE @c2 AS VARCHAR(250);
DECLARE @c3 AS VARCHAR(250);


DECLARE MY_cursor CURSOR STATIC FOR
Select
c1,
c2,
c3
from T2
where ....;

OPEN MY_cursor
FETCH NEXT FROM MY_cursor INTO @c1, @c2, @c3

WHILE @@FETCH_STATUS = 0
BEGIN
    if (select count(1) 
        from T1
        where a1 = @c1
        and a2 = @c2
        ) = 0 
            INSERT INTO T1
            values (@c1, @c2, @c3)

    FETCH NEXT FROM MY_cursor INTO @c1, @c2, @c3
END
CLOSE MY_cursor
DEALLOCATE MY_cursor
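For comparison, a set-based sketch of the same idea using ROW_NUMBER instead of a cursor. This is a different technique from the one above. It assumes T1 has columns a1, a2 and a hypothetical third column a3, treats (c1, c2) as the duplicate key, and leaves out the elided WHERE filter from the cursor query.

```sql
-- Sketch only: a3 is a hypothetical column name; the cursor version does not
-- name the target columns. The source filter elided above is omitted here.
WITH ranked AS (
    SELECT c1, c2, c3,
           -- Number the rows within each (c1, c2) group; rn = 1 picks one row.
           ROW_NUMBER() OVER (PARTITION BY c1, c2 ORDER BY c3) AS rn
    FROM T2
)
INSERT INTO T1 (a1, a2, a3)
SELECT r.c1, r.c2, r.c3
FROM ranked AS r
WHERE r.rn = 1
  AND NOT EXISTS (
      SELECT 1
      FROM T1 AS t
      WHERE t.a1 = r.c1
        AND t.a2 = r.c2
  );
```

Unlike the cursor, which keeps whichever duplicate it fetches first, this keeps the row with the lowest c3 in each group. Change the ORDER BY if a different row should win.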

Shoham · Accepted Answer · 2021-04-12 13:25:20Z

0

I used a MERGE query to fill a table without duplicates. The problem I had was a two-column key in the tables (Code, Value), and the EXISTS query was very slow. The MERGE executed much faster (more than 100x).

examples for MERGE query
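A sketch of what such a MERGE might look like for a (Code, Value) key. The table names SourceTable and TargetTable are hypothetical, and the DISTINCT and HOLDLOCK parts are assumptions added here to guard against source duplicates and concurrent inserts.

```sql
-- Sketch only: SourceTable and TargetTable are hypothetical names.
MERGE INTO TargetTable WITH (HOLDLOCK) AS tgt
USING (
    -- DISTINCT keeps a key pair from appearing twice in the source.
    SELECT DISTINCT Code, Value
    FROM SourceTable
) AS src
    ON tgt.Code = src.Code
   AND tgt.Value = src.Value
WHEN NOT MATCHED BY TARGET THEN
    -- Only key pairs missing from the target are inserted.
    INSERT (Code, Value)
    VALUES (src.Code, src.Value);
```

A MERGE statement must end with a semicolon in SQL Server.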

answered Apr 12, 2021 at 13:25

Shoham


1 Comment

ruffin May 30 at 19:57

Remember that link answers aren't great answers -- can you replace that link with the pertinent information/an example MERGE statement that demonstrates your suggestion?

Gediminas Šukys · Accepted Answer · 2022-06-15 07:37:07Z

0

For one table it works perfectly when you create one unique index from multiple fields. A simple "INSERT IGNORE" will then ignore duplicates if ALL of the 7 fields (in this case) have the SAME values.

Select the fields in the PMA Structure view and click Unique; a new combined index will be created.

answered Jun 15, 2022 at 7:37

Gediminas Šukys


Comments

Sacro · Accepted Answer · 2018-11-14 02:03:16Z

-5

A simple DELETE before the INSERT would suffice:

DELETE FROM Table2 WHERE Id IN (SELECT Id FROM Table1)
INSERT INTO Table2 (Id, name) SELECT Id, name FROM Table1

Switching Table1 for Table2 depending on which table's Id and name pairing you want to preserve.

edited Nov 14, 2018 at 2:03

answered Nov 14, 2018 at 1:43

Sacro


4 Comments

Andir Over a year ago

Please don't do this. You're basically saying "whatever data I had is worthless, let's just insert this new data!"

Sacro Over a year ago

@Andir If for some reason "Table2" shouldn't be dropped after the "INSERT", then use the other methods, but this is a perfectly valid way to achieve what the OP asked.

MC9000 Over a year ago

Valid, but certainly slower and potentially corrupting without a transaction. If you go this route, wrap in a TRANSaction.
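A sketch of the transaction-wrapped variant this comment describes, using the question's Table1/Table2 schema. SET XACT_ABORT ON is an assumption added here so that a run-time error rolls back the whole transaction.

```sql
-- Sketch only: overwrites rows in Table2 whose Id also appears in Table1.
SET XACT_ABORT ON;  -- any run-time error rolls back the entire transaction

BEGIN TRANSACTION;

-- IN is used because the subquery can return more than one Id.
DELETE FROM Table2
WHERE Id IN (SELECT Id FROM Table1);

INSERT INTO Table2 (Id, Name)
SELECT Id, Name
FROM Table1;

COMMIT TRANSACTION;
```

This still replaces the existing Table2 rows with the Table1 versions, so it fits only when the source data should win.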

Tom Wilson Over a year ago

Despite all the whining about this answer, it may be the best solution depending on the situation.
