203

I wanted to run a simple query to throw up all the rows of Table1 where a principal column value is not present in a column in another table (Table2).

I tried using:

SELECT * FROM Table1 WHERE Table1.principal NOT IN Table2.principal

This is instead throwing a syntax error. Google search led me to forums where people were saying that MySQL does not support NOT IN and something extremely complex needs to be used. Is this true? Or am I making a horrendous mistake?

1
  • 1
    And what if I want similar data from three tables. I mean one table1 has 2000 entries, the other two tables 2&3 each have say 500 entries, all of them have common field 'name'. How can we get all the details from table 1 which are not present in table2&3 based on 'name'. Can we use NOT IN twice, if so how..? Commented Aug 3, 2011 at 5:16

6 Answers 6

340

To use IN, you must have a set, use this syntax instead:

SELECT * FROM Table1 WHERE Table1.principal NOT IN (SELECT principal FROM table2)
Sign up to request clarification or add additional context in comments.

4 Comments

Careful when table2.principal can be NULL. In that case NOT IN will always return FALSE because NOT IN is treated as <> ALL, which compares all rows from the subquery like Table1.principal <> table2.principal, which fails when comparing with NULL: Table1.principal <> NULL will not result in TRUE. To fix: NOT IN (SELECT principal FROM table2 WHERE principal IS NOT NULL).
Thanks for the comment @Basti! Spent a lot of time trying to understand why the query wasn't working as expected.
Don't forget to avoid using 'SELECT *' inside the 'NOT IN' list. You must choose a particular column. Otherwise you'll get this error: stackoverflow.com/questions/14046838/…
@Basti, Thank you, my SQL query worked with your suggestion.
175

The subquery option has already been answered, but note that in many cases a LEFT JOIN can be a faster way to do this:

SELECT table1.*
FROM table1 LEFT JOIN table2 ON table2.principal=table1.principal
WHERE table2.principal IS NULL

If you want to check multiple tables to make sure it's not present in any of the tables (like in SRKR's comment), you can use this:

SELECT table1.*
FROM table1
LEFT JOIN table2 ON table2.name=table1.name
LEFT JOIN table3 ON table3.name=table1.name
WHERE table2.name IS NULL AND table3.name IS NULL

3 Comments

In my own tests, had same performance for both NOT IN & LEFT JOIN. +1 both
once the query ran once you should get the same results no matter what due to internal DB caching
For me the performance was way better. I ran through different tables, while they had foreign keys set.
37

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL in MySQL

MySQL, as well as all other systems except SQL Server, is able to optimize LEFT JOIN / IS NULL to return FALSE as soon the matching value is found, and it is the only system that cared to document this behavior. […] Since MySQL is not capable of using HASH and MERGE join algorithms, the only ANTI JOIN it is capable of is the NESTED LOOPS ANTI JOIN

[…]

Essentially, [NOT IN] is exactly the same plan that LEFT JOIN / IS NULL uses, despite the fact these plans are executed by the different branches of code and they look different in the results of EXPLAIN. The algorithms are in fact the same in fact and the queries complete in same time.

[…]

It’s hard to tell exact reason for [performance drop when using NOT EXISTS], since this drop is linear and does not seem to depend on data distribution, number of values in both tables etc., as long as both fields are indexed. Since there are three pieces of code in MySQL that essentialy do one job, it is possible that the code responsible for EXISTS makes some kind of an extra check which takes extra time.

[…]

MySQL can optimize all three methods to do a sort of NESTED LOOPS ANTI JOIN. […] However, these three methods generate three different plans which are executed by three different pieces of code. The code that executes EXISTS predicate is about 30% less efficient […]

That’s why the best way to search for missing values in MySQL is using a LEFT JOIN / IS NULL or NOT IN rather than NOT EXISTS.

(emphases added)

Comments

11

Be carefull NOT IN is not an alias for <> ANY, but for <> ALL!

http://dev.mysql.com/doc/refman/5.0/en/any-in-some-subqueries.html

SELECT c FROM t1 LEFT JOIN t2 USING (c) WHERE t2.c IS NULL

cant' be replaced by

SELECT c FROM t1 WHERE c NOT IN (SELECT c FROM t2)

You must use

SELECT c FROM t1 WHERE c <> ANY (SELECT c FROM t2)

Comments

7

Unfortunately it seems to be a issue with MySql usage of "NOT IN" clause, the screen-shoot below shows the sub-query option returning wrong results:

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 1.1.8                        |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.5.21                       |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
+-------------------------+------------------------------+
7 rows in set (0.07 sec)

mysql> select count(*) from TABLE_A where TABLE_A.Pkey not in (select distinct TABLE_B.Fkey from TABLE_B );
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.07 sec)

mysql> select count(*) from TABLE_A left join TABLE_B on TABLE_A.Pkey = TABLE_B.Fkey where TABLE_B.Pkey is null;
+----------+
| count(*) |
+----------+
|      139 |
+----------+
1 row in set (0.06 sec)

mysql> select count(*) from TABLE_A where NOT EXISTS (select * FROM TABLE_B WHERE TABLE_B.Fkey = TABLE_A.Pkey );
+----------+
| count(*) |
+----------+
|      139 |
+----------+
1 row in set (0.06 sec)

mysql> 

Comments

1

The answers above are correct but difficult to understand. In order to make it clearer, I will give an answer based on an example. Let's say you want to select all the people who are not in the outstanding table:

option 1:

 select *
 from persons p
 where not exists(select * 
                 from outstandings o
                 where p.id = o.id)

option 2:

    select *
    from persons p
    where p.id not in( select m.id
                      from outstandings o
                      where m.id = o.id)

option 3:

    select *
    from persons p
    left join utstandings o on o.id=p.id
    where o.id in null

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.