SELECT ... WHERE IN vs JOIN

Question

Assume we have 4 tables A, B, C, D in some unspecified relational SQL database. A references B, also C and D as well. References means that A has column A.refX_id = X.id, X is A, B and C (common foreign key 1:N).

What I want, is to query table A with condition based on columns from all child tables B, C and D. My question is: What of the following variants is generally better? (In terms of usability, efficiency, speed.)

Variant 1:

SELECT DISTINCT A.* FROM A
   JOIN B ON A.refB_id = B.id
   JOIN C ON A.refC_id = C.id
   JOIN D ON A.refD_id = D.id
   WHERE <condition on B> AND <condition on C> AND <condition on D>;

Which I like more from perspective of a database, but looks a little bit harder to program.

Variant 2:

SELECT id FROM B WHERE <condition on B>; # result store to array "BIds" on program side
SELECT id FROM C WHERE <condition on C>; # result store to array "CIds" on program side
SELECT id FROM D WHERE <condition on D>; # result store to array "DIds" on program side

SELECT A.* FROM A
   WHERE refB_id IN (<B_ids>) AND refC_id IN (<C_ids>) AND refD_id IN (<D_ids>);

# <B_ids> menas expand whole array of ids, which can result in a very long query string

I thought that Variant 2 is completelly smut and unusable with potentially large data. But I've heard, that a lot of frameworks normally use it, because it is relativelly straightforward. Is it sort of legal way to query data like this in general case, if I know that the content of the "IN" clause is taken from a result of another query(ies)?

Was just looking up this very question and found this response: stackoverflow.com/questions/121631/inner-join-vs-where?rq=1 — CodeJockey
– CodeJockey, Commented May 10, 2016 at 19:54

Gordon Linoff · Accepted Answer · 2016-04-12 02:00:51Z

4

I would encourage you to use IN or EXISTS:

SELECT A.*
FROM A
WHERE EXISTS (SELECT 1 FROM B WHERE A.refB_id = B.id AND <condition on B>) AND
      EXISTS (SELECT 1 FROM C WHERE A.refC_id = C.id AND <condition on C>) AND
      EXISTS (SELECT 1 FROM D WHERE A.refD_id = D.id AND <condition on D>);

Advantages of this approach:

No danger of getting large intermediate Cartesian products.
No expense of eliminating duplicates for SELECT DISTINCT.
Most databases handle EXISTS pretty well.
You can optimize each subquery using indexes.

EDIT:

You can write this with IN and subqueries:

SELECT A.*
FROM A
WHERE A.refB_id IN (SELECT B.id FROM B WHERE <condition on B>) AND
      A.refC_id IN (SELECT C.id FROM C WHERE <condition on C>) AND
      A.refD_id IN (SELECT D.id FROM D WHERE <condition on D>);

edited Apr 12, 2016 at 2:00

answered Apr 11, 2016 at 22:17

Gordon Linoff

1.3m62 gold badges705 silver badges857 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

David Over a year ago

Thanks a lot! Have never seen SELECT 1... How would it looks like with IN? Did you mean that Variant1??

David Over a year ago

Yes, this is more readable. Thank you, that is exactly what I wanted.

CodeJockey Over a year ago

@David I disagree about readability but it depends on what you are used to seeing.

Rahul · Accepted Answer · 2016-04-11 22:13:29Z

3

Not sure which framework uses the 2nd approach but first approach is what I will go for and moreover everyone else will. If you have proper index created on the join column for all tables then 1st approach will produce a much better plan than the 2nd one cause you have multiple IN clause and what if every IN has to work on millions of elements???.

Also, I would change the INNER JOIN to LEFT JOIN assuming that not all ID's may match and move the WHERE conditions to JOIN ON conditions like

SELECT DISTINCT A.* FROM A
   LEFT JOIN B ON A.refB_id = B.id AND <condition on B>
   LEFT JOIN C ON A.refC_id = C.id AND <condition on C>
   LEFT JOIN D ON A.refD_id = D.id AND <condition on D>;

answered Apr 11, 2016 at 22:13

Rahul

78.1k14 gold badges80 silver badges132 bronze badges

Collectives™ on Stack Overflow

SELECT ... WHERE IN vs JOIN

2 Answers 2

3 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related