SQL Query: boolean processing

Question

I have no idea if this is the right forum or not. Let's say I have the following:

SELECT *
FROM MyTable m
WHERE ((A OR B) AND (C OR D))

Assume that A, B, C, and D are proper boolean clauses that each need to be evaluated on a row-level basis. Let's also assume there are no indexes.

This is logically equivalent to:

SELECT *
FROM MyTable m
WHERE (A AND C)
   OR (A AND D)
   OR (B AND C)
   OR (B AND D)
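
One way to sanity-check the equivalence is to enumerate every combination of truth values and look for a row where the two forms disagree. The sketch below is an illustration only: the `vals` CTE and the 0/1/NULL encoding are assumptions made for the example, with NULL standing in for SQL's UNKNOWN. CASE is used to collapse UNKNOWN to 0, the same way a WHERE clause discards such rows.

```sql
-- Hypothetical helper: every value a predicate can take (false, true, unknown).
WITH vals(v) AS (
    SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT NULL
)
SELECT a.v AS A, b.v AS B, c.v AS C, d.v AS D
FROM vals a
CROSS JOIN vals b
CROSS JOIN vals c
CROSS JOIN vals d
-- Keep only the combinations where the two forms give different results.
WHERE CASE WHEN (a.v = 1 OR b.v = 1) AND (c.v = 1 OR d.v = 1)
           THEN 1 ELSE 0 END
   <> CASE WHEN (a.v = 1 AND c.v = 1)
             OR (a.v = 1 AND d.v = 1)
             OR (b.v = 1 AND c.v = 1)
             OR (b.v = 1 AND d.v = 1)
           THEN 1 ELSE 0 END;
```

If the two forms are equivalent, as the distributive law says they should be, the query returns no rows.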

Is there a performance advantage to either one? We're on MSSql-2008.

The query optimizer reserves all rights to shuffle things around. Based on statistics and indexes it might do something quite unexpected. That said, I would go with your first example as it seems clearer to the casual reader.
– HABO, Commented Jan 13, 2012 at 14:20
The use case is actually for generated SQL, so a user won't see it except for relatively rare debugging.
– Shlomo, Commented Jan 13, 2012 at 15:50

XIVSolutions · Accepted Answer · 2012-01-13 02:08:39Z

My understanding is that your first case is more efficient, because:

in this clause: WHERE ((A OR B) AND (C OR D))

the entire statement fails if neither A nor B is true, and the second part of the statement, (C OR D), is not evaluated. If A OR B is true, there is only one more pair to check: C OR D. The worst case is that all four criteria are checked before the statement as a whole can be evaluated (for example, A = False, B = True, C = False, so D decides the result). The best case is that the statement becomes false after checking only A and B.

In your second case, up to four AND pairs have to be checked, and A and B can each be tested more than once, before the statement as a whole can be evaluated.
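
Rather than reasoning about it, the two forms can be timed side by side. A minimal sketch, assuming hypothetical BIT columns `ColA` through `ColD` that stand in for the predicates A through D (they are not from the question):

```sql
-- Report CPU/elapsed time and page reads for each statement that follows.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Form 1: AND of ORs (ColA..ColD are hypothetical stand-ins for A..D).
SELECT *
FROM MyTable m
WHERE (m.ColA = 1 OR m.ColB = 1)
  AND (m.ColC = 1 OR m.ColD = 1);

-- Form 2: OR of ANDs, the expanded version of the same condition.
SELECT *
FROM MyTable m
WHERE (m.ColA = 1 AND m.ColC = 1)
   OR (m.ColA = 1 AND m.ColD = 1)
   OR (m.ColB = 1 AND m.ColC = 1)
   OR (m.ColB = 1 AND m.ColD = 1);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

If the optimizer treats both predicates the same way, the reported CPU time and logical reads should come out about the same. Comparing the actual execution plans of the two statements is another way to check.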

Nesting the OR conditionals inside the AND means that if the first pair fails, it's a case of move along, nothing more of interest here. You improve things even more if you place the pair most likely to be false first.
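
As the comment on the question notes, the optimizer is free to shuffle predicates around, so the written order in a WHERE clause is not guaranteed to be the evaluation order. If the order really matters for generated SQL, one commonly used workaround is a CASE expression, whose WHEN branches are generally evaluated in sequence for simple scalar conditions like these. This is only a sketch: `ColA` through `ColD` are hypothetical columns standing in for A through D, and wrapping predicates in CASE can prevent index use (less of a concern here, since the question assumes no indexes).

```sql
SELECT *
FROM MyTable m
WHERE CASE
          -- First pair: if neither A nor B holds, reject the row right away.
          WHEN CASE WHEN m.ColA = 1 THEN 1
                    WHEN m.ColB = 1 THEN 1
                    ELSE 0 END = 0 THEN 0
          -- Second pair: only reached when A OR B was true.
          WHEN m.ColC = 1 THEN 1
          WHEN m.ColD = 1 THEN 1
          ELSE 0
      END = 1;
```

Whether this is actually faster than the plain `(A OR B) AND (C OR D)` form depends on the data and the plan, so it is worth measuring before relying on it.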

I will be interested to hear from others on this . . .

answered Jan 13, 2012 at 2:08

2 Comments

Shlomo Over a year ago

So the best case for the first statement is A: False + B: False, only two evaluations. In that case, the second statement also only evaluates A + B. They're logically equivalent, so if SQL Server is smart enough to figure it out, there shouldn't be a difference. Which is what I'm assuming the correct answer will be.

XIVSolutions Over a year ago

I agree that they are logically equivalent, but I believe the order of operations (as dictated by the parentheses) matters to some compilers and not others. I agree - I too am interested to know whether A is evaluated each time in case 2.
