0

I'm working with left joins that are creating NULL values. Let's say I'm interested in how many users have a basic plan or no plan at all (indicated by a NULL record in the left join)

SELECT u.id, s.plan_type 
FROM users as u
LEFT JOIN subscriptions as s on s.user_id = u.id
WHERE s.plan_type IN (NULL, 'basic')

The above does not work as intended, likely because it doesn't recognize it as NULL. I'd have to write something like WHERE s.plan_type IS NULL OR s.plan_type = 'basic'

I was wondering if there was a short hand for that, or if you always have to use IS NULL to identify NULL values

4
  • 2
    In a few words, no, you have to do the OR what you said. Why is it that bad? Maybe the explanation is that NULL value cannot be compared with equal operator (e.g. NULL = NULL return false), and IN operator likely performs an equal comparison for each element Commented Aug 31, 2021 at 23:04
  • Not sure how it works in POSTGRES but one option would be to use a coalesce... WHERE coalesce(s.plan_type, '-1') IN ('-1', 'basic'). Essentially, replace NULL with -1, then you can check for -1. Commented Aug 31, 2021 at 23:15
  • @JaimeDrq One reason it is bad is that an IN list can make more efficient use of indexes than an OR currently can. (Maybe we will fix that some day) Commented Sep 1, 2021 at 13:53
  • I trust you @jjanes, but being honest I am not fully confident about that Commented Sep 1, 2021 at 15:38

2 Answers 2

3

s.plan_type IN (NULL, 'basic') is just shorthand for (s.plan_type = NULL OR s.plan_type = 'basic').

Anything = NULL always returns NULL, not true. And only true qualifies in a WHERE clause. So rephrase:

SELECT u.id, s.plan_type 
FROM   users u
LEFT   JOIN subscriptions s ON s.user_id = u.id
WHERE (s.plan_type = 'basic' OR
       s.plan_type IS NULL);

To be precise, that's only correct if subscriptions.plan_type is defined NOT NULL. Failing that, the resulting NULL value could be just that, not a missing plan.

This alternative formulation does not share the same ambiguity for a resulting NULL:

SELECT u.id, s.plan_type
FROM   users u
JOIN   subscriptions s ON s.user_id = u.id
WHERE  s.plan_type = 'basic'
UNION ALL
SELECT u.id, NULL  -- always means missing
FROM   users u
WHERE  NOT EXISTS (SELECT FROM subscriptions WHERE user_id = u.id);

And it may be faster, because Postgres (like most RDBMS) often doesn't perform well with OR'ed predicates. See about "ugly OR":

If referential integrity is guaranteed with a FK constraint from subscriptions.user_id to users.id, and s.user_id is defined NOT NULL you can omit the users table from the first SELECT:

SELECT user_id AS id, plan_type
FROM   subscriptions
WHERE  plan_type = 'basic'
UNION ALL
SELECT id, NULL
FROM   users u
WHERE  NOT EXISTS (SELECT FROM subscriptions WHERE user_id = u.id);

Might be fastest.

Aside: No array is involved here. You show the form of IN using a list. See:

Sign up to request clarification or add additional context in comments.

Comments

3

If your table isn't so huge, you could use coalesce to handle null before matching

select u.id, s.plan_type 
from users as u
left join subscriptions as s on s.user_id = u.id
where coalesce(s.plan_type,'basic') = 'basic';

Note that the use of function on the left side of the where clause will make the query unsargable.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.