Postgres: If we select a computed column multiple times, will Postgres compute it again and again?

Question

Here is the query that I am trying,

SELECT s.id, s.name AS name,
    CASE WHEN (ARRAY_AGG(tgs.status) @> '{Hard}') THEN 'Hard'
         WHEN (ARRAY_AGG(tgs.status) @> '{Soft}') THEN 'Soft'
         WHEN (ARRAY_AGG(tgs.status) @> '{Fluid}') THEN 'Fluid'
         WHEN (ARRAY_AGG(tgs.status) @> '{Gummy}') THEN 'Gummy'
         WHEN (ARRAY_AGG(tgs.status) @> '{Expired}') THEN 'Expired'
    END AS status,
    COUNT(*) OVER()
FROM sweets AS s
INNER JOIN tasty_goofy_sweets AS tgs ON tgs.sweet_id = s.id
GROUP BY s.id;

While I was implementing this, my friend suggested that, instead of computing array_agg in every branch of the CASE expression, we could use LEFT JOIN LATERAL and compute it just once, i.e. implement it like this:

SELECT s.id, s.name AS name,
    CASE WHEN (tgs.arr_status @> '{Hard}') THEN 'Hard'
         WHEN (tgs.arr_status @> '{Soft}') THEN 'Soft'
         WHEN (tgs.arr_status @> '{Fluid}') THEN 'Fluid'
         WHEN (tgs.arr_status @> '{Gummy}') THEN 'Gummy'
         WHEN (tgs.arr_status @> '{Expired}') THEN 'Expired'
    END AS status,
    COUNT(*) OVER()
FROM sweets AS s
LEFT JOIN LATERAL (
    SELECT ARRAY_AGG(tgs.status) AS arr_status
    FROM tasty_goofy_sweets tgs
    WHERE tgs.sweet_id = s.id
) AS tgs ON TRUE
GROUP BY s.id;

But I am not sure whether Postgres computes the ARRAY_AGG value every time. How can we decide which approach is better? I looked at EXPLAIN ANALYZE for both queries: the number of rows involved in the latter query is higher than in the former, but I don't understand why.

Intuitively I feel the former approach is better, but can someone please explain which one is better and why?

GMB · Accepted Answer · 2020-06-14 17:36:38Z

2

Most likely, Postgres will optimize away the multiple array_agg() calls, compute the aggregate only once and reuse the result in each comparison. That's a fairly simple query optimization that the database should easily spot.

Let me suggest, however, simplifying the query by using conditional aggregation. You don't need to aggregate into an array just to check whether a given value is there:

select
    s.id,
    s.name,
    case 
        when count(*) filter(where tgs.status = 'Hard')    > 0 then 'Hard'
        when count(*) filter(where tgs.status = 'Soft')    > 0 then 'Soft'
        when count(*) filter(where tgs.status = 'Fluid')   > 0 then 'Fluid'
        when count(*) filter(where tgs.status = 'Gummy')   > 0 then 'Gummy'
        when count(*) filter(where tgs.status = 'Expired') > 0 then 'Expired'
    end status,
    count(*) over() cnt
from sweets s
inner join tasty_goofy_sweets as tgs on tgs.sweet_id = s.id
group by s.id;
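If the goal is to have only one aggregate in the select list, the same priority logic could be sketched with a single min() over a rank, then mapped back to the label. This is only an illustration, not something from the original query: it assumes Postgres 9.5 or later (for array_position), that s.id is the primary key of sweets (as the queries above already require for selecting s.name), and that status can be cast to text.

```sql
select
    s.id,
    s.name,
    -- The array lists the statuses in priority order: position 1 wins over 2, etc.
    -- min() keeps the best (lowest) position found for each sweet; statuses that
    -- are not in the list give NULL positions, which min() ignores.
    (array['Hard', 'Soft', 'Fluid', 'Gummy', 'Expired'])[
        min(array_position(
            array['Hard', 'Soft', 'Fluid', 'Gummy', 'Expired'],
            tgs.status::text  -- cast is an assumption, in case status is varchar or an enum
        ))
    ] as status,
    count(*) over() cnt
from sweets s
inner join tasty_goofy_sweets as tgs on tgs.sweet_id = s.id
group by s.id;
```

As with the case expression, a sweet whose rows carry none of the five statuses gets a NULL status here.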

You could also express this without aggregation, using a lateral join and a conditional sort:

select
    s.id,
    s.name,
    tgs.status,
    count(*) over() cnt
from sweets s
cross join lateral (
    select status
    from tasty_goofy_sweets as tgs 
    where tgs.sweet_id = s.id
    order by case status 
        when 'Hard'    then 1
        when 'Soft'    then 2
        when 'Fluid'   then 3
        when 'Gummy'   then 4
        when 'Expired' then 5
    end
    limit 1
) tgs;

edited Jun 14, 2020 at 17:36

answered Jun 14, 2020 at 17:14

GMB


5 Comments

Surya Over a year ago

Oh! Does the array aggregation slow down the query?

GMB Over a year ago

@LunaLovegood: yes the array aggregation does not seem optimal. I suspect that the fastest method is the lateral join - but you would need to assess that against your real dataset.

Gordon Linoff Over a year ago

@GMB . . . Because of the sequential ordering of case expressions, I would be mildly surprised if the expression is evaluated only once.

Surya Over a year ago

@GMB in the second approach that you suggested, why did you go with CROSS JOIN? Won't LEFT JOIN be more appropriate, given that we have the foreign key relation tgs.sweet_id = s.id?

Surya Over a year ago

@GMB also in the first approach won't the count(*) get calculated for every case statement?

Gordon Linoff · Accepted Answer · 2020-06-15 13:21:14Z

1

I am fairly certain that in a case expression, the when clause is going to be evaluated separately for each condition. That means that your colleague is correct . . . probably.

The operative part of the documentation is:

Each condition is an expression that returns a boolean result. If the condition's result is true, the value of the CASE expression is the result that follows the condition, and the remainder of the CASE expression is not processed.

It is possible that Postgres does do some sort of optimization of subexpressions by evaluating them once. However, the statement: "the remainder of the CASE expression is not processed" is pretty strong and suggests that each clause will only be processed after the previous ones have evaluated to false (or NULL).

Regardless of whether the optimizer figures out that a function can be called only once, the lateral join guarantees that it will be evaluated once, so that seems like the safer solution for an expensive operation.
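Rather than guessing, one way to settle this on real data is to time each variant with EXPLAIN (ANALYZE, BUFFERS). The sketch below assumes the tables from the question; the index name is hypothetical, and the index itself only helps if nothing on sweet_id exists yet. Note that EXPLAIN ANALYZE actually executes the query.

```sql
-- Hypothetical index name; skip this if sweet_id is already indexed.
CREATE INDEX IF NOT EXISTS tasty_goofy_sweets_sweet_id_idx
    ON tasty_goofy_sweets (sweet_id);

-- Refresh planner statistics so the row estimates are meaningful.
ANALYZE sweets;
ANALYZE tasty_goofy_sweets;

-- Run once per variant (here: the original GROUP BY form) and compare
-- the reported execution times and buffer counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT s.id, s.name AS name,
    CASE WHEN (ARRAY_AGG(tgs.status) @> '{Hard}') THEN 'Hard'
         WHEN (ARRAY_AGG(tgs.status) @> '{Soft}') THEN 'Soft'
         WHEN (ARRAY_AGG(tgs.status) @> '{Fluid}') THEN 'Fluid'
         WHEN (ARRAY_AGG(tgs.status) @> '{Gummy}') THEN 'Gummy'
         WHEN (ARRAY_AGG(tgs.status) @> '{Expired}') THEN 'Expired'
    END AS status,
    COUNT(*) OVER()
FROM sweets AS s
INNER JOIN tasty_goofy_sweets AS tgs ON tgs.sweet_id = s.id
GROUP BY s.id;
```

Running each variant a few times and ignoring the first (cold-cache) run tends to give a fairer comparison.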

answered Jun 15, 2020 at 13:21

Gordon Linoff


3 Comments

Surya Over a year ago

The query with LEFT LATERAL seems to be a bit slower and affect more rows 🤔 Reference: explain.depesz.com/s/xVta The query without LEFT LATERAL explain.depesz.com/s/PZkX Perhaps it varies depending on data. 🤔

Gordon Linoff Over a year ago

@LunaLovegood . . . Do you have an index on tgs.sweet_id?

Gordon Linoff Over a year ago

@LunaLovegood . . . Duh. The equivalent query would use an inner join, not a left join. Either use left join for both or use inner join for both. Also, remove the group by when using a lateral join. It is unnecessary -- but Postgres probably does not optimize it away.
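A sketch of what a lateral version comparable to the inner-join query might look like, with the GROUP BY removed. One detail that is easy to miss: an aggregate subquery without GROUP BY always returns exactly one row, so the lateral subquery yields a NULL array for sweets with no matching rows, whichever join type is used. The IS NOT NULL filter below is therefore what mimics the inner join; the alias t is only illustrative.

```sql
SELECT s.id, s.name AS name,
    CASE WHEN (t.arr_status @> '{Hard}') THEN 'Hard'
         WHEN (t.arr_status @> '{Soft}') THEN 'Soft'
         WHEN (t.arr_status @> '{Fluid}') THEN 'Fluid'
         WHEN (t.arr_status @> '{Gummy}') THEN 'Gummy'
         WHEN (t.arr_status @> '{Expired}') THEN 'Expired'
    END AS status,
    COUNT(*) OVER()
FROM sweets AS s
CROSS JOIN LATERAL (
    -- Evaluated once per sweet; returns a single row even when nothing matches.
    SELECT ARRAY_AGG(tgs.status) AS arr_status
    FROM tasty_goofy_sweets AS tgs
    WHERE tgs.sweet_id = s.id
) AS t
-- Drop sweets without any tasty_goofy_sweets rows, as the inner join would.
WHERE t.arr_status IS NOT NULL;
```

Dropping the WHERE clause gives the left-join behaviour instead, where sweets without related rows are kept with a NULL status.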
