I have a PostgreSQL (version 9.4) performance puzzle. I have a function (prevd) declared as STABLE (see below). When I call this function on a constant in a WHERE clause, it is called multiple times instead of once.
If I understand the PostgreSQL documentation correctly, the query should be optimized to call prevd only once:
"A STABLE function cannot modify the database and is guaranteed to return the same results given the same arguments for all rows within a single statement."
Why doesn't it optimize the calls to prevd in this case?
I'm not expecting prevd to be called only once across all subsequent queries that use the same argument (as if it were IMMUTABLE). I'm expecting PostgreSQL to create a plan for my query with just one call to prevd('2015-12-12').
Please find the code below:
Schema
create table somedata(d date, number double precision);
create table dates(d date);
insert into dates
select generate_series::date
from generate_series('2015-01-01'::date, '2015-12-31'::date, '1 day');
insert into somedata
select '2015-01-01'::date + (random() * 365 + 1)::integer, random()
from generate_series(1, 100000);
create or replace function prevd(date_ date)
returns date
language sql
stable
as $$
select max(d) from dates where d < date_;
$$;
Slow Query
select avg(number) from somedata where d=prevd('2015-12-12');
Poor query plan of the query above
Aggregate (cost=28092.74..28092.75 rows=1 width=8) (actual time=3532.638..3532.638 rows=1 loops=1)
Output: avg(number)
-> Seq Scan on public.somedata (cost=0.00..28091.43 rows=525 width=8) (actual time=10.210..3532.576 rows=282 loops=1)
Output: d, number
Filter: (somedata.d = prevd('2015-12-12'::date))
Rows Removed by Filter: 99718
Planning time: 1.144 ms
Execution time: 3532.688 ms
(8 rows)
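The Filter line in the plan suggests that prevd is evaluated for every scanned row. A rough way to check this is to count the calls with a traced copy of the function. The sketch below assumes a hypothetical plpgsql variant named prevd_traced (it is not part of the original schema) that emits a NOTICE on each call; the limit keeps the output short.

```sql
-- Hypothetical traced copy of prevd: same result, plus a NOTICE per call.
create or replace function prevd_traced(date_ date)
returns date
language plpgsql
stable
as $$
begin
  raise notice 'prevd_traced(%) called', date_;
  return (select max(d) from dates where d < date_);
end;
$$;

-- Only 5 rows reach the filter, so only a handful of NOTICEs should
-- appear if the function is called once per row.
select avg(number)
from (select * from somedata limit 5) s
where d = prevd_traced('2015-12-12');
```

If the call were folded into a single evaluation, only one NOTICE would be expected at execution time; the planner may also call a STABLE function once while estimating selectivity, so the exact count can be off by one.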
Performance
On my machine the query above takes around 3.5 s. After changing prevd to IMMUTABLE, it takes about 0.035 s.
Workaround
select avg(number) from somedata where d=(select prevd(date '2015-12-12'));
It seems that the subquery forces the optimizer to "cache" or "materialize" the function result in memory, and the query is fast. This is consistent with the remark that "query re-writing will happen for immutable but not for merely stable functions".
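To see how the two forms differ, their plans can be compared side by side. This is only a sketch against the schema above; my assumption is that the scalar-subquery form is planned as an InitPlan that is evaluated once, with the filter comparing d against its result, but the exact plan text depends on the server version.

```sql
-- Original form: the function call appears directly in the filter.
explain (analyze, verbose)
select avg(number) from somedata where d = prevd('2015-12-12');

-- Scalar-subquery form: the call is wrapped in an uncorrelated subquery.
explain (analyze, verbose)
select avg(number) from somedata where d = (select prevd(date '2015-12-12'));
```

Comparing the Filter lines and the execution times of the two plans should show whether the call is repeated per row or evaluated once.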