I have a query like this:

EXPLAIN ANALYZE
SELECT user_id, project_id, office_id, SUM(duration) AS tDuration
  FROM users
 WHERE date(start_datetime at TIME ZONE 'UTC') = '2020-05-01'
 GROUP BY project_id, user_id, office_id;

and I have created an index on the table like this:

CREATE INDEX i1_users on users (date(start_datetime at TIME ZONE 'UTC'), project_id, user_id, office_id) include (duration);

But it's not doing an index scan, even though all the data needed is present in the index itself.

The EXPLAIN result is as follows:

GroupAggregate  (cost=7.80..7.82 rows=1 width=36) (actual time=5.672..11.735 rows=298 loops=1)
  Group Key: project_id, user_id, office_id
  ->  Sort  (cost=7.80..7.80 rows=1 width=32) (actual time=5.632..7.527 rows=298 loops=1)
        Sort Key: project_id, user_id, office_id
        Sort Method: quicksort  Memory: 48kB
        ->  Index Scan using i2_users on users  (cost=0.56..7.79 rows=1 width=32) (actual time=0.034..2.616 rows=298 loops=1)
              Index Cond: (date(timezone('UTC'::text, start_datetime)) = '2020-05-01'::date)
Planning Time: 2.070 ms
Execution Time: 13.991 ms

I have tried VACUUM ANALYZE users as well, but no luck. And when there is a large amount of data in the table, it does a sequential scan and a sort. Since the index already holds sorted data, why not just use that?

  • The plan uses i2_users as the index. Can you provide the definition of i2_users? Commented Jul 12, 2020 at 7:06
  • @rinz1er It's an index on date(start_datetime at TIME ZONE 'UTC'). Commented Jul 12, 2020 at 8:15
  • "but its not doing index scan" - yes it does: "Index Scan using i2_users". If you have queries that do a Seq Scan instead, then please edit your question and add the execution plan that shows the Seq Scan, not the one that does the Index Scan. Commented Jul 12, 2020 at 14:57

2 Answers

You are comparing a date, date(start_datetime at TIME ZONE 'UTC'), with a string literal, '2020-05-01', which might interfere with index usage. Making the right-hand side an explicit date might help:

SELECT user_id, project_id, office_id, SUM(duration) AS tDuration
 FROM users
WHERE date(start_datetime at TIME ZONE 'UTC') = TO_DATE('2020-05-01','YYYY-MM-DD')
 GROUP BY project_id, user_id, office_id;

(The time zone might have to be accounted for in the to_date value.)
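
As a small sketch of the same idea, the right-hand side can also be written as a typed date literal or with a cast, which avoids the format string. Note that the plan in the question already shows the literal as '2020-05-01'::date, so this may well leave the plan unchanged; it mainly makes the intended type explicit.

```sql
-- Typed literal: the right-hand side is of type date from the start.
SELECT user_id, project_id, office_id, SUM(duration) AS tDuration
  FROM users
 WHERE date(start_datetime AT TIME ZONE 'UTC') = DATE '2020-05-01'
 GROUP BY project_id, user_id, office_id;

-- The same predicate written with a cast instead of a typed literal:
--   WHERE date(start_datetime AT TIME ZONE 'UTC') = '2020-05-01'::date
```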

But do you really need the time zone conversion? If the column only stores the date, use it directly (this avoids the function-based index and allows better statistics and optimisation):

CREATE INDEX i1_users on users (start_datetime, project_id, user_id, office_id) include (duration);
SELECT user_id, project_id, office_id, SUM(duration) AS tDuration
 FROM users WHERE start_datetime = to_date('2020-05-01','YYYY-MM-DD')
 GROUP BY project_id, user_id, office_id;

If start_datetime contains a true timestamp:

CREATE INDEX i1_users on users (date_trunc('day',start_datetime), project_id, user_id, office_id) include (duration);

SELECT user_id, project_id, office_id, SUM(duration) AS tDuration
 FROM users WHERE date_trunc('day',start_datetime) = to_date('2020-05-01','YYYY-MM-DD')
 GROUP BY project_id, user_id, office_id;
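
Another way to avoid the function-based index, sketched here under assumptions, is a half-open range on the raw column. This assumes start_datetime is of type timestamptz and that "the day" means the UTC day, as in the question; the index name i3_users is hypothetical. A plain index on start_datetime can serve the range, but the rows do not come out in GROUP BY order, so a sort or hash aggregate is still to be expected.

```sql
-- Hypothetical index: plain column first, the remaining columns as payload.
CREATE INDEX i3_users ON users (start_datetime)
    INCLUDE (project_id, user_id, office_id, duration);

-- Half-open range covering 2020-05-01 in UTC
-- (assumes start_datetime is timestamptz).
SELECT user_id, project_id, office_id, SUM(duration) AS tDuration
  FROM users
 WHERE start_datetime >= TIMESTAMPTZ '2020-05-01 00:00:00+00'
   AND start_datetime <  TIMESTAMPTZ '2020-05-02 00:00:00+00'
 GROUP BY project_id, user_id, office_id;
```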

1 Comment

It was on users; that was a typo in the question.

The part of the planner that detects whether an index-only scan (IOS) is possible is a bit underwhelming here. It makes a list of all the columns it thinks it needs and checks that those are available in the index, and it includes start_datetime in that list. That part of the code doesn't understand that the presence of date(start_datetime at TIME ZONE 'UTC') obviates the need for start_datetime itself.

You can "fix" this by adding start_datetime itself to the index, but of course at the cost of enlarging the index:

CREATE INDEX on users (date(start_datetime at TIME ZONE 'UTC'), project_id, user_id, office_id)
    include (duration, start_datetime);
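
To check whether an index like the one above actually yields an index-only scan, something along these lines can be run. This is a sketch, not a guaranteed outcome: the planner may still prefer another plan, and an index-only scan only pays off when the visibility map is reasonably current, which is what the VACUUM is for.

```sql
-- Refresh statistics and the visibility map so heap fetches can be skipped.
VACUUM (ANALYZE) users;

-- Look for "Index Only Scan" and a low "Heap Fetches" count in the output.
EXPLAIN (ANALYZE, BUFFERS)
SELECT user_id, project_id, office_id, SUM(duration) AS tDuration
  FROM users
 WHERE date(start_datetime AT TIME ZONE 'UTC') = DATE '2020-05-01'
 GROUP BY project_id, user_id, office_id;
```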
