
I'm trying to optimize a query from a table with 3M rows.

The columns are value, datetime and point_id.

SELECT DATE(datetime), MAX(value) FROM historical_points WHERE point_id=1 GROUP BY DATE(datetime);

This query takes 2 seconds.

I tried adding an index on point_id, but the results were not much better.

Is it possible to index the MAX query or is there a better way to do it? Maybe with an INNER JOIN?

EDIT: This is the EXPLAIN ANALYZE output of a similar query that restricts the date range. It also has a performance problem.

EXPLAIN ANALYZE SELECT DATE(datetime), MAX(value), MIN(value) FROM buildings_hispoint WHERE point_id=64 AND datetime BETWEEN '2017-09-01 00:00:00' AND '2017-10-01 00:00:00' GROUP BY DATE(datetime);
>GroupAggregate  (cost=84766.65..92710.99 rows=336803 width=68) (actual time=1461.060..2701.145 rows=21 loops=1)
>  Group Key: (date(datetime))
>  ->  Sort  (cost=84766.65..85700.23 rows=373430 width=14) (actual time=1408.445..1547.929 rows=523621 loops=1)
>        Sort Key: (date(datetime))
>        Sort Method: external sort  Disk: 11944kB
>        ->  Bitmap Heap Scan on buildings_hispoint  (cost=10476.02..43820.81 rows=373430 width=14) (actual time=148.970..731.154 rows=523621 loops=1)
>              Recheck Cond: (point_id = 64)
>              Filter: ((datetime >= '2017-09-01 00:00:00+02'::timestamp with time zone) AND (datetime <= '2017-10-01 00:00:00+02'::timestamp with time zone))
>              Rows Removed by Filter: 35712
>              Heap Blocks: exact=14422
>              ->  Bitmap Index Scan on buildings_measurementdatapoint_ffb10c68  (cost=0.00..10382.67 rows=561898 width=0) (actual time=125.150..125.150 rows=559333 loops=1)
>                    Index Cond: (point_id = 64)
>Planning time: 0.284 ms
>Execution time: 2704.566 ms
  • Do you have an index on datetime? Commented Sep 27, 2017 at 15:33
  • It might sound strange, but try a single covering index on (point_id, datetime, value). Commented Sep 27, 2017 at 16:08
  • We need to see your table and index definitions. Commented Sep 27, 2017 at 19:23
  • Provide your current EXPLAIN ANALYZE output. Commented Sep 27, 2017 at 20:54
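
A minimal sketch of the single covering index suggested in the comments above, using the table and column names from the question (the index name is made up for illustration). Whether the planner actually switches to an index-only scan depends on the table's visibility map, so it is worth checking with EXPLAIN ANALYZE after a VACUUM:

```sql
-- Hypothetical index name; the columns are those from the question.
CREATE INDEX buildings_hispoint_point_dt_value_idx
    ON buildings_hispoint (point_id, datetime, value);

-- Refresh the visibility map and statistics so an index-only scan is possible.
VACUUM ANALYZE buildings_hispoint;

-- Re-run the query from the question and compare the plan.
EXPLAIN ANALYZE
SELECT DATE(datetime), MAX(value), MIN(value)
FROM buildings_hispoint
WHERE point_id = 64
  AND datetime BETWEEN '2017-09-01 00:00:00' AND '2017-10-01 00:00:00'
GROUP BY DATE(datetime);
```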

2 Answers


Without seeing the EXPLAIN output it is difficult to say anything definite. My guess is that you need to include the DATE() call in the index definition:

CREATE INDEX historical_points_idx ON historical_points (DATE(datetime), point_id);

Also, if point_id has more distinct values than DATE(datetime), reverse the column order:

CREATE INDEX historical_points_idx ON historical_points (point_id, DATE(datetime));

Keep in mind that the cardinality of the columns is very important to the planner; columns with high selectivity should generally go first.
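
One rough way to compare the two cardinalities before choosing the column order, using the table from the question (the column aliases are only illustrative). It scans the whole table, so expect it to take a moment on 3M rows:

```sql
-- Count how many distinct values each candidate leading column has.
SELECT COUNT(DISTINCT point_id)       AS distinct_points,
       COUNT(DISTINCT DATE(datetime)) AS distinct_days
FROM historical_points;
```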

1 Comment

I tried with: CREATE INDEX historical_points_idx ON historical_points (DATE(datetime AT TIME ZONE 'UTC'), point_id); but it didn't make any difference
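
A possible reason the index in the comment above made no difference: PostgreSQL only considers an expression index when the query uses the same expression, and the original query groups by DATE(datetime), not DATE(datetime AT TIME ZONE 'UTC'). Below is a sketch of the query rewritten to match, assuming that grouping by UTC day is acceptable (rows near midnight can move to a different day if the session time zone is not UTC). Even then the planner may prefer another plan, so compare with EXPLAIN ANALYZE:

```sql
-- Uses exactly the expression from the index definition in the comment.
SELECT DATE(datetime AT TIME ZONE 'UTC') AS day, MAX(value)
FROM historical_points
WHERE point_id = 1
GROUP BY DATE(datetime AT TIME ZONE 'UTC');
```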
SELECT DISTINCT ON (DATE(datetime)) DATE(datetime), value 
FROM historical_points WHERE point_id=1
ORDER BY DATE(datetime) DESC, value DESC;

Put a computed (expression) index on (DATE(datetime), value). [I hope those aren't your real column names. Using reserved words like VALUE as a column name is a recipe for confusion.]

The SELECT DISTINCT ON works like a GROUP BY. The ORDER BY replaces the MAX, and will be fast if indexed.

I owe this technique to @ErwinBrandstetter.
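
A sketch of an index that could support the DISTINCT ON query above. It assumes datetime is timestamp with time zone, as the EXPLAIN output in the question suggests; in that case DATE(datetime) is not immutable and cannot be indexed directly, so an explicit zone is used in both the index and the query ('UTC' here is an assumption, and the index name is hypothetical):

```sql
-- Equality column first, then the ORDER BY expressions in the same direction.
CREATE INDEX historical_points_day_value_idx
    ON historical_points (point_id,
                          (DATE(datetime AT TIME ZONE 'UTC')) DESC,
                          value DESC);

-- The query must repeat the indexed expression exactly.
SELECT DISTINCT ON (DATE(datetime AT TIME ZONE 'UTC'))
       DATE(datetime AT TIME ZONE 'UTC') AS day, value
FROM historical_points
WHERE point_id = 1
ORDER BY DATE(datetime AT TIME ZONE 'UTC') DESC, value DESC;
```

If value can be NULL, note that DESC sorts NULLs first, so this query would return NULL for such a day, whereas MAX ignores NULLs.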

2 Comments

This is a bit faster, but not by much. To be honest, I don't know how far I can push it down. Do you think it is feasible to reduce it much further?
What are your indexes? This should run fast.
