I have the following partitioned table:
    Column     |            Type             |       Modifiers        | Storage | Stats target | Description
---------------+-----------------------------+------------------------+---------+--------------+-------------
 time          | timestamp without time zone | not null               | plain   |              |
 connection_id | integer                     | not null               | plain   |              |
 is_authorized | boolean                     | not null default false | plain   |              |
 is_active     | boolean                     | not null default true  | plain   |              |
Indexes:
"active_connection_time_idx" btree ("time")
Child tables: metrics.active_connection_2022_02_26t00,
metrics.active_connection_2022_02_27t00,
metrics.active_connection_2022_02_28t00,
metrics.active_connection_2022_03_01t00,
metrics.active_connection_2022_03_02t00,
metrics.active_connection_2022_04_21t00
Every partition has an index on the time column.
I need to execute the following query:
SELECT c.connection_id,
       (array_agg(is_authorized ORDER BY time DESC))[1],
       bool_or(is_active)
FROM metrics.active_connection c
WHERE c.time BETWEEN '2022-01-26 00:00:00' AND '2022-04-15 23:59:59'
GROUP BY c.connection_id;
It produces this plan (quick sequential scans, but a slow external sort on disk):
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=1878772.55..1999873.62 rows=200 width=6) (actual time=11516.621..22951.961 rows=30631 loops=1)
Group Key: c.connection_id
-> Sort (cost=1878772.55..1909047.19 rows=12109857 width=14) (actual time=11388.096..15601.938 rows=12109856 loops=1)
Sort Key: c.connection_id
Sort Method: external merge Disk: 319520kB
-> Append (cost=0.00..247108.84 rows=12109857 width=14) (actual time=0.022..5346.587 rows=12109856 loops=1)
-> Seq Scan on active_connection c (cost=0.00..0.00 rows=1 width=14) (actual time=0.004..0.004 rows=0 loops=1)
Filter: (("time" >= '2022-01-26 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-15 23:59:59'::timestamp without time zone))
-> Seq Scan on active_connection_2022_02_26t00 c_1 (cost=0.00..21728.74 rows=1064849 width=14) (actual time=0.017..307.754 rows=1064849 loops=1)
Filter: (("time" >= '2022-01-26 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-15 23:59:59'::timestamp without time zone))
......
-> Seq Scan on active_connection_2022_03_02t00 c_5 (cost=0.00..20964.04 rows=1027336 width=14) (actual time=0.018..268.314 rows=1027336 loops=1)
Filter: (("time" >= '2022-01-26 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-15 23:59:59'::timestamp without time zone))
If I add an index on the connection_id column, I get a different plan (slow index scans and a quick in-memory sort):
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=2.23..1071044.89 rows=200 width=6) (actual time=203.337..49643.802 rows=30631 loops=1)
Group Key: c.connection_id
-> Merge Append (cost=2.23..980218.46 rows=12109857 width=14) (actual time=184.137..38926.435 rows=12109856 loops=1)
Sort Key: c.connection_id
-> Sort (cost=0.01..0.02 rows=1 width=14) (actual time=0.036..0.037 rows=0 loops=1)
Sort Key: c.connection_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on active_connection c (cost=0.00..0.00 rows=1 width=14) (actual time=0.004..0.004 rows=0 loops=1)
Filter: (("time" >= '2022-01-26 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-15 23:59:59'::timestamp without time zone))
-> Index Scan using active_connection_2022_02_26t00_conn_id on active_connection_2022_02_26t00 c_1 (cost=0.43..56013.08 rows=1064849 width=14) (actual time=6.386..1729.893 rows=1064849 loops=1)
Filter: (("time" >= '2022-01-26 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-15 23:59:59'::timestamp without time zone))
....
-> Index Scan using active_connection_2022_03_02t00_conn_id on active_connection_2022_03_02t00 c_5 (cost=0.42..54039.14 rows=1027336 width=14) (actual time=0.062..2142.939 rows=1027336 loops=1)
Filter: (("time" >= '2022-01-26 00:00:00'::timestamp without time zone) AND ("time" <= '2022-04-15 23:59:59'::timestamp without time zone))
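For reference, the per-partition index used in this plan can be created with statements along these lines. This is a sketch rather than the exact DDL: the index and table names are taken from the plan above, while the default btree access method and the use of one statement per child table are assumptions.

```sql
-- Sketch: one index per child table (names as they appear in the plan above).
-- Assumes a plain btree index on connection_id alone.
CREATE INDEX active_connection_2022_02_26t00_conn_id
    ON metrics.active_connection_2022_02_26t00 (connection_id);

-- Repeated for each remaining partition, for example:
CREATE INDEX active_connection_2022_03_02t00_conn_id
    ON metrics.active_connection_2022_03_02t00 (connection_id);
```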
Is it possible to somehow get both the quick sort and the quick sequential scans?