I have the following table structure:
create table transfers
(
    id    serial       not null
        constraint transactions_pkey
            primary key,
    name  varchar(255) not null,
    money integer      not null
);
create index transfers_name_index
    on transfers (name);
The following query is quite slow because it does a sequential scan:
EXPLAIN ANALYZE SELECT name
FROM transfers
GROUP BY name
ORDER BY name ASC;
Group  (cost=37860.49..41388.54 rows=14802 width=15) (actual time=4285.530..7459.872 rows=999766 loops=1)
  Group Key: name
  ->  Gather Merge  (cost=37860.49..41314.53 rows=29604 width=15) (actual time=4285.529..7136.432 rows=999935 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Sort  (cost=36860.46..36897.47 rows=14802 width=15) (actual time=4104.159..5107.148 rows=333312 loops=3)
              Sort Key: name
              Sort Method: external merge  Disk: 14928kB
              Worker 0:  Sort Method: external merge  Disk: 13616kB
              Worker 1:  Sort Method: external merge  Disk: 13656kB
              ->  Partial HashAggregate  (cost=35687.15..35835.17 rows=14802 width=15) (actual time=604.984..689.111 rows=333312 loops=3)
                    Group Key: name
                    ->  Parallel Seq Scan on transfers  (cost=0.00..32571.52 rows=1246252 width=15) (actual time=0.063..200.548 rows=997032 loops=3)
Planning Time: 0.088 ms
Execution Time: 7531.142 ms
However, with enable_seqscan set to off, the index-only scan is used, as I would expect:
SET enable_seqscan = OFF;
EXPLAIN ANALYZE SELECT name
FROM transfers
GROUP BY name
ORDER BY name ASC;
Group  (cost=1000.45..100492.67 rows=14802 width=15) (actual time=8.032..2212.538 rows=999766 loops=1)
  Group Key: name
  ->  Gather Merge  (cost=1000.45..100418.66 rows=29604 width=15) (actual time=8.029..1880.388 rows=999778 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Group  (cost=0.43..96001.60 rows=14802 width=15) (actual time=0.074..383.471 rows=333259 loops=3)
              Group Key: name
              ->  Parallel Index Only Scan using transfers_name_index on transfers  (cost=0.43..92885.97 rows=1246252 width=15) (actual time=0.066..189.436 rows=997032 loops=3)
                    Heap Fetches: 0
Planning Time: 0.197 ms
Execution Time: 2279.321 ms
Why does Postgres not use the more efficient index-only scan without being forced to? The table contains about 3 million records. I am using PostgreSQL 11.2.
random_page_cost should be 2. Note that it can be set at runtime, so just before your query, execute set random_page_cost to 2;
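
As a rough sketch of how that suggestion could be tried, the statements below change the setting for the current session only and then re-run the query from the question. The plans in the question show why the planner picked the sequential scan: it estimated the index-only plan at a total cost of about 100492 versus about 41388 for the sequential plan, even though the index-only plan ran faster. random_page_cost (built-in default 4.0) is one of the inputs to that estimate. Whether a value of 2 is enough to flip the plan on a given system is an assumption to verify with EXPLAIN, not a guarantee.

```sql
-- Check the current value (PostgreSQL's built-in default is 4.0).
SHOW random_page_cost;

-- Lower it for the current session only; 2 is the value suggested above.
SET random_page_cost TO 2;

-- Re-run the query from the question and compare the chosen plan
-- and its estimated cost with the earlier output.
EXPLAIN ANALYZE
SELECT name
FROM transfers
GROUP BY name
ORDER BY name ASC;

-- Return to the configured value when finished experimenting.
RESET random_page_cost;
```

If the lower value turns out to help, it can also be scoped to a single transaction with SET LOCAL, which avoids affecting other queries in the same session.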