I have a table:
create table accounts_service.operation_history (
history_id bigint generated always as identity primary key,
operation_id varchar(36) not null unique,
operation_type varchar(30) not null,
operation_time timestamptz default now() not null,
from_phone varchar(20),
user_id varchar(21),
-- and a lot of other varchar(x), text and even a couple of numeric, boolean, jsonb, timestamp columns
);
create index operation_history_user_id_operation_time_idx
on accounts_service.operation_history (user_id, operation_time);
create index operation_history_operation_time_idx
on accounts_service.operation_history (operation_time);
I want to run a simple SELECT with a WHERE filter on operation_time (this filter is required and typically spans a day or two) plus additional filters on other columns, most commonly ones of varchar(x) type.
But my queries are slow:
explain (buffers, analyze)
select *
from operation_history operationh0_
where (null is null or operationh0_.user_id = null)
and operationh0_.operation_time >= '2024-09-30 20:00:00.000000 +00:00'
and operationh0_.operation_time <= '2024-10-02 20:00:00.000000 +00:00'
and (operationh0_.from_phone = '+000111223344')
order by operationh0_.operation_time asc, operationh0_.history_id asc
limit 25;
Limit  (cost=8063.39..178328.00 rows=25 width=1267) (actual time=174373.106..174374.395 rows=0 loops=1)
  Buffers: shared hit=532597 read=1433916
  I/O Timings: read=517880.241
  ->  Incremental Sort  (cost=8063.39..198759.76 rows=28 width=1267) (actual time=174373.105..174374.394 rows=0 loops=1)
        Sort Key: operation_time, history_id
        Presorted Key: operation_time
        Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
        Buffers: shared hit=532597 read=1433916
        I/O Timings: read=517880.241
        ->  Gather Merge  (cost=1000.60..198758.50 rows=28 width=1267) (actual time=174373.099..174374.388 rows=0 loops=1)
              Workers Planned: 2
              Workers Launched: 2
              Buffers: shared hit=532597 read=1433916
              I/O Timings: read=517880.241
              ->  Parallel Index Scan using operation_history_operation_time_idx on operation_history operationh0_  (cost=0.57..197755.24 rows=12 width=1267) (actual time=174362.932..174362.933 rows=0 loops=3)
                    Index Cond: ((operation_time >= '2024-09-30 20:00:00+00'::timestamp with time zone) AND (operation_time <= '2024-10-02 20:00:00+00'::timestamp with time zone))
                    Filter: ((from_phone)::text = '+000111223344'::text)
                    Rows Removed by Filter: 723711
                    Buffers: shared hit=532597 read=1433916
                    I/O Timings: read=517880.241
Planning Time: 0.193 ms
Execution Time: 174374.449 ms
For simplicity:
set max_parallel_workers_per_gather = 0;
This only simplifies the plan; the numbers stay representative. Retrying the previous query:
Limit  (cost=7535.40..189179.35 rows=25 width=1267) (actual time=261432.728..261432.729 rows=0 loops=1)
  Buffers: shared hit=374346 read=1591362
  I/O Timings: read=257253.065
  ->  Incremental Sort  (cost=7535.40..210976.63 rows=28 width=1267) (actual time=261432.727..261432.727 rows=0 loops=1)
        Sort Key: operation_time, history_id
        Presorted Key: operation_time
        Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
        Buffers: shared hit=374346 read=1591362
        I/O Timings: read=257253.065
        ->  Index Scan using operation_history_operation_time_idx on operation_history operationh0_  (cost=0.57..210975.37 rows=28 width=1267) (actual time=261432.720..261432.720 rows=0 loops=1)
              Index Cond: ((operation_time >= '2024-09-30 20:00:00+00'::timestamp with time zone) AND (operation_time <= '2024-10-02 20:00:00+00'::timestamp with time zone))
              Filter: ((from_phone)::text = '+000111223344'::text)
              Rows Removed by Filter: 2171134
              Buffers: shared hit=374346 read=1591362
              I/O Timings: read=257253.065
Planning Time: 0.170 ms
Execution Time: 261432.774 ms
So it filtered out just 2,171,134 rows, yet it took more than 4 minutes. That seems too long, doesn't it?
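Back-of-envelope from that plan (my own rough math): read=1591362 buffers × 8 kB ≈ 13 GB pulled from disk, and the I/O Timings line shows ≈ 257 s of read time, i.e. roughly 50 MB/s effective throughput. So almost all of the time is spent fetching scattered heap pages just to evaluate the from_phone filter.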
- I tried selecting specific columns (e.g. operation_time, from_phone, to_phone, history_id); it had no effect.
- I tried vacuum analyze; it had no effect.
- I checked some Postgres parameters such as shared_buffers and work_mem (roughly the commands sketched after this list); changing them had no effect.
- I compared the settings with pgTune and they look ok.
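Roughly what those attempts looked like (just a sketch for reference; the actual parameter values I tried are omitted):
vacuum analyze accounts_service.operation_history;
show shared_buffers;
show work_mem;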
Some additional info:
SELECT relpages, pg_size_pretty(pg_total_relation_size(oid)) AS table_size
FROM pg_class
WHERE relname = 'operation_history';
| relpages | table_size |
|---|---|
| 18402644 | 210 GB |
select count(*) from operation_history;
| count |
|---|
| 352402877 |
Server drives: AWS gp3
I don't want to create indexes for all columns because there are massive writes to this table...
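For illustration, the kind of extra per-filter-column index I'm reluctant to add would look roughly like this (the index name is made up; from_phone is just one of many filter columns, so I'd need one of these per column):
-- hypothetical index, not created on the real table
create index concurrently operation_history_from_phone_operation_time_idx
    on accounts_service.operation_history (from_phone, operation_time);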
Is there any way to optimize it?
Or is it simply doing a lot of reads from the index and the table, so this is expected and we need to look at sharding, etc.?
UPD: I checked index bloat with this query:
| idxname | real_size | extra_size | extra_pct | fillfactor | bloat_size | bloat_pct | is_na |
|---|---|---|---|---|---|---|---|
| operation_history_operation_time_idx | 7839301632 | 746373120 | 9.520913405772113 | 90 | 0 | -0.6147682314362566 | false |
I checked table bloat with this query:
| tblname | real_size | extra_size | extra_pct | fillfactor | bloat_size | bloat_pct | is_na |
|---|---|---|---|---|---|---|---|
| operation_history | 150754459648 | 7987224576 | 5.298168024116535 | 100 | 7987224576 | 5.298168024116535 | false |
And it seems to be ok.
where (null is null or operationh0_.user_id = null) looks quite incorrect and I don't get what this condition should do. Did you rather want operationh0_.user_id IS NULL?

(null is null or x) looks like an optional condition which wasn't used in the app the request originated from. It's ugly, but it doesn't matter, because PostgreSQL optimises that away. Production env logs are typically full of queries with a bunch of 1=1, x = coalesce(null, x) and all sorts of and true resulting from params being left unused and defaulting to something neutral and reducible. operationh0_.user_id = null is logical nonsense nonetheless.
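A minimal sketch of that claim, using a throwaway temp table rather than the real schema (the plan text below is approximate):
-- the planner folds (null is null or user_id = null) into constant TRUE,
-- so it never shows up in the Filter line
create temp table demo_history (user_id varchar(21), operation_time timestamptz);
explain select * from demo_history
where (null is null or user_id = null)
  and operation_time >= '2024-09-30 20:00:00+00';
-- Seq Scan on demo_history
--   Filter: (operation_time >= '2024-09-30 20:00:00+00'::timestamp with time zone)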