There are two large table num,pre in database :
\d num
Table "public.num"
Column | Type | Collation | Nullable | Default
----------+-------------------------+-----------+----------+---------------------------------
id | integer | | not null | nextval('num_id_seq'::regclass)
adsh | character varying(20) | | |
tag | character varying(256) | | |
version | character varying(20) | | |
coreg | character varying(256) | | |
ddate | date | | |
qtrs | numeric(8,0) | | |
uom | character varying(20) | | |
value | numeric(28,4) | | |
footnote | character varying(1024) | | |
\d pre
Table "public.pre"
Column | Type | Collation | Nullable | Default
----------+------------------------+-----------+----------+---------------------------------
id | integer | | not null | nextval('pre_id_seq'::regclass)
adsh | character varying(20) | | |
report | numeric(6,0) | | |
line | numeric(6,0) | | |
stmt | character varying(2) | | |
inpth | boolean | | |
rfile | character(1) | | |
tag | character varying(256) | | |
version | character varying(20) | | |
plabel | character varying(512) | | |
negating | boolean | | |
Check how many records in the tables:
select count(*) from num;
count
----------
83862587
(1 row)
Time: 204945.436 ms (03:24.945)
select count(*) from pre;
count
----------
36738034
(1 row)
Time: 100604.085 ms (01:40.604)
Execute a long query :
explain analyze select tag,uom,qtrs,value,ddate from num
where adsh='0000320193-22-000108' and tag in
(select tag from pre where stmt='IS' and
adsh='0000320193-22-000108') and ddate='2022-09-30';
It cost almost 7 minutes 30 seconds.
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=2000.00..2909871.29 rows=2 width=59) (actual time=357717.922..450523.035 rows=45 loops=1)
Join Filter: ((num.tag)::text = (pre.tag)::text)
Rows Removed by Join Filter: 61320
-> Gather (cost=1000.00..1984125.01 rows=32 width=59) (actual time=190.355..92987.731 rows=678 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on num (cost=0.00..1983121.81 rows=13 width=59) (actual time=348.753..331304.671 rows=226 loops=3)
Filter: (((adsh)::text = '0000320193-22-000108'::text) AND (ddate = '2022-09-30'::date))
Rows Removed by Filter: 27953970
-> Materialize (cost=1000.00..925725.74 rows=43 width=33) (actual time=0.097..527.331 rows=91 loops=678)
-> Gather (cost=1000.00..925725.53 rows=43 width=33) (actual time=65.880..357527.133 rows=96 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on pre (cost=0.00..924721.22 rows=18 width=33) (actual time=201.713..357490.037 rows=32 loops=3)
Filter: (((adsh)::text = '0000320193-22-000108'::text) AND ((stmt)::text = 'IS'::text))
Rows Removed by Filter: 12245979
Planning Time: 0.632 ms
JIT:
Functions: 27
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 5.870 ms, Inlining 272.489 ms, Optimization 367.828 ms, Emission 213.288 ms, Total 859.474 ms
Execution Time: 450524.974 ms
(22 rows)
Time: 450526.084 ms (07:30.526)
How can optimize the database to reduce time on query?Add index and close something (the db running on my local pc without any other users)?


