I have a table called tbl_inventory_detail_old as for the following DDL and it currently has 19M records. There are around 500 unique agent ids and around 35 unique product ids.
CREATE TABLE public.tbl_inventory_detail_old (
agent_id int8 NOT NULL,
bucket_id int2 NOT NULL,
product_id int2 NOT NULL,
quantity float8 NOT NULL,
serial varchar(32) NOT NULL,
serial_block_id int8 NULL,
bundle_serial varchar(32) NOT NULL,
transaction_id int8 NULL,
is_serial bool NOT NULL,
created_by varchar(64) NOT NULL,
updated_by varchar(64) NULL,
created_date timestamp NOT NULL,
updated_date timestamp NULL,
is_reserved bool NOT NULL DEFAULT false,
serial_status int4 NULL,
bundle_product_id int2 NULL,
CONSTRAINT pk_tbl_inventory_detail_old PRIMARY KEY (agent_id, bucket_id, product_id, serial, is_reserved, bundle_serial),
CONSTRAINT fk_tbl_inventory_detail_product_old FOREIGN KEY (product_id) REFERENCES public.tbl_product(id),
CONSTRAINT fk_tbl_inventory_detail_serial_block_old FOREIGN KEY (serial_block_id) REFERENCES public.tbl_serial_block(id),
CONSTRAINT fk_tbl_inventory_detail_transaction_old FOREIGN KEY (transaction_id) REFERENCES public.tbl_transaction(id)
);
CREATE INDEX idx_inventory_detail_serial_old ON public.tbl_inventory_detail_old USING btree (serial);
Once I ran the following query on the tbl_inventory_detail_old it took more than 2 seconds to get the result.
explain (ANALYZE, COSTS, VERBOSE, BUFFERS)
select
id1_0.serial
from
tbl_inventory_detail_old id1_0
where
id1_0.agent_id =115
and id1_0.product_id =15
and id1_0.bucket_id =1
and id1_0.is_reserved =false
and cast(id1_0.serial as numeric(38,0)) between cast('15115601000701' as numeric(38,0)) and cast('15115601000702' as numeric(38,0));
Please find the following results.
Gather (cost=1000.56..129510.07 rows=5736 width=13) (actual time=829.972..831.547 rows=2 loops=1)
Output: serial
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=9488 read=34133
I/O Timings: shared/local read=216.310
-> Parallel Index Only Scan using pk_tbl_inventory_detail_old on public.tbl_inventory_detail_old id1_0 (cost=0.56..127936.47 rows=2390 width=13) (actual time=552.611..826.319 rows=1 loops=3)
Output: serial
Index Cond: ((id1_0.agent_id = 115) AND (id1_0.bucket_id = 1) AND (id1_0.product_id = 15) AND (id1_0.is_reserved = false))
Filter: (((id1_0.serial)::numeric(38,0) >= '15115601000701'::numeric(38,0)) AND ((id1_0.serial)::numeric(38,0) <= '15115601000702'::numeric(38,0)))
Rows Removed by Filter: 666432
Heap Fetches: 999307
Buffers: shared hit=9488 read=34133
I/O Timings: shared/local read=216.310
Worker 0: actual time=823.606..823.607 rows=0 loops=1
Buffers: shared hit=2816 read=15830
I/O Timings: shared/local read=81.756
Worker 1: actual time=4.753..825.878 rows=2 loops=1
Buffers: shared hit=3300 read=10044
I/O Timings: shared/local read=61.850
Query Identifier: -16000001
Planning:
Buffers: shared hit=61
Planning Time: 0.256 ms
Execution Time: 831.576 ms
Then I decided to partitioning the table by creating a new table and insert data from the old table using following scripts. I also used sub-partitioning as well.
CREATE TABLE tbl_inventory_detail (
agent_id int8 NOT NULL,
bucket_id int2 NOT NULL,
product_id int2 NOT NULL,
quantity float8 NOT NULL,
serial varchar(32) NOT NULL,
serial_block_id int8 NULL,
bundle_serial varchar(32) NOT NULL,
transaction_id int8 NULL,
is_serial bool NOT NULL,
created_by varchar(64) NOT NULL,
updated_by varchar(64) NULL,
created_date timestamp NOT NULL,
updated_date timestamp NULL,
is_reserved bool NOT NULL DEFAULT false,
serial_status int4 NULL,
bundle_product_id int2 NULL,
CONSTRAINT pk_tbl_inventory_detail PRIMARY KEY (agent_id, product_id, bucket_id, is_reserved, serial, bundle_serial),
CONSTRAINT fk_tbl_inventory_detail_product FOREIGN KEY (product_id) REFERENCES tbl_product(id),
CONSTRAINT fk_tbl_inventory_detail_serial_block FOREIGN KEY (serial_block_id) REFERENCES tbl_serial_block(id),
CONSTRAINT fk_tbl_inventory_detail_transaction FOREIGN KEY (transaction_id) REFERENCES tbl_transaction(id)
) PARTITION BY RANGE (agent_id);
CREATE INDEX idx_inventory_detail_serial ON tbl_inventory_detail USING btree (serial);
CREATE TABLE tbl_inventory_detail_partition_1 PARTITION OF tbl_inventory_detail
FOR VALUES FROM (0) TO (100) PARTITION BY RANGE (product_id);
CREATE TABLE tbl_inventory_detail_partition_2 PARTITION OF tbl_inventory_detail
FOR VALUES FROM (100) TO (200) PARTITION BY RANGE (product_id);
CREATE TABLE tbl_inventory_detail_partition_3 PARTITION OF tbl_inventory_detail
FOR VALUES FROM (200) TO (300) PARTITION BY RANGE (product_id);
CREATE TABLE tbl_inventory_detail_partition_4 PARTITION OF tbl_inventory_detail
FOR VALUES FROM (300) TO (400) PARTITION BY RANGE (product_id);
CREATE TABLE tbl_inventory_detail_partition_5 PARTITION OF tbl_inventory_detail
FOR VALUES FROM (400) TO (500) PARTITION BY RANGE (product_id);
CREATE TABLE tbl_inventory_detail_partition_6 PARTITION OF tbl_inventory_detail
FOR VALUES FROM (500) TO (MAXVALUE) PARTITION BY RANGE (product_id);
CREATE TABLE tbl_inventory_detail_partition_1_1 PARTITION OF tbl_inventory_detail_partition_1 FOR VALUES FROM (0) TO (5);
CREATE TABLE tbl_inventory_detail_partition_1_2 PARTITION OF tbl_inventory_detail_partition_1 FOR VALUES FROM (5) TO (10);
CREATE TABLE tbl_inventory_detail_partition_1_3 PARTITION OF tbl_inventory_detail_partition_1 FOR VALUES FROM (10) TO (15);
CREATE TABLE tbl_inventory_detail_partition_1_4 PARTITION OF tbl_inventory_detail_partition_1 FOR VALUES FROM (15) TO (20);
CREATE TABLE tbl_inventory_detail_partition_1_5 PARTITION OF tbl_inventory_detail_partition_1 FOR VALUES FROM (20) TO (25);
CREATE TABLE tbl_inventory_detail_partition_1_6 PARTITION OF tbl_inventory_detail_partition_1 FOR VALUES FROM (25) TO (MAXVALUE);
CREATE TABLE tbl_inventory_detail_partition_2_1 PARTITION OF tbl_inventory_detail_partition_2 FOR VALUES FROM (0) TO (5);
CREATE TABLE tbl_inventory_detail_partition_2_2 PARTITION OF tbl_inventory_detail_partition_2 FOR VALUES FROM (5) TO (10);
CREATE TABLE tbl_inventory_detail_partition_2_3 PARTITION OF tbl_inventory_detail_partition_2 FOR VALUES FROM (10) TO (15);
CREATE TABLE tbl_inventory_detail_partition_2_4 PARTITION OF tbl_inventory_detail_partition_2 FOR VALUES FROM (15) TO (20);
CREATE TABLE tbl_inventory_detail_partition_2_5 PARTITION OF tbl_inventory_detail_partition_2 FOR VALUES FROM (20) TO (25);
CREATE TABLE tbl_inventory_detail_partition_2_6 PARTITION OF tbl_inventory_detail_partition_2 FOR VALUES FROM (25) TO (MAXVALUE);
CREATE TABLE tbl_inventory_detail_partition_3_1 PARTITION OF tbl_inventory_detail_partition_3 FOR VALUES FROM (0) TO (5);
CREATE TABLE tbl_inventory_detail_partition_3_2 PARTITION OF tbl_inventory_detail_partition_3 FOR VALUES FROM (5) TO (10);
CREATE TABLE tbl_inventory_detail_partition_3_3 PARTITION OF tbl_inventory_detail_partition_3 FOR VALUES FROM (10) TO (15);
CREATE TABLE tbl_inventory_detail_partition_3_4 PARTITION OF tbl_inventory_detail_partition_3 FOR VALUES FROM (15) TO (20);
CREATE TABLE tbl_inventory_detail_partition_3_5 PARTITION OF tbl_inventory_detail_partition_3 FOR VALUES FROM (20) TO (25);
CREATE TABLE tbl_inventory_detail_partition_3_6 PARTITION OF tbl_inventory_detail_partition_3 FOR VALUES FROM (25) TO (MAXVALUE);
CREATE TABLE tbl_inventory_detail_partition_4_1 PARTITION OF tbl_inventory_detail_partition_4 FOR VALUES FROM (0) TO (5);
CREATE TABLE tbl_inventory_detail_partition_4_2 PARTITION OF tbl_inventory_detail_partition_4 FOR VALUES FROM (5) TO (10);
CREATE TABLE tbl_inventory_detail_partition_4_3 PARTITION OF tbl_inventory_detail_partition_4 FOR VALUES FROM (10) TO (15);
CREATE TABLE tbl_inventory_detail_partition_4_4 PARTITION OF tbl_inventory_detail_partition_4 FOR VALUES FROM (15) TO (20);
CREATE TABLE tbl_inventory_detail_partition_4_5 PARTITION OF tbl_inventory_detail_partition_4 FOR VALUES FROM (20) TO (25);
CREATE TABLE tbl_inventory_detail_partition_4_6 PARTITION OF tbl_inventory_detail_partition_4 FOR VALUES FROM (25) TO (MAXVALUE);
CREATE TABLE tbl_inventory_detail_partition_5_1 PARTITION OF tbl_inventory_detail_partition_5 FOR VALUES FROM (0) TO (5);
CREATE TABLE tbl_inventory_detail_partition_5_2 PARTITION OF tbl_inventory_detail_partition_5 FOR VALUES FROM (5) TO (10);
CREATE TABLE tbl_inventory_detail_partition_5_3 PARTITION OF tbl_inventory_detail_partition_5 FOR VALUES FROM (10) TO (15);
CREATE TABLE tbl_inventory_detail_partition_5_4 PARTITION OF tbl_inventory_detail_partition_5 FOR VALUES FROM (15) TO (20);
CREATE TABLE tbl_inventory_detail_partition_5_5 PARTITION OF tbl_inventory_detail_partition_5 FOR VALUES FROM (20) TO (25);
CREATE TABLE tbl_inventory_detail_partition_5_6 PARTITION OF tbl_inventory_detail_partition_5 FOR VALUES FROM (25) TO (MAXVALUE);
CREATE TABLE tbl_inventory_detail_partition_6_1 PARTITION OF tbl_inventory_detail_partition_6 FOR VALUES FROM (0) TO (5);
CREATE TABLE tbl_inventory_detail_partition_6_2 PARTITION OF tbl_inventory_detail_partition_6 FOR VALUES FROM (5) TO (10);
CREATE TABLE tbl_inventory_detail_partition_6_3 PARTITION OF tbl_inventory_detail_partition_6 FOR VALUES FROM (10) TO (15);
CREATE TABLE tbl_inventory_detail_partition_6_4 PARTITION OF tbl_inventory_detail_partition_6 FOR VALUES FROM (15) TO (20);
CREATE TABLE tbl_inventory_detail_partition_6_5 PARTITION OF tbl_inventory_detail_partition_6 FOR VALUES FROM (20) TO (25);
CREATE TABLE tbl_inventory_detail_partition_6_6 PARTITION OF tbl_inventory_detail_partition_6 FOR VALUES FROM (25) TO (MAXVALUE);
INSERT INTO tbl_inventory_detail
(agent_id, bucket_id, product_id, quantity, serial, serial_block_id, bundle_serial, transaction_id, is_serial, created_by, updated_by, created_date, updated_date, is_reserved, serial_status, bundle_product_id)
select agent_id, bucket_id, product_id, quantity, serial, serial_block_id, bundle_serial, transaction_id, is_serial, created_by, updated_by, created_date, updated_date, is_reserved, serial_status, bundle_product_id
from tbl_inventory_detail_old;
Please find the following explain results as well. However it still takes around 2 seconds to get the results.
explain (ANALYZE, COSTS, VERBOSE, BUFFERS)
select
id1_0.serial
from
tbl_inventory_detail id1_0
where
id1_0.agent_id =115
and id1_0.product_id =15
and id1_0.bucket_id =1
and id1_0.is_reserved =false
and cast(id1_0.serial as numeric(38,0)) between cast('15115601000701' as numeric(38,0)) and cast('15115601000702' as numeric(38,0));
Gather (cost=1000.00..67709.65 rows=9996 width=15) (actual time=0.519..619.900 rows=2 loops=1)
Output: id1_0.serial
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=34471 dirtied=3
-> Parallel Seq Scan on public.tbl_inventory_detail_partition_2_4 id1_0 (cost=0.00..65710.05 rows=4165 width=15) (actual time=409.305..615.264 rows=1 loops=3)
Output: id1_0.serial
Filter: ((NOT id1_0.is_reserved) AND (id1_0.agent_id = 115) AND (id1_0.product_id = 15) AND (id1_0.bucket_id = 1) AND ((id1_0.serial)::numeric(38,0) >= '15115601000701'::numeric(38,0)) AND ((id1_0.serial)::numeric(38,0) <= '15115601000702'::numeric(38,0)))
Rows Removed by Filter: 666432
Buffers: shared hit=34471 dirtied=3
Worker 0: actual time=613.239..613.240 rows=0 loops=1
Buffers: shared hit=8759
Worker 1: actual time=614.567..614.568 rows=1 loops=1
Buffers: shared hit=16900 dirtied=1
Query Identifier: -16000000
Planning Time: 0.181 ms
Execution Time: 619.924 ms
Appreciate your help to optimize the above query and reduce the time?
serialstored as aVARCHARinstead of a native numeric type? Having to cast the values in the query precludes use of the index. Using a partitioned table for a 19M row dataset is a lot of extra work for no practical gain. There's nothing fundamentally wrong with the query that couldn't be addressed by using appropriate data types and indexing.serialas a varchar, create an index on this varchar but in your query you cast the content to numeric. That will fail when the content is not numeric and it destroys all potential usage of the index. Use the proper data type for your columns.