I have the following tables:
CREATE TABLE IF NOT EXISTS users
(
id NUMERIC(20, 0) NOT NULL DEFAULT NEXTVAL('users_sequence') PRIMARY KEY,
list_id NUMERIC(20, 0) NOT NULL,
first_name VARCHAR(512) DEFAULT NULL NULL,
last_name VARCHAR(512) DEFAULT NULL NULL,
full_name VARCHAR(1024) GENERATED ALWAYS AS
(CASE
WHEN first_name IS NULL AND
last_name IS NULL THEN NULL
ELSE
(TRIM(COALESCE(first_name, '') || ' ' || COALESCE(last_name, ''))) END) STORED,
deleted_at TIMESTAMP DEFAULT NULL NULL,
-- Some ~20 columns
);
CREATE TABLE IF NOT EXISTS user_aliases
(
id NUMERIC(20, 0) NOT NULL DEFAULT NEXTVAL('user_aliases_sequence') PRIMARY KEY,
entry_id NUMERIC(20, 0) NOT NULL,
list_id NUMERIC(20, 0) NOT NULL,
first_name VARCHAR(512) DEFAULT NULL NULL,
last_name VARCHAR(512) DEFAULT NULL NULL,
full_name VARCHAR(1024) GENERATED ALWAYS AS
(CASE
WHEN first_name IS NULL AND
last_name IS NULL THEN NULL
ELSE
(TRIM(COALESCE(first_name, '') || ' ' || COALESCE(last_name, ''))) END) STORED,
deleted_at TIMESTAMP DEFAULT NULL NULL,
CONSTRAINT fk_user_aliases_entry_id FOREIGN KEY (entry_id) REFERENCES users (id) ON UPDATE CASCADE ON DELETE CASCADE
);
and the following indices:
CREATE INDEX users_full_name_idx ON users USING GIST (full_name gist_trgm_ops, list_id) WHERE deleted_at IS NULL AND full_name IS NOT NULL;
CREATE INDEX user_aliases_full_name_idx ON user_aliases USING GIST (full_name gist_trgm_ops, list_id) WHERE deleted_at IS NULL AND full_name IS NOT NULL;
and I have the following functions:
And my query is:
SELECT id
FROM (
SELECT
e.id,
e.full_name
FROM users e
WHERE e.full_name % :value
AND e.deleted_at IS NULL
AND e.list_id IN (:lists)
UNION
SELECT
a.entry_id AS id,
a.full_name
FROM user_aliases a
WHERE a.full_name % :value
AND a.deleted_at IS NULL
AND a.list_id IN (:lists)
) filter_table
WHERE LEVENSHTEIN_LESS_EQUAL(full_name, :value, :threshold) <= :threshold
LIMIT :limit
I have around ~500K records in the parent (users) table, and again ~500K records in the child (user_aliases) table. 99.5% of the time, my query takes around ~200ms, which works for me. However, sometimes (and I believe when there's some load on the server) it takes around 20seconds. I have ~140 connections at the moment, majority of which is idle (connection pooling).
The server has the following resources:
- 8 vCPU
- 32GB memory
- 500 GB SSD
and some related bits of my postgresql.conf are as follows, happy to provide further, just don't know what:
max_connections = 300
shared_buffers = 12GB
effective_cache_size = 16GB
maintenance_work_mem = 1GB
work_mem = 32MB
max_wal_size = 1GB
min_wal_size = 80MB
And the query plan for is as follows:
Limit (cost=285.91..286.11 rows=10 width=20) (actual time=238.326..238.330 rows=0 loops=1)
Output: filter_table.id
Buffers: shared hit=40038
-> Subquery Scan on filter_table (cost=285.91..286.23 rows=16 width=20) (actual time=238.323..238.327 rows=0 loops=1)
Output: filter_table.id
Buffers: shared hit=40038
-> HashAggregate (cost=285.91..286.07 rows=16 width=536) (actual time=238.322..238.325 rows=0 loops=1)
" Output: e.id, e.full_name"
" Group Key: e.id, e.full_name"
Batches: 1 Memory Usage: 24kB
Buffers: shared hit=40038
-> Append (cost=13.58..285.83 rows=16 width=536) (actual time=238.317..238.320 rows=0 loops=1)
Buffers: shared hit=40038
-> Bitmap Heap Scan on master.users e (cost=13.58..199.25 rows=11 width=27) (actual time=164.384..164.385 rows=0 loops=1)
" Output: e.id, e.full_name"
Recheck Cond: (((e.full_name)::text % 'hasan can saral'::text) AND (e.deleted_at IS NULL))
" Filter: ((e.list_id = ANY ('{1,3,5,6,7,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,630,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,960,970,980,990,1000,1010,1020,1030,1040,1050,1060,1070,1080,1090,1100,1110,1120,1130,1140,1150,1160,1170,1180,1190,1200,1210,1220,1230,1240,1250,1260,1270,1280,1290,1300,1310,1320,1330,1340,1350,1360,1370,1380,1390,1400,1410,1420,1430,1440,1450,1460,1470,1480,1490,1500,1510,1520,1530,1540,1550,1560,1570,1580,1590,1600,1610,1620,1630,1640,1650,1660,1670,1680,1690,1700,1710,1720,1730,1740,1750,1760,1770,1780,1790,1800,1810,1820,1830,1840,1850,1860,1870,1880,1890,1900,1910,1920,1930,1940,1950,1960,1970,1980,1990,2000,2010,2020,2030,2040,2050,2060,2070,2080,2090,2100,2110,2120,2130,2140,2150,2160,2170,2180,2190,2200,2210,2220,2230,2240,2250,2260,2270,2280,2290,2300,2310,2320,2330,2340,2350,2360,2370,2380,2390,2400,2410,2420,2430,2440,2450,2460,2470,2480,2490,2500,2510,2520,2530,2540,2550,2560,2570,2580,2590,2600,2610,2620,2630,2640,2650,2660,2670,2680,2690,2700,2710,2720,2730,2740,2750,2760,2770,2780,2790,2800,2810,2820,2830,2840,2850,2860,2870,2880,2890,2900,2910,2920,2930,2940,2950,2960,2970,2980,2990,3000,3010,3020,3030,3040,3050,3060,3070,3080,3090,3100,3110,3120,3130,3140,3150,3160,3170,3180}'::numeric[])) AND (levenshtein_less_equal((e.full_name)::text, 'hasan can saral'::text, 2) <= 2))"
Rows Removed by Filter: 161
Heap Blocks: exact=145
Buffers: shared hit=27180
-> Bitmap Index Scan on users_full_name_idx (cost=0.00..12.77 rows=48 width=0) (actual time=163.986..163.987 rows=161 loops=1)
Index Cond: ((e.full_name)::text % 'hasan can saral'::text)
Buffers: shared hit=27035
-> Bitmap Heap Scan on master.user_aliases a (cost=9.36..86.34 rows=5 width=38) (actual time=73.928..73.928 rows=0 loops=1)
" Output: a.entry_id, a.full_name"
Recheck Cond: (((a.full_name)::text % 'hasan can saral'::text) AND (a.deleted_at IS NULL))
" Filter: ((a.list_id = ANY ('{1,3,5,6,7,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,630,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,960,970,980,990,1000,1010,1020,1030,1040,1050,1060,1070,1080,1090,1100,1110,1120,1130,1140,1150,1160,1170,1180,1190,1200,1210,1220,1230,1240,1250,1260,1270,1280,1290,1300,1310,1320,1330,1340,1350,1360,1370,1380,1390,1400,1410,1420,1430,1440,1450,1460,1470,1480,1490,1500,1510,1520,1530,1540,1550,1560,1570,1580,1590,1600,1610,1620,1630,1640,1650,1660,1670,1680,1690,1700,1710,1720,1730,1740,1750,1760,1770,1780,1790,1800,1810,1820,1830,1840,1850,1860,1870,1880,1890,1900,1910,1920,1930,1940,1950,1960,1970,1980,1990,2000,2010,2020,2030,2040,2050,2060,2070,2080,2090,2100,2110,2120,2130,2140,2150,2160,2170,2180,2190,2200,2210,2220,2230,2240,2250,2260,2270,2280,2290,2300,2310,2320,2330,2340,2350,2360,2370,2380,2390,2400,2410,2420,2430,2440,2450,2460,2470,2480,2490,2500,2510,2520,2530,2540,2550,2560,2570,2580,2590,2600,2610,2620,2630,2640,2650,2660,2670,2680,2690,2700,2710,2720,2730,2740,2750,2760,2770,2780,2790,2800,2810,2820,2830,2840,2850,2860,2870,2880,2890,2900,2910,2920,2930,2940,2950,2960,2970,2980,2990,3000,3010,3020,3030,3040,3050,3060,3070,3080,3090,3100,3110,3120,3130,3140,3150,3160,3170,3180}'::numeric[])) AND (levenshtein_less_equal((a.full_name)::text, 'hasan can saral'::text, 2) <= 2))"
Rows Removed by Filter: 62
Heap Blocks: exact=49
Buffers: shared hit=12858
-> Bitmap Index Scan on user_aliases_full_name_idx (cost=0.00..8.56 rows=20 width=0) (actual time=73.735..73.735 rows=62 loops=1)
Index Cond: ((a.full_name)::text % 'hasan can saral'::text)
Buffers: shared hit=12809
"Settings: effective_cache_size = '16GB', search_path = 'master', work_mem = '32MB'"
Planning Time: 2.445 ms
Execution Time: 238.404 ms
Why does my query performs poorly, and how do I resolve it?
Update:
I got the slow query plan using auto_explain (the query took 25s to execute):
Limit (cost=10645.98..10646.18 rows=10 width=20) (actual time=47782.688..47782.691 rows=0 loops=1)
Output: filter_table.id
Buffers: shared hit=3749709
-> Subquery Scan on filter_table (cost=10645.98..10667.42 rows=1072 width=20) (actual time=47782.684..47782.687 rows=0 loops=1)
Output: filter_table.id
Buffers: shared hit=3749709
-> HashAggregate (cost=10645.98..10656.70 rows=1072 width=536) (actual time=47782.682..47782.684 rows=0 loops=1)
" Output: e.id, e.full_name"
" Group Key: e.id, e.full_name"
Batches: 1 Memory Usage: 73kB
Buffers: shared hit=3749709
-> Append (cost=1449.29..10640.62 rows=1072 width=536) (actual time=47782.666..47782.669 rows=0 loops=1)
Buffers: shared hit=3749709
-> Bitmap Heap Scan on master.users e (cost=1449.29..10533.25 rows=1068 width=27) (actual time=47636.308..47636.310 rows=0 loops=1)
" Output: e.id, e.full_name"
" Recheck Cond: (((e.full_name)::text % 'SOME RELATIVELY LONG VALUE HERE'::text) AND (e.list_id = ANY ('{1,3,5,6,7,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,630,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,960,970,980,990,1000,1010,1020,1030,1040,1050,1060,1070,1080,1090,1100,1110,1120,1130,1140,1150,1160,1170,1180,1190,1200,1210,1220,1230,1240,1250,1260,1270,1280,1290,1300,1310,1320,1330,1340,1350,1360,1370,1380,1390,1400,1410,1420,1430,1440,1450,1460,1470,1480,1490,1500,1510,1520,1530,1540,1550,1560,1570,1580,1590,1600,1610,1620,1630,1640,1650,1660,1670,1680,1690,1700,1710,1720,1730,1740,1750,1760,1770,1780,1790,1800,1810,1820,1830,1840,1850,1860,1870,1880,1890,1900,1910,1920,1930,1940,1950,1960,1970,1980,1990,2000,2010,2020,2030,2040,2050,2060,2070,2080,2090,2100,2110,2120,2130,2140,2150,2160,2170,2180,2190,2200,2210,2220,2230,2240,2250,2260,2270,2280,2290,2300,2310,2320,2330,2340,2350,2360,2370,2380,2390,2400,2410,2420,2430,2440,2450,2460,2470,2480,2490,2500,2510,2520,2530,2540,2550,2560,2570,2580,2590,2600,2610,2620,2630,2640,2650,2660,2670,2680,2690,2700,2710,2720,2730,2740,2750,2760,2770,2780,2790,2800,2810,2820,2830,2840,2850,2860,2870,2880,2890,2900,2910,2920,2930,2940,2950,2960,2970,2980,2990,3000,3010,3020,3030,3040,3050,3060,3070,3080,3090,3100,3110,3120,3130,3140,3150,3160,3170,3180}'::numeric[])) AND (e.deleted_at IS NULL))"
" Filter: (levenshtein_less_equal((e.full_name)::text, 'SOME RELATIVELY LONG VALUE HERE'::text, 2) <= 2)"
Rows Removed by Filter: 75
Heap Blocks: exact=57
Buffers: shared hit=3736002
-> Bitmap Index Scan on users_full_name_idx (cost=0.00..1448.22 rows=3203 width=0) (actual time=47634.248..47634.248 rows=75 loops=1)
" Index Cond: (((e.full_name)::text % 'SOME RELATIVELY LONG VALUE HERE'::text) AND (e.list_id = ANY ('{1,3,5,6,7,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,630,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,960,970,980,990,1000,1010,1020,1030,1040,1050,1060,1070,1080,1090,1100,1110,1120,1130,1140,1150,1160,1170,1180,1190,1200,1210,1220,1230,1240,1250,1260,1270,1280,1290,1300,1310,1320,1330,1340,1350,1360,1370,1380,1390,1400,1410,1420,1430,1440,1450,1460,1470,1480,1490,1500,1510,1520,1530,1540,1550,1560,1570,1580,1590,1600,1610,1620,1630,1640,1650,1660,1670,1680,1690,1700,1710,1720,1730,1740,1750,1760,1770,1780,1790,1800,1810,1820,1830,1840,1850,1860,1870,1880,1890,1900,1910,1920,1930,1940,1950,1960,1970,1980,1990,2000,2010,2020,2030,2040,2050,2060,2070,2080,2090,2100,2110,2120,2130,2140,2150,2160,2170,2180,2190,2200,2210,2220,2230,2240,2250,2260,2270,2280,2290,2300,2310,2320,2330,2340,2350,2360,2370,2380,2390,2400,2410,2420,2430,2440,2450,2460,2470,2480,2490,2500,2510,2520,2530,2540,2550,2560,2570,2580,2590,2600,2610,2620,2630,2640,2650,2660,2670,2680,2690,2700,2710,2720,2730,2740,2750,2760,2770,2780,2790,2800,2810,2820,2830,2840,2850,2860,2870,2880,2890,2900,2910,2920,2930,2940,2950,2960,2970,2980,2990,3000,3010,3020,3030,3040,3050,3060,3070,3080,3090,3100,3110,3120,3130,3140,3150,3160,3170,3180}'::numeric[])))"
Buffers: shared hit=3735945
-> Bitmap Heap Scan on master.user_aliases a (cost=9.36..91.29 rows=4 width=38) (actual time=146.350..146.351 rows=0 loops=1)
" Output: a.entry_id, a.full_name"
Recheck Cond: (((a.full_name)::text % 'SOME RELATIVELY LONG VALUE HERE'::text) AND (a.deleted_at IS NULL))
" Filter: ((a.list_id = ANY ('{1,3,5,6,7,10,20,30,40,50,60,70,80,90,100,110,120,130,140,150,160,170,180,190,200,210,220,230,240,250,260,270,280,300,310,320,330,340,350,360,370,380,390,400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,630,640,650,660,670,680,690,700,710,720,730,740,750,760,770,780,790,800,810,820,830,840,850,860,870,880,890,900,910,920,930,940,950,960,970,980,990,1000,1010,1020,1030,1040,1050,1060,1070,1080,1090,1100,1110,1120,1130,1140,1150,1160,1170,1180,1190,1200,1210,1220,1230,1240,1250,1260,1270,1280,1290,1300,1310,1320,1330,1340,1350,1360,1370,1380,1390,1400,1410,1420,1430,1440,1450,1460,1470,1480,1490,1500,1510,1520,1530,1540,1550,1560,1570,1580,1590,1600,1610,1620,1630,1640,1650,1660,1670,1680,1690,1700,1710,1720,1730,1740,1750,1760,1770,1780,1790,1800,1810,1820,1830,1840,1850,1860,1870,1880,1890,1900,1910,1920,1930,1940,1950,1960,1970,1980,1990,2000,2010,2020,2030,2040,2050,2060,2070,2080,2090,2100,2110,2120,2130,2140,2150,2160,2170,2180,2190,2200,2210,2220,2230,2240,2250,2260,2270,2280,2290,2300,2310,2320,2330,2340,2350,2360,2370,2380,2390,2400,2410,2420,2430,2440,2450,2460,2470,2480,2490,2500,2510,2520,2530,2540,2550,2560,2570,2580,2590,2600,2610,2620,2630,2640,2650,2660,2670,2680,2690,2700,2710,2720,2730,2740,2750,2760,2770,2780,2790,2800,2810,2820,2830,2840,2850,2860,2870,2880,2890,2900,2910,2920,2930,2940,2950,2960,2970,2980,2990,3000,3010,3020,3030,3040,3050,3060,3070,3080,3090,3100,3110,3120,3130,3140,3150,3160,3170,3180}'::numeric[])) AND (levenshtein_less_equal((a.full_name)::text, 'SOME RELATIVELY LONG VALUE HERE'::text, 2) <= 2))"
Rows Removed by Filter: 4
Heap Blocks: exact=4
Buffers: shared hit=13707
-> Bitmap Index Scan on user_aliases_full_name_idx (cost=0.00..8.56 rows=20 width=0) (actual time=146.099..146.099 rows=4 loops=1)
Index Cond: ((a.full_name)::text % 'SOME RELATIVELY LONG VALUE HERE'::text)
Buffers: shared hit=13703
"Settings: effective_cache_size = '16GB', search_path = 'master', work_mem = '32MB'"
Planning Time: 5.653 ms
Execution Time: 47783.054 ms
Update #2:
The query is slow for some specific, and relatively long filter values, and fast for most.
Changing the query to:
SELECT id
FROM (
(SELECT
e.id,
e.full_name
FROM users e
WHERE e.full_name % :value
AND e.deleted_at IS NULL
AND e.list_id IN (:lists)
LIMIT :limit)
UNION
(SELECT
a.entry_id AS id,
a.full_name
FROM user_aliases a
WHERE a.full_name % :value
AND a.deleted_at IS NULL
AND a.list_id IN (:lists)
LIMIT :limit)
) filter_table
WHERE LEVENSHTEIN_LESS_EQUAL(full_name, :value, :threshold) <= :threshold
LIMIT :limit
basically, adding LIMIT clauses to every subquery of UNION makes the query fast again. So there is a solution. But why?
explain(analyze, verbose, buffers, settings)and share the result in plain text as an update of your question. Besides that, do you needUNIONor wouldUNION ALLbe good enough? And why don't you filter in the subquery?NUMERIC(20, 0)? INT and BIGINT are smaller in size and are the natural choice for identifiers. The indexes will also be smaller, resulting in faster performance.