I have two PostgreSQL instances on different computers (X and Y) on the same network. The Y server has pgAdmin installed, connected to both instances.
On the Y instance, I created an FDW connection to a database on X.
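For reference, the foreign server and table were created along these lines (the host, server name, credentials, and column types here are illustrative, not my exact definitions):

CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Foreign server pointing at the X instance (connection details are placeholders)
CREATE SERVER x_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'x-host', port '5432', dbname 'x_db');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER x_server
    OPTIONS (user 'remote_user', password 'remote_password');

-- Foreign table mirroring public.original_table on X
-- (column types inferred from the query below; the real table may differ)
CREATE FOREIGN TABLE foreign_table (
    q_num      integer,
    time_start timestamptz,
    wait       interval,
    poll       interval,
    na_code    integer,
    fail_code  integer
)
SERVER x_server
OPTIONS (schema_name 'public', table_name 'original_table');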
My problem is that a SELECT over the foreign table takes more than 30 seconds. If I run the same query directly against the X server from pgAdmin, it takes under 1 second, so I don't think the slow performance is a network issue.
I tried use_remote_estimate both on and off, but since my foreign table is continuously updated with new data, I decided to keep it off.
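I toggled it at the server level like this (x_server is the illustrative name from the sketch above; a per-table override is also possible):

-- Turn remote estimates on (use SET instead of ADD if the option is already defined)
ALTER SERVER x_server OPTIONS (ADD use_remote_estimate 'true');

-- ...and back off
ALTER SERVER x_server OPTIONS (SET use_remote_estimate 'false');

-- Per-table override
ALTER FOREIGN TABLE foreign_table OPTIONS (ADD use_remote_estimate 'true');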
My query is below:
SELECT q_num,
       count(*)::integer AS total_calls,
       count(1) FILTER (WHERE na_code = 0 AND fail_code = 0)::integer AS answered,
       count(1) FILTER (WHERE na_code = 0 AND fail_code = 0 AND (wait + poll) <= '00:00:15'::interval)::integer AS answered_15,
       count(1) FILTER (WHERE na_code = 0 AND fail_code = 0 AND (wait + poll) <= '00:00:20'::interval)::integer AS answered_20,
       count(1) FILTER (WHERE na_code = 0 AND fail_code = 0 AND (wait + poll) <= '00:00:30'::interval)::integer AS answered_30,
       count(1) FILTER (WHERE na_code <> 0 OR fail_code <> 0)::integer AS missed,
       count(1) FILTER (WHERE na_code <> 0 OR fail_code <> 0 AND (wait + poll) > '00:00:15'::interval)::integer AS missed_15,
       count(1) FILTER (WHERE na_code <> 0 OR fail_code <> 0 AND (wait + poll) > '00:00:20'::interval)::integer AS missed_20,
       count(1) FILTER (WHERE na_code <> 0 OR fail_code <> 0 AND (wait + poll) > '00:00:30'::interval)::integer AS missed_30,
       EXTRACT(epoch FROM sum(wait + poll))::integer AS total_waiting,
       EXTRACT(epoch FROM max(wait + poll))::integer AS max_waiting
FROM foreign_table
WHERE time_start > (CURRENT_TIMESTAMP - '03:15:00'::interval)
GROUP BY q_num;
The time_start column has an index on the remote server.
How can I improve the performance?
EDIT: I just tried increasing fetch_size, but the query still takes more than 20 seconds.
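I changed it roughly like this (x_server is the illustrative server name from above; 10000 was an arbitrary test value):

-- Default is 100 rows per round trip; use SET instead of ADD if the option is already defined
ALTER SERVER x_server OPTIONS (ADD fetch_size '10000');

-- Can also be set per foreign table
ALTER FOREIGN TABLE foreign_table OPTIONS (ADD fetch_size '10000');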
@Frank Heikens Here is the index:
CREATE INDEX IF NOT EXISTS original_table_time_start_idx
ON public.original_table USING btree
(time_start ASC NULLS LAST)
TABLESPACE pg_default;
PostgreSQL version on the X instance:
PostgreSQL 11.9, compiled by Visual C++ build 1914, 64-bit
PostgreSQL version on the Y instance:
PostgreSQL 14.5, compiled by Visual C++ build 1914, 64-bit
EXPLAIN (ANALYZE, VERBOSE, BUFFERS) on the X instance:
GroupAggregate (cost=8.51..8.91 rows=2 width=49) (actual time=0.148..0.150 rows=1 loops=1)
Output: q_num, (count(*))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:15'::interval))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:20'::interval))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:30'::interval))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR (fail_code <> 0))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:15'::interval)))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:20'::interval)))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:30'::interval)))))::integer, (date_part('epoch'::text, sum((wait + poll))))::integer, (date_part('epoch'::text, max((wait + poll))))::integer
Group Key: original_table.q_num
Buffers: shared hit=4
-> Sort (cost=8.51..8.51 rows=3 width=45) (actual time=0.099..0.101 rows=3 loops=1)
Output: q_num, na_code, fail_code, wait, poll
Sort Key: original_table.q_num
Sort Method: quicksort  Memory: 25kB
Buffers: shared hit=4
-> Index Scan using original_table_time_start_idx on public.original_table (cost=0.43..8.48 rows=3 width=45) (actual time=0.052..0.060 rows=3 loops=1)
Output: q_num, na_code, fail_code, wait, poll
Index Cond: (original_table.time_start > (CURRENT_TIMESTAMP - '00:15:00'::interval))
Buffers: shared hit=4
Planning Time: 1.383 ms
Execution Time: 0.638 ms
EXPLAIN (ANALYZE, VERBOSE, BUFFERS) on the Y instance:
GroupAggregate (cost=48287.41..48295.41 rows=6 width=49) (actual time=22846.842..22846.847 rows=1 loops=1)
Output: q_num, (count(*))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:15'::interval))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:20'::interval))))::integer, (count(1) FILTER (WHERE ((na_code = 0) AND (fail_code = 0) AND ((wait + poll) <= '00:00:30'::interval))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR (fail_code <> 0))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:15'::interval)))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:20'::interval)))))::integer, (count(1) FILTER (WHERE ((na_code <> 0) OR ((fail_code <> 0) AND ((wait + poll) > '00:00:30'::interval)))))::integer, (EXTRACT(epoch FROM sum((wait + poll))))::integer, (EXTRACT(epoch FROM max((wait + poll))))::integer
Group Key: foreign_table.q_num
-> Sort (cost=48287.41..48287.59 rows=72 width=45) (actual time=22846.755..22846.759 rows=1 loops=1)
Output: q_num, na_code, fail_code, wait, poll
Sort Key: foreign_table.q_num
Sort Method: quicksort  Memory: 25kB
-> Foreign Scan on public.foreign_table (cost=100.00..48285.19 rows=72 width=45) (actual time=10927.554..22846.717 rows=1 loops=1)
Output: q_num, na_code, fail_code, wait, poll
Filter: (foreign_table.time_start > (CURRENT_TIMESTAMP - '03:15:00'::interval))
Rows Removed by Filter: 720077
Remote SQL: SELECT q_num, time_start, wait, poll, na_code, fail_code FROM public.original_table
Planning Time: 0.522 ms
Execution Time: 22849.716 ms
I hope this information helps.
Without EXPLAIN (ANALYZE, BUFFERS, VERBOSE) output, it's difficult to gain much insight into the query's behavior. I suggest turning use_remote_estimate back on, since without it the local database doesn't have much ability to optimize the query. There might also be some performance gain from increasing the fetch size from the default of 100 rows.