I am trying to understand why queries that logically involve only the primary key of one table still scan the referencing foreign key in a joined table. I see no logical reason for the following simple example to plan and execute the way it does on PostgreSQL 16.9. What I need to understand is where my understanding of foreign keys and referential integrity differs from how PostgreSQL 16.9 actually operates on joined tables.
The two tables are:
testv=> \d tbla
Table "public.tbla"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+-----------------------------------
apk | bigint | | not null | nextval('tbla_apk_seq'::regclass)
aval | integer | | not null |
Indexes:
"tbla_pkey" PRIMARY KEY, btree (apk)
Referenced by:
TABLE "tblb" CONSTRAINT "tblb_ak_fkey" FOREIGN KEY (ak) REFERENCES tbla(apk) ON DELETE CASCADE
and
testv=> \d tblb
Table "public.tblb"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+-----------------------------------
bpk | bigint | | not null | nextval('tblb_bpk_seq'::regclass)
ak | bigint | | not null |
bval | integer | | not null |
Indexes:
"tblb_pkey" PRIMARY KEY, btree (bpk)
Foreign-key constraints:
"tblb_ak_fkey" FOREIGN KEY (ak) REFERENCES tbla(apk) ON DELETE CASCADE
The join is defined as a view:
testv=> \d+ joinv
View "public.joinv"
Column | Type | Collation | Nullable | Default | Storage | Description
--------+---------+-----------+----------+---------+---------+-------------
ak | bigint | | | | plain |
bk | bigint | | | | plain |
aval | integer | | | | plain |
bval | integer | | | | plain |
View definition:
SELECT a.apk AS ak,
b.bpk AS bk,
a.aval,
b.bval
FROM tbla a
JOIN tblb b ON a.apk = b.ak;
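For anyone who wants to reproduce this, the schema can be sketched from the \d output above. The table, column, and view definitions follow that output; the data-loading statements are purely an assumption (the row counts match the plans below, but the value distribution is a guess and the real data almost certainly differs).

```sql
-- Schema reconstructed from the \d output above.
CREATE TABLE tbla (
    apk  bigserial PRIMARY KEY,
    aval integer NOT NULL
);

CREATE TABLE tblb (
    bpk  bigserial PRIMARY KEY,
    ak   bigint  NOT NULL REFERENCES tbla (apk) ON DELETE CASCADE,
    bval integer NOT NULL
);

CREATE VIEW joinv AS
SELECT a.apk AS ak,
       b.bpk AS bk,
       a.aval,
       b.bval
FROM tbla a
JOIN tblb b ON a.apk = b.ak;

-- ASSUMPTION: illustrative data only; the original load is not shown.
-- One million parents with apk 1..1000000 on a fresh sequence.
INSERT INTO tbla (aval)
SELECT (random() * 100)::integer
FROM generate_series(1, 1000000);

-- One million children, each pointing at a randomly chosen parent.
INSERT INTO tblb (ak, bval)
SELECT 1 + floor(random() * 1000000)::bigint,
       (random() * 100)::integer
FROM generate_series(1, 1000000);

ANALYZE tbla, tblb;
```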
My clearly incorrect thinking is this: if I query only ak from the view joinv, where ak is the primary key apk of tbla and is also a foreign key in tblb referencing tbla(apk), there is no logical reason to scan ak in tblb. It is a not null foreign key in tblb, so by referential integrity there should be no need to scan tblb.
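One way to test that assumption is to look at which direction the foreign key guarantees a match. The constraint ensures every tblb.ak exists in tbla, but it does not ensure that every tbla.apk has a row in tblb, and an inner join drops parents without children. A minimal sketch of that check, using only the tables shown above:

```sql
-- Parents that the inner join in joinv cannot return,
-- because no tblb row references them.
SELECT count(*) AS parents_without_children
FROM tbla a
WHERE NOT EXISTS (
    SELECT 1
    FROM tblb b
    WHERE b.ak = a.apk
);
```

If this returns anything other than zero, the set of ak values visible through joinv depends on the contents of tblb, so tblb has to be consulted to answer the query.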
Before running the explain I ran analyze tbla, tblb, which I thought would let the planner take the logic of a not null foreign key into account. The plan PostgreSQL produces, however, is:
testv=> explain (analyze, buffers) select ak from joinv order by ak DESC limit 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.42..15000.47 rows=1 width=8) (actual time=52.320..52.320 rows=1 loops=1)
   Buffers: shared hit=1109
   -> Nested Loop (cost=0.42..15000044850.42 rows=1000000 width=8) (actual time=52.318..52.318 rows=1 loops=1)
         Join Filter: (a.apk = b.ak)
         Rows Removed by Join Filter: 173463
         Buffers: shared hit=1109
         -> Index Only Scan Backward using tbla_pkey on tbla a (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.008..0.008 rows=1 loops=1)
               Heap Fetches: 0
               Buffers: shared hit=4
         -> Materialize (cost=0.00..21370.00 rows=1000000 width=8) (actual time=0.029..41.650 rows=173464 loops=1)
               Buffers: shared hit=1105
               -> Seq Scan on tblb b (cost=0.00..16370.00 rows=1000000 width=8) (actual time=0.025..15.745 rows=173464 loops=1)
                     Buffers: shared hit=1105
 Planning:
   Buffers: shared hit=8
 Planning Time: 0.386 ms
 Execution Time: 53.073 ms
(17 rows)
Time: 54.144 ms
This clearly performs a scan of tblb, although the explain output does not specify which column of tblb it is scanning, as it does for tbla, where it names the index: Index Only Scan Backward using tbla_pkey on tbla a.
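Note that the \d tblb output above lists only tblb_pkey under Indexes. PostgreSQL does not automatically create an index on the referencing column of a foreign key, so there is nothing on tblb.ak for the planner to use. A sketch of what could be tried next, assuming an extra index is acceptable (the index name tblb_ak_idx is hypothetical, and whether the plan actually changes has not been verified here):

```sql
-- Hypothetical index on the referencing column; the name is made up.
CREATE INDEX tblb_ak_idx ON tblb (ak);

-- Refresh statistics, then look at the plan again.
ANALYZE tblb;

EXPLAIN (ANALYZE, BUFFERS)
SELECT ak FROM joinv ORDER BY ak DESC LIMIT 1;
```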
When the query is changed to ASC the explain is different:
testv=> explain (analyze, buffers) select ak from joinv order by ak ASC limit 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit (cost=0.42..15000.47 rows=1 width=8) (actual time=0.044..0.045 rows=1 loops=1)
   Buffers: shared hit=5
   -> Nested Loop (cost=0.42..15000044850.42 rows=1000000 width=8) (actual time=0.043..0.043 rows=1 loops=1)
         Join Filter: (a.apk = b.ak)
         Buffers: shared hit=5
         -> Index Only Scan using tbla_pkey on tbla a (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.010..0.010 rows=1 loops=1)
               Heap Fetches: 0
               Buffers: shared hit=4
         -> Materialize (cost=0.00..21370.00 rows=1000000 width=8) (actual time=0.029..0.029 rows=1 loops=1)
               Buffers: shared hit=1
               -> Seq Scan on tblb b (cost=0.00..16370.00 rows=1000000 width=8) (actual time=0.025..0.025 rows=1 loops=1)
                     Buffers: shared hit=1
 Planning:
   Buffers: shared hit=8
 Planning Time: 0.299 ms
 Execution Time: 0.071 ms
(16 rows)
Something further I do not understand is the Join Filter lines.
In the ASC case the relevant lines are:
   -> Nested Loop (cost=0.42..15000044850.42 rows=1000000 width=8) (actual time=0.046..0.047 rows=1 loops=1)
         Join Filter: (a.apk = b.ak)
         Buffers: shared hit=5
         -> Index Only Scan using tbla_pkey on tbla a (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.015..0.016 rows=1 loops=1)
In the DESC case the lines are:
   -> Nested Loop (cost=0.42..15000044850.42 rows=1000000 width=8) (actual time=49.381..49.381 rows=1 loops=1)
         Join Filter: (a.apk = b.ak)
         Rows Removed by Join Filter: 173463
         Buffers: shared hit=1109
         -> Index Only Scan Backward using tbla_pkey on tbla a (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.006..0.007 rows=1 loops=1)
I do not understand the Rows Removed by Join Filter: 173463, which, to my thinking, does not correspond to anything I would calculate. Also, I can't find documentation on why select count(ak) from joinv and select count(distinct ak) from joinv differ, with the join performed before counting in the second case but not in the first.
testv=> select count(distinct apk) from tbla;
count
---------
1000000
(1 row)
Time: 64.469 ms
testv=> select count(distinct ak) from tblb;
count
--------
631816
(1 row)
Time: 157.897 ms
testv=> select count(ak) from joinv;
count
---------
1000000
(1 row)
Time: 117.547 ms
testv=> select count(distinct ak) from joinv;
count
--------
631816
(1 row)
Given these numbers I do not understand where the Rows Removed value comes from in
Join Filter: (a.apk = b.ak)
Rows Removed by Join Filter: 173463
when I have
testv=> select (select count(apk) from tbla) - (select count(ak) from tblb)
testv-> ;
?column?
----------
0
(1 row)
Time: 59.017 ms
testv=> select (select count(apk) from tbla) - (select count(ak) from tblb);
?column?
----------
0
(1 row)
Time: 59.960 ms
testv=> select (select count(distinct apk) from tbla) - (select count(distinct ak) from tblb);
?column?
----------
368184
(1 row)
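In the DESC plan, the Seq Scan on tblb reports rows=173464 and the join filter removed 173463, exactly one fewer. One plausible reading, offered as an assumption rather than a confirmed explanation, is that the nested loop read tblb in physical order until it reached the first row whose ak equals the largest apk, discarding every row before it. The sketch below counts the rows stored ahead of that first match. It assumes the sequential scan starts at the beginning of the table and follows ctid order, which need not hold (for example with synchronized scans).

```sql
-- Count tblb rows physically stored before the first row that
-- references the largest apk in tbla.
SELECT count(*) AS rows_before_first_match
FROM tblb
WHERE ctid < (
    SELECT ctid
    FROM tblb
    WHERE ak = (SELECT max(apk) FROM tbla)
    ORDER BY ctid
    LIMIT 1
);
```

If that reading is right, the result should line up with the Rows Removed figure, which would make it a property of where the matching row happens to sit in tblb rather than of any difference between the count queries above.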
Reading the PostgreSQL documentation on JOIN has not helped me. The manual section "Controlling the Planner with Explicit JOIN Clauses" is not clear to me, and I don't see how joining just two tables with an INNER JOIN could have alternative plans.
I have found that when joining multiple (3 or more) tables with just INNER JOIN, query planning and execution time can differ drastically. Since they also differ for what I consider a simple two-table INNER JOIN, I am having trouble understanding and optimizing JOIN operations. It seems that the higher the normal form used in table design, the more JOIN operations many queries require.
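Even a two-table inner join can be executed as a nested loop, a hash join, or a merge join, with either table on the outer side, so alternative plans do exist. A minimal sketch for seeing one of them, assuming the enable_nestloop planner setting is used only as a diagnostic inside a transaction that is rolled back:

```sql
BEGIN;

-- Discourage nested loops for this transaction only, so the planner
-- shows what it would pick among the remaining join methods.
SET LOCAL enable_nestloop = off;

EXPLAIN
SELECT ak FROM joinv ORDER BY ak DESC LIMIT 1;

ROLLBACK;
```

Comparing the estimated costs of this plan with the original one gives a rough idea of why the planner preferred the nested loop for a LIMIT 1 query.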
Where can I find better information on JOIN construction and performance than the PostgreSQL manual?
SELECT x FROM a JOIN b ON a.id = b.id_a;
is the same as
SELECT x FROM a, b WHERE a.id = b.id_a;
Your question is about SQL syntax, not about PostgreSQL. Oracle, SQL Server, MySQL, all support this.
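Applied to the tables from the question, the two spellings can be compared directly. This is only a sketch; for a plain inner join of two tables both statements are expected to yield the same plan, but the output is not reproduced here.

```sql
-- Explicit JOIN ... ON syntax.
EXPLAIN
SELECT a.apk
FROM tbla a
JOIN tblb b ON a.apk = b.ak;

-- Comma-separated FROM list with the join condition in WHERE.
EXPLAIN
SELECT a.apk
FROM tbla a, tblb b
WHERE a.apk = b.ak;
```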