Why does PostgreSQL scan a joined table with a foreign key reference to a primary key when it is not logically part of query? [closed]

Question

Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 6 months ago.

I am trying to understand why queries that logically involve only the primary key of one table still scan the referencing foreign key column in a joined table. I see no logical reason for the following simple example to be planned and executed the way PostgreSQL 16.9 does it. What am I missing about how PostgreSQL 16.9 treats foreign keys and referential integrity when tables are joined?

The two tables are:


testv=> \d tbla
                             Table "public.tbla"
 Column |  Type   | Collation | Nullable |              Default              
--------+---------+-----------+----------+-----------------------------------
 apk    | bigint  |           | not null | nextval('tbla_apk_seq'::regclass)
 aval   | integer |           | not null | 
Indexes:
    "tbla_pkey" PRIMARY KEY, btree (apk)
Referenced by:
    TABLE "tblb" CONSTRAINT "tblb_ak_fkey" FOREIGN KEY (ak) REFERENCES tbla(apk) ON DELETE CASCADE

and


testv=> \d tblb
                             Table "public.tblb"
 Column |  Type   | Collation | Nullable |              Default              
--------+---------+-----------+----------+-----------------------------------
 bpk    | bigint  |           | not null | nextval('tblb_bpk_seq'::regclass)
 ak     | bigint  |           | not null | 
 bval   | integer |           | not null | 
Indexes:
    "tblb_pkey" PRIMARY KEY, btree (bpk)
Foreign-key constraints:
    "tblb_ak_fkey" FOREIGN KEY (ak) REFERENCES tbla(apk) ON DELETE CASCADE

The view that joins them is:


testv=> \d+ joinv
                            View "public.joinv"
 Column |  Type   | Collation | Nullable | Default | Storage | Description 
--------+---------+-----------+----------+---------+---------+-------------
 ak     | bigint  |           |          |         | plain   | 
 bk     | bigint  |           |          |         | plain   | 
 aval   | integer |           |          |         | plain   | 
 bval   | integer |           |          |         | plain   | 
View definition:
 SELECT a.apk AS ak,
    b.bpk AS bk,
    a.aval,
    b.bval
   FROM tbla a
     JOIN tblb b ON a.apk = b.ak;

My clearly incorrect thinking is this: ak in joinv is defined as apk, the primary key of tbla, and tblb.ak is a NOT NULL foreign key referencing tbla.apk. So if I select only ak from joinv, referential integrity should make it unnecessary to scan tblb at all.

Before running EXPLAIN I ran ANALYZE on tbla and tblb, which I thought would let the planner take the NOT NULL foreign key into account. The plan PostgreSQL produces, however, is:


testv=> explain (analyze, buffers) select ak from joinv order by ak DESC limit 1;
                                                                      QUERY PLAN                                                                      
------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..15000.47 rows=1 width=8) (actual time=52.320..52.320 rows=1 loops=1)
   Buffers: shared hit=1109
   ->  Nested Loop  (cost=0.42..15000044850.42 rows=1000000 width=8) (actual time=52.318..52.318 rows=1 loops=1)
         Join Filter: (a.apk = b.ak)
         Rows Removed by Join Filter: 173463
         Buffers: shared hit=1109
         ->  Index Only Scan Backward using tbla_pkey on tbla a  (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.008..0.008 rows=1 loops=1)
               Heap Fetches: 0
               Buffers: shared hit=4
         ->  Materialize  (cost=0.00..21370.00 rows=1000000 width=8) (actual time=0.029..41.650 rows=173464 loops=1)
               Buffers: shared hit=1105
               ->  Seq Scan on tblb b  (cost=0.00..16370.00 rows=1000000 width=8) (actual time=0.025..15.745 rows=173464 loops=1)
                     Buffers: shared hit=1105
 Planning:
   Buffers: shared hit=8
 Planning Time: 0.386 ms
 Execution Time: 53.073 ms
(17 rows)

Time: 54.144 ms

This clearly scans tblb, although the plan does not say which column of tblb it reads, whereas for tbla it names the index used: Index Only Scan Backward using tbla_pkey on tbla a.

When the query is changed to ASC the explain is different:


testv=> explain (analyze, buffers) select ak from joinv order by ak ASC limit 1;
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..15000.47 rows=1 width=8) (actual time=0.044..0.045 rows=1 loops=1)
   Buffers: shared hit=5
   ->  Nested Loop  (cost=0.42..15000044850.42 rows=1000000 width=8) (actual time=0.043..0.043 rows=1 loops=1)
         Join Filter: (a.apk = b.ak)
         Buffers: shared hit=5
         ->  Index Only Scan using tbla_pkey on tbla a  (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.010..0.010 rows=1 loops=1)
               Heap Fetches: 0
               Buffers: shared hit=4
         ->  Materialize  (cost=0.00..21370.00 rows=1000000 width=8) (actual time=0.029..0.029 rows=1 loops=1)
               Buffers: shared hit=1
               ->  Seq Scan on tblb b  (cost=0.00..16370.00 rows=1000000 width=8) (actual time=0.025..0.025 rows=1 loops=1)
                     Buffers: shared hit=1
 Planning:
   Buffers: shared hit=8
 Planning Time: 0.299 ms
 Execution Time: 0.071 ms
(16 rows)

I also do not understand the Join Filter lines.

In the ASC case the relevant lines are:

   ->  Nested Loop  (cost=0.42..15000044850.42 rows=1000000 width=8) (actual time=0.046..0.047 rows=1 loops=1)
         Join Filter: (a.apk = b.ak)
         Buffers: shared hit=5
         ->  Index Only Scan using tbla_pkey on tbla a  (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.015..0.016 rows=1 loops=1)

In the DESC case the lines are:

   ->  Nested Loop  (cost=0.42..15000044850.42 rows=1000000 width=8) (actual time=49.381..49.381 rows=1 loops=1)
         Join Filter: (a.apk = b.ak)
         Rows Removed by Join Filter: 173463
         Buffers: shared hit=1109
         ->  Index Only Scan Backward using tbla_pkey on tbla a  (cost=0.42..25980.42 rows=1000000 width=8) (actual time=0.006..0.007 rows=1 loops=1)

I do not understand the Rows Removed by Join Filter: 173463 line, which does not correspond to any number I can calculate. I also can't find documentation on why select count(ak) from joinv and select count(distinct ak) from joinv differ, with the join performed before counting in the second case but not in the first.


testv=> select count(distinct apk) from tbla;
  count  
---------
 1000000
(1 row)

Time: 64.469 ms
testv=> select count(distinct ak) from tblb;
 count  
--------
 631816
(1 row)

Time: 157.897 ms
testv=> select count(ak) from joinv;
  count  
---------
 1000000
(1 row)

Time: 117.547 ms
testv=> select count(distinct ak) from joinv;
 count  
--------
 631816
(1 row)

Given these numbers, I do not understand where the Rows Removed figure comes from in

         Join Filter: (a.apk = b.ak)
         Rows Removed by Join Filter: 173463

when I have


testv=> select (select count(apk) from tbla) - (select count(ak) from tblb)
testv-> ;
 ?column? 
----------
        0
(1 row)

Time: 59.017 ms
testv=> select (select count(apk) from tbla) - (select count(ak) from tblb);
 ?column? 
----------
        0
(1 row)

Time: 59.960 ms
testv=> select (select count(distinct apk) from tbla) - (select count(distinct ak) from tblb);
 ?column? 
----------
   368184
(1 row)

Reading the PostgreSQL documentation on JOIN has not helped me. The manual section "Controlling the Planner with Explicit JOIN Clauses" is not clear to me, and I don't see how an INNER JOIN of just two tables can have alternative plans.

I have found that when joining three or more tables with only INNER JOIN, query plans and timings can differ drastically. Since they also differ for what I consider a simple two-table INNER JOIN, I am having trouble understanding and optimizing JOIN operations. It seems that the higher the normal form used in table design, the more JOIN operations many queries require.

Where can I find better information on JOIN construction and performance than the PostgreSQL manual?

As an aside are you familiar with "accepting" an answer? Because you don't appear to have accepted any despite having many helpful answers on your previous questions.
– Dale K, Commented Jun 2 at 7:01
This has nothing to do with foreign keys or referential integrity. It's your query; it's your view definition. A join is just a different (and better readable) syntax for a WHERE condition. SELECT x FROM a JOIN b ON a.id = b.id_a; is the same as SELECT x FROM a, b WHERE a.id = b.id_a; Your question is about SQL syntax, not about PostgreSQL. Oracle, SQL Server, MySQL, all support this.
– Frank Heikens, Commented Jun 2 at 15:39

Laurenz Albe · Accepted Answer · 2025-06-02 06:57:22Z

1

Your question contains too many questions, so I'll have to vote to close it for lack of focus. But I will answer the question from the title; feel free to ask more questions for the rest.

If you join tbla and tblb, a row from tbla can occur several times in the result, because several rows from tblb can reference a single row in tbla. Even if you only select columns from tbla, SQL would then require that the same row occurs multiple times in the result. That necessitates that PostgreSQL actually performs the join.
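
As a minimal sketch of this effect, the following uses two hypothetical temporary tables, demo_a and demo_b. They are not the asker's tables, only stand-ins with the same key structure. It is consistent with the counts in the question, where count(ak) from joinv is 1000000 but count(distinct ak) is 631816.

```sql
-- Hypothetical demo tables mirroring the shape of tbla/tblb (names are illustrative).
CREATE TEMP TABLE demo_a (apk bigint PRIMARY KEY);
CREATE TEMP TABLE demo_b (
    bpk bigint PRIMARY KEY,
    ak  bigint NOT NULL REFERENCES demo_a (apk)
);

INSERT INTO demo_a VALUES (1), (2), (3);
-- Row 1 is referenced twice, row 2 once, row 3 never.
INSERT INTO demo_b VALUES (10, 1), (11, 1), (12, 2);

-- Only a column of demo_a is selected, yet the result depends on demo_b:
-- apk = 1 appears twice and apk = 3 does not appear at all.
SELECT a.apk
FROM demo_a a
JOIN demo_b b ON a.apk = b.ak
ORDER BY a.apk;
```

The foreign key guarantees that every demo_b.ak has a matching demo_a row, but it says nothing about how many demo_b rows point at a given demo_a row, so the join cannot be skipped.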

If you don't want the same row multiple times in the result, you should use an EXISTS expression in the query — but that won't match your view definition.
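
A sketch of what such a query might look like against the tables from the question, bypassing joinv. The index at the end is an assumption: the \d tblb output in the question shows no index on ak, and the name tblb_ak_idx is hypothetical. Whether the planner actually uses it depends on its statistics and cost estimates.

```sql
-- Each tbla row appears at most once, and only if some tblb row references it.
SELECT a.apk AS ak
FROM tbla a
WHERE EXISTS (SELECT 1 FROM tblb b WHERE b.ak = a.apk)
ORDER BY a.apk DESC
LIMIT 1;

-- Assumption: an index on the referencing column gives the planner a cheap
-- way to probe tblb for a given apk instead of scanning the whole table.
CREATE INDEX tblb_ak_idx ON tblb (ak);
```

Without the LIMIT, this returns different results from the view: the view repeats ak once per matching tblb row, while the EXISTS form returns each key once.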

Don't use a view that is a join of several tables unless you really need data from all of these tables.
