I have three large-ish tables: A (10 million rows), A_C (60 million rows) and C (6 rows). They are related via bigint columns with primary/foreign key constraints. A and C have an n:m relationship, which is why I added the join table A_C. The schema is as follows:
A.id = A_C.a_id and
C.id = A_C.c_id
A is the primary table here. I want to display the first 100 entries from A together with all corresponding rows from C via A_C. Currently the query takes very long. My first approach was two simple joins. Running EXPLAIN ANALYZE I found out that PostgreSQL does a sequential scan on A. I think this is because the LIMIT is applied last, so it cannot be used to restrict the scan. This performs very poorly. Is there any way to work around this LIMIT behaviour, or is there another solution I am missing here? A LIMIT on a simple SELECT from A alone finishes in ~2 ms.
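For reference, the first approach was roughly the following (simplified; the real column list differs). The LIMIT here only caps the number of output rows; in my case it did not stop PostgreSQL from scanning the big tables first:

SELECT A.id, C.id AS c_id
FROM A
JOIN A_C ON A_C.a_id = A.id
JOIN C ON C.id = A_C.c_id
LIMIT 100;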
Edit: A.id is the primary key and has a btree index. A_C has a unique constraint on the entire row and also an index on A_C.a_id and on A_C.c_id. I have not yet joined C because the query is already really slow without it.
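For clarity, the structure is roughly this (constraint and index names are placeholders, not the real ones):

CREATE TABLE A (
    id bigint PRIMARY KEY            -- btree index comes with the primary key
);

CREATE TABLE C (
    id bigint PRIMARY KEY
);

CREATE TABLE A_C (
    a_id bigint NOT NULL REFERENCES A (id),
    c_id bigint NOT NULL REFERENCES C (id),
    UNIQUE (a_id, c_id)              -- unique constraint on the entire row
);

CREATE INDEX a_c_a_id_idx ON A_C (a_id);
CREATE INDEX a_c_c_id_idx ON A_C (c_id);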
Fast queries:
SELECT A.id FROM A
SELECT A.id FROM A LIMIT 10
Slow query:
The following plans quickly but executes slowly, since the amount of data involved is very large:
SELECT A.id FROM A JOIN A_C ON A.id = A_C.a_id
I tried to avoid that by setting a limit on how many rows from A are retrieved:
SELECT a.id FROM (SELECT A.id AS id FROM A LIMIT 10) a JOIN A_C ON (a.id = A_C.a_id)
The query plan via EXPLAIN ANALYZE looks as follows:
"Hash Join (cost=5.75..1305897.53 rows=44790854 width=8) (actual time=251969.648..251969.648 rows=0 loops=1)"
" Hash Cond: (a_c.a_id = a.id)"
" -> Seq Scan on a_c (cost=0.00..690017.54 rows=44790854 width=8) (actual time=0.472..126203.253 rows=44790000 loops=1)"
" -> Hash (cost=4.50..4.50 rows=100 width=8) (actual time=1.889..1.889 rows=100 loops=1)"
" Buckets: 1024 Batches: 1 Memory Usage: 12kB"
" -> Limit (cost=0.00..3.50 rows=100 width=8) (actual time=0.527..1.550 rows=100 loops=1)"
" -> Seq Scan on a (cost=0.00..261275.88 rows=7465088 width=8) (actual time=0.520..0.912 rows=100 loops=1)"
"Planning time: 16.350 ms"
"Execution time: 251969.693 ms"
If the query simply joined A to A_C and there were no WHERE clause, then a sequential scan of A might not be avoidable.
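For completeness, a sketch of the full query I am aiming for (first 100 rows of A plus all matching C rows; column lists simplified), keeping the LIMIT inside the derived table so that PostgreSQL can read just 100 ids from A's primary-key index and join only those outwards:

SELECT a.id, c.id AS c_id
FROM (SELECT A.id FROM A ORDER BY A.id LIMIT 100) a
JOIN A_C ON A_C.a_id = a.id
JOIN C ON C.id = A_C.c_id;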