I am facing a problem with a specific query on PostgreSQL.
Look at the EXPLAIN ANALYZE output:
->  Nested Loop Left Join  (cost=21547.86..87609.16 rows=123 width=69) (actual time=28.997..562.299 rows=32710 loops=1)
      ->  Hash Join  (cost=21547.30..87210.72 rows=123 width=53) (actual time=28.913..74.682 rows=32710 loops=1)
            Hash Cond: (registry.id = profile.registry_id)
            ->  Bitmap Heap Scan on registry  (cost=726.99..66218.46 rows=65503 width=53) (actual time=5.123..32.794 rows=66496 loops=1)
                  Recheck Cond: ((tenant_id = 1009469) AND active AND (excluded_at IS NULL))
                  Heap Blocks: exact=12563
                  ->  Bitmap Index Scan on registry_tenant_id_excluded_at  (cost=0.00..710.61 rows=65503 width=0) (actual time=3.589..3.589 rows=66496 loops=1)
                        Index Cond: (tenant_id = 1009469)
            ->  Hash  (cost=20202.82..20202.82 rows=49399 width=16) (actual time=23.738..23.738 rows=32710 loops=1)
                  Buckets: 65536  Batches: 1  Memory Usage: 2046kB
                  ->  Index Only Scan using profile_tenant_id_registry_id on profile  (cost=0.56..20202.82 rows=49399 width=16) (actual time=0.019..19.173 rows=32710 loops=1)
                        Index Cond: (tenant_id = 1009469)
                        Heap Fetches: 29493
The planner misestimates the hash join (rows=123 estimated vs. rows=32710 actual), even though both scans are estimated accurately. I already tried raising the statistics target on the related columns, but the join estimate only moved from 117 to 123, so I guess this is not the issue.
Why is it misestimating so badly? Because of the low estimate it picks the nested loop, which is a lot of work for the database.
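To put a number on the gap, here is a rough back-of-the-envelope check in Python. It is only a sketch under a simplifying assumption: that the join estimate is approximately `outer_rows * inner_rows * selectivity`, and that for an equality join without usable most-common-value lists the selectivity is roughly `1 / max(ndistinct)`. The real planner logic has more cases, so the result is a hint, not a diagnosis. The variable names are mine; the row counts are copied from the plan above.

```python
# Row counts copied from the EXPLAIN ANALYZE output above.
est_registry_rows = 65503   # Bitmap Heap Scan on registry (estimated)
est_profile_rows = 49399    # Index Only Scan on profile (estimated)
est_join_rows = 123         # Hash Join (estimated)
actual_join_rows = 32710    # Hash Join (actual)

# Assumption: estimated join rows ~= outer_rows * inner_rows * selectivity,
# so the selectivity the planner used can be back-computed from the plan.
implied_selectivity = est_join_rows / (est_registry_rows * est_profile_rows)

# Assumption: selectivity ~= 1 / max(ndistinct) for a plain equality join,
# so the inverse hints at the distinct-value count the planner assumed.
implied_ndistinct = 1 / implied_selectivity

# How far off the join estimate is compared with the actual row count.
underestimate_factor = actual_join_rows / est_join_rows

print(f"implied selectivity:  {implied_selectivity:.3e}")
print(f"implied ndistinct:    {implied_ndistinct:,.0f}")
print(f"underestimate factor: {underestimate_factor:.1f}x")
```

Under these assumptions the implied distinct count comes out at roughly 26 million, and the join is underestimated by a factor of about 266. If that count is close to the number of distinct `id` values in the whole `registry` table, it would suggest the join selectivity is computed from table-wide statistics, without accounting for both sides already being restricted to the same `tenant_id`.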