0

I have a PostgreSQL query that references two columns with an OR statement in the where clause.

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) select * from "connection"
where "personOneId" = '?'
or "personTwoId" = '?'

I have an index on "personOneId" and the query is blistering fast. But when I include the OR "personTwoId" the query slows down dramatically. I initially tried having both "personOneId" and "personTwoId" indexed(multi column index) but it still does a " -> Parallel Seq Scan on connection" and the query is the same speed as it always was even with the index. Is my index wrong or is this the expected behavior with the "OR" statement? Is there a way to alter this query to achieve the same outcome that will allow PG to use the indexed properly?

Execution plan

"Gather  (cost=1000.00..24641.09 rows=302 width=117) (actual time=47.352..144.044 rows=337 loops=1)"
"  Output: redacted"
"  Workers Planned: 2"
"  Workers Launched: 2"
"  Buffers: shared hit=1892 read=15205"
"  ->  Parallel Seq Scan on public.connection  (cost=0.00..23610.89 rows=126 width=117) (actual time=41.072..134.191 rows=112 loops=3)"
"        Output: redacted"
"        Filter: ((connection.""personOneId"" = 'redacted id'::uuid) OR (connection.""personTwoId"" = 'redacted id'::uuid))"
"        Rows Removed by Filter: 347295"
"        Buffers: shared hit=1892 read=15205"
"        Worker 0: actual time=39.153..134.249 rows=170 loops=1"
"          Buffers: shared hit=667 read=5645"
"        Worker 1: actual time=37.108..132.297 rows=134 loops=1"
"          Buffers: shared hit=651 read=4768"
"Planning Time: 0.217 ms"
"Execution Time: 147.659 ms"
1
  • also provide your full execution plan (EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) Commented Aug 4, 2021 at 17:43

1 Answer 1

1

You have the wrong index for this query. A multicolumn btree index on ("personOneId", "personTwoId") is not very good for the same reason it is inefficient to find all the people with the first name of 'Samantha' in a paper phone book, which is sorted by last name first then by first name.

If you have separate btree indexes on each column, then it can combine them with a BitmapOr and that should be fast. Or if you switch to a GIN index, a multi-column GIN index should also be useful.

Sign up to request clarification or add additional context in comments.

1 Comment

Yep, you are right, indexing each column separately instantly worked and the execution plan is now using a BitmapOr on the two indexes. This brought the execution time from an average of 150ms down to an average of 5ms. Thank you.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.