PostgreSQL query gets magnitudes slower when sorting by columns from two different tables

Question

Previously, I used this query, which was fast:

cb=# explain analyze SELECT "web_route"."id", "web_crag"."id" FROM "web_route" 
INNER JOIN "web_crag" ON ( "web_route"."crag_id" = "web_crag"."id" )
WHERE "web_crag"."type" IN (1, 2) 
ORDER BY "web_crag"."name" ASC
LIMIT 20;
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2.16 rows=20 width=18) (actual time=0.027..0.105 rows=20 loops=1)
   ->  Nested Loop  (cost=0.00..47088.94 rows=436055 width=18) (actual time=0.026..0.100 rows=20 loops=1)
         ->  Index Scan using web_crag_name on web_crag  (cost=0.00..503.16 rows=1776 width=14) (actual time=0.011..0.020 rows=14 loops=1)
               Filter: (type = ANY ('{1,2}'::integer[]))
         ->  Index Scan using web_route_crag_id on web_route  (cost=0.00..23.27 rows=296 width=8) (actual time=0.004..0.005 rows=1 loops=14)
               Index Cond: (crag_id = web_crag.id)
 Total runtime: 0.154 ms
(7 rows)

The problem with the query is that the order in which the rows are returned is not deterministic, which caused repeating rows across subsequent pages produced OFFSETing (i.e. pagination did not work properly in my web app). I decided to make the ordering strict by additionally sorting by "web_route".id".

cb=# explain analyze SELECT "web_route"."id", "web_crag"."id" FROM "web_route" 
INNER JOIN "web_crag" ON ( "web_route"."crag_id" = "web_crag"."id" )
WHERE "web_crag"."type" IN (1, 2)
ORDER BY "web_crag"."name", "web_route"."id" ASC 
LIMIT 20;
                                                             QUERY PLAN                                                             
------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=29189.04..29189.09 rows=20 width=18) (actual time=324.065..324.068 rows=20 loops=1)
   ->  Sort  (cost=29189.04..30279.18 rows=436055 width=18) (actual time=324.063..324.064 rows=20 loops=1)
         Sort Key: web_crag.name, web_route.id
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Hash Join  (cost=135.40..17585.78 rows=436055 width=18) (actual time=0.882..195.941 rows=435952 loops=1)
               Hash Cond: (web_route.crag_id = web_crag.id)
               ->  Seq Scan on web_route  (cost=0.00..10909.55 rows=436055 width=8) (actual time=0.026..55.916 rows=435952 loops=1)
               ->  Hash  (cost=113.20..113.20 rows=1776 width=14) (actual time=0.848..0.848 rows=1775 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 82kB
                     ->  Seq Scan on web_crag  (cost=0.00..113.20 rows=1776 width=14) (actual time=0.004..0.510 rows=1775 loops=1)
                           Filter: (type = ANY ('{1,2}'::integer[]))
 Total runtime: 324.101 ms
(12 rows)

However, as you can see, the query got more than 2000x slower, which is quite a lot :). I wonder what can be done about that if anything. I plan to do really not a nice hack and duplicate "web_crag"."name" into "web_route" so that I can put an index on the two columns (crag_name, id) but if there is a better way I would be glad.

Here are schemes of "web_route" and "web_crag" in case it matters.

cb=# \d web_crag;
                                      Table "public.web_crag"
     Column      |           Type           |                       Modifiers                       
-----------------+--------------------------+-------------------------------------------------------
 id              | integer                  | not null default nextval('web_crag_id_seq'::regclass)
 name            | character varying(64)    | not null
 latitude        | double precision         | 
 longitude       | double precision         | 
 type            | integer                  | 
 description     | text                     | not null
 normalized_name | character varying(64)    | not null
 country_id      | integer                  | 
 location_index  | character(24)            | not null
 added_by_id     | integer                  | 
 date_created    | timestamp with time zone | 
 last_modified   | timestamp with time zone | 
Indexes:
    "web_crag_pkey" PRIMARY KEY, btree (id)
    "web_crag_added_by_id" btree (added_by_id)
    "web_crag_country_id" btree (country_id)
    "web_crag_location_index" btree (location_index)
    "web_crag_name" btree (name)
Foreign-key constraints:
    "added_by_id_refs_id_1745ebe43b31bec6" FOREIGN KEY (added_by_id) REFERENCES web_member(id) DEFERRABLE INITIALLY DEFERRED
    "country_id_refs_id_1384050a9bd763af" FOREIGN KEY (country_id) REFERENCES web_country(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "web_route" CONSTRAINT "crag_id_refs_id_3ce1145606d12740" FOREIGN KEY (crag_id) REFERENCES web_crag(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "web_video" CONSTRAINT "crag_id_refs_id_4fc9cbf2832725ca" FOREIGN KEY (crag_id) REFERENCES web_crag(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "web_image" CONSTRAINT "crag_id_refs_id_58210dd331468848" FOREIGN KEY (crag_id) REFERENCES web_crag(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "web_eventdestination" CONSTRAINT "crag_id_refs_id_612ad57c4d76c32c" FOREIGN KEY (crag_id) REFERENCES web_crag(id) DEFERRABLE INITIALLY DEFERRED
Triggers:
    set_crag_location_index BEFORE INSERT OR UPDATE ON web_crag FOR EACH ROW EXECUTE PROCEDURE set_crag_location_index()

cb=# \d web_route
                                        Table "public.web_route"
       Column       |           Type           |                       Modifiers                        
--------------------+--------------------------+--------------------------------------------------------
 id                 | integer                  | not null default nextval('web_route_id_seq'::regclass)
 name               | character varying(64)    | not null
 crag_id            | integer                  | not null
 sector             | character varying(64)    | not null
 difficulty         | character varying(16)    | not null
 author             | character varying(64)    | not null
 build_date         | character varying(32)    | not null
 description        | text                     | not null
 difficulty_numeric | integer                  | 
 length_meters      | double precision         | 
 added_by_id        | integer                  | 
 date_created       | timestamp with time zone | 
 last_modified      | timestamp with time zone | 
 normalized_name    | character varying(64)    | not null
 rating_votes       | integer                  | not null
 rating_score       | integer                  | not null
Indexes:
    "web_route_pkey" PRIMARY KEY, btree (id)
    "web_route_added_by_id" btree (added_by_id)
    "web_route_crag_id" btree (crag_id)
Check constraints:
    "ck_rating_votes_pstv_c39bae29f3b2012" CHECK (rating_votes >= 0)
    "web_route_rating_votes_check" CHECK (rating_votes >= 0)
Foreign-key constraints:
    "added_by_id_refs_id_157791930f5e12d5" FOREIGN KEY (added_by_id) REFERENCES web_member(id) DEFERRABLE INITIALLY DEFERRED
    "crag_id_refs_id_3ce1145606d12740" FOREIGN KEY (crag_id) REFERENCES web_crag(id) DEFERRABLE INITIALLY DEFERRED

What is the history of your data? Did you truncate or import from dumps? If so, try updating the planners statistics or re-creating the indexes. PostgreSQL might just make a poor plan due to outdated assumptions. Besides this you should include all primary keys in the order criteria when you want paging to avoid any freedom in result ordering. — Augustus Kling
– Augustus Kling, Commented Apr 5, 2014 at 20:16
@AugustusKling: I have run vacuum analyze, reindex table web_route, reindex table web_crag. It didn't change anything though. I don't exactly remember where the data came from either it was imported from csv or from a sql dump. — clime
– clime, Commented Apr 5, 2014 at 20:23
I was hoping to get better estimates for query planning but it seems the statistics have already been good. Your data possibly contains many web_routes per web_crag and having to sort on web_route.id might require to sort many web_routes (or at least the planner prepares for this). I would try to create index on web_route (crag_id asc, id asc) to gather information about the distribution across columns, then use order by web_crag.name, web_route.crag_id, web_route.id and hope it suffices. — Augustus Kling
– Augustus Kling, Commented Apr 5, 2014 at 21:10
@AugustusKling: I have tried that but couldn't really see any improvement over the original query. — clime
– clime, Commented Apr 5, 2014 at 22:14
Is web_crag.name actually unique? If not, what would it mean to include it in the other table? With the index on web_route (crag_id asc, id asc), there should be better plans available, but it seems the planner is just not smart enough to find them. — jjanes
– jjanes, Commented Apr 7, 2014 at 15:39

intgr · Accepted Answer · 2014-04-30 12:31:30Z

Sadly PostgreSQL is not (yet) good at optimizing these types of sorts, it always wants to sort the whole result set at once if it can't find an index exactly matching the sort clauses.

Starting with PostgreSQL 9.3, you can trick the planner into doing the right thing with a LATERAL subquery. Try this:

SELECT "web_route"."id", "web_crag"."id"
FROM "web_crag",
LATERAL (
    SELECT * FROM "web_route"
    WHERE "web_route"."crag_id" = "web_crag"."id"
    ORDER BY "web_route"."id" ASC
) AS "web_route"
WHERE "web_crag"."type" IN (1, 2)
ORDER BY "web_crag"."name"
LIMIT 20;

With some simple test data I generated (1 million web_crags, 5 million web_routes) here's the query plan and timings... Almost identical to your 1st query plan, apart from the extra Sort for web_route.id:

 Limit  (cost=24.36..120.70 rows=20 width=14) (actual time=0.051..0.169 rows=20 loops=1)
   ->  Nested Loop  (cost=24.36..24084788.95 rows=5000000 width=14) (actual time=0.049..0.143 rows=20 loops=1)
         ->  Index Scan using web_crag_name_idx on web_crag  (cost=0.42..39131.46 rows=1000000 width=10) (actual time=0.018..0.023 rows=4 loops=1)
               Filter: (type = ANY ('{1,2}'::integer[]))
         ->  Sort  (cost=23.93..23.95 rows=5 width=8) (actual time=0.018..0.021 rows=5 loops=4)
               Sort Key: web_route.id
               Sort Method: quicksort  Memory: 25kB
               ->  Index Scan using web_route_crag_id_idx on web_route  (cost=0.43..23.88 rows=5 width=8) (actual time=0.005..0.011 rows=5 loops=4)
                     Index Cond: (crag_id = web_crag.id)
 Total runtime: 0.212 ms

You can avoid the sort with an additional index on web_route (crag_id, id):

 Limit  (cost=0.86..19.49 rows=20 width=14) (actual time=0.031..0.113 rows=20 loops=1)
   ->  Nested Loop  (cost=0.86..4659293.82 rows=5000000 width=14) (actual time=0.029..0.084 rows=20 loops=1)
         ->  Index Scan using web_crag_name_idx on web_crag  (cost=0.42..39293.82 rows=1000000 width=10) (actual time=0.017..0.021 rows=4 loops=1)
               Filter: (type = ANY ('{1,2}'::integer[]))
         ->  Index Only Scan using web_route_crag_id_id_idx on web_route  (cost=0.43..4.52 rows=5 width=8) (actual time=0.005..0.009 rows=5 loops=4)
               Index Cond: (crag_id = web_crag.id)
               Heap Fetches: 0
 Total runtime: 0.151 ms

Here's how I created the test data:

create table web_crag(id serial primary key, type int default 1, name text);
create table web_route(id serial primary key, crag_id int);
insert into web_crag (name) select generate_series(1,1000000)::text;
insert into web_route (crag_id) select id from web_crag cross join generate_series(1,5);
create index on web_crag(name);
create index on web_route(crag_id);
analyze web_route;

PostgreSQL patch

There is a "partial sort" patch to PostgreSQL to do roughly this kind of optimization automatically, but sadly it did not make the cut for PostgreSQL 9.4. Hopefully PostgreSQL 9.5 will have it (roughly second half of 2015).

kristok · Accepted Answer · 2014-05-05 21:06:06Z

2

The issue here is that before you were able to use an existing index for ordering

 ->  Index Scan using web_crag_name on web_crag  (cost=0.00..503.16 rows=1776 width=14)      (actual time=0.011..0.020 rows=14 loops=1)
           Filter: (type = ANY ('{1,2}'::integer[]))

But after adding two different columns for ordering it would need to use one index from the first table and another one from the second table. If you want to make it run fast for pagination the only reasonable thing to do is to avoid the third normalized form (meaning no data is duplicated) and just duplicate the data in the other table. It's either web.route_id inside web.crag table or vice versa (did not bother looking into your schema too much) and create a combined index CREATE INDEX ON table ("web_crag"."name", "web_route"."id");

When a combined index exists on the table and the columns are in the right order the scan time will be as good as for your first query.

Also make sure you check intgr-s solution. It works on on 9.3+ and does the trick. Materalization vs. (kind of) hard to read queries are your options. My personal preference goes for materialization but intgr gets my up vote for teaching me something new.

edited May 5, 2014 at 21:06

answered Apr 11, 2014 at 21:58

kristok

1501 silver badge6 bronze badges

5 Comments

intgr Over a year ago

Of course, duplicating data means that now every time you update a "web_crag" name, you also have to update the corresponding value in all related "web_route" rows. You might also run into concurrency problems, for example when one transaction tries to update the name and another inserts new route rows.

kristok Over a year ago

duplicating data is always a necessary evil, i agree

intgr Over a year ago

Always a necessary evil? Not at all, my solution doesn't require duplicating data.

kristok Over a year ago

What i meant is that duplicating data is always evil, but when you need to do it and it simplifies things then it's a necessary evil thing to do. To be honest i have never used lateral but i would love to see what the explain plan is that this produces.

intgr Over a year ago

I updated my answer with a query plan when running with some test data.

Collectives™ on Stack Overflow

PostgreSQL query gets magnitudes slower when sorting by columns from two different tables

2 Answers 2

PostgreSQL patch

Comments

5 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

PostgreSQL patch

Comments

5 Comments

Your Answer

Sign up or log in

Post as a guest

Related