
I have a rather complex query that I'm trying to optimize in Postgres 9.2. EXPLAIN ANALYZE gives this plan (full version on explain.depesz.com):

        Merge Right Join  (cost=194965639.35..211592151.26 rows=420423258 width=616) (actual time=15898.283..15920.603 rows=17 loops=1)
       Merge Cond: ((((p.context -> 'device'::text)) = ((s.context -> 'device'::text))) AND (((p.context -> 'physical_port'::text)) = ((s.context -> 'physical_port'::text))))
       ->  Sort  (cost=68925.49..69073.41 rows=59168 width=393) (actual time=872.289..877.818 rows=39898 loops=1)
             Sort Key: ((p.context -> 'device'::text)), ((p.context -> 'physical_port'::text))
             Sort Method: quicksort  Memory: 27372kB
             ->  Seq Scan on ports__status p  (cost=0.00..64235.68 rows=59168 width=393) (actual time=0.018..60.931 rows=41395 loops=1)
       ->  Materialize  (cost=194896713.86..199620346.93 rows=284223403 width=299) (actual time=15023.710..15024.779 rows=17 loops=1)
             ->  Merge Left Join  (cost=194896713.86..198909788.42 rows=284223403 width=299) (actual time=15023.705..15024.765 rows=17 loops=1)
                   Merge Cond: ((((s.context -> 'device'::text)) = ((l1.context -> 'device'::text))) AND (((s.context -> 'physical_port'::text)) = ((l1.context -> 'physical_port'::text))))
                   ->  Sort  (cost=194894861.42..195605419.92 rows=284223403 width=224) (actual time=14997.225..14997.230 rows=17 loops=1)
                         Sort Key: ((s.context -> 'device'::text)), ((s.context -> 'physical_port'::text))
                         Sort Method: quicksort  Memory: 33kB
                         ->  GroupAggregate  (cost=100001395.98..122028709.71 rows=284223403 width=389) (actual time=14997.120..14997.186 rows=17 loops=1)
                               ->  Sort  (cost=100001395.98..100711954.49 rows=284223403 width=389) (actual time=14997.080..14997.080 rows=17 loops=1)
                                     Sort Key: ((d.context -> 'hostname'::text)), ((a.context -> 'ip_address'::text)), ((a.context -> 'mac_address'::text)), ((s.context -> 'device'::text)), ((s.context -> 'physical_port'::text)), s.created_at, s.updated_at, d.created_at, d.updated_at
                                     Sort Method: quicksort  Memory: 33kB
                                     ->  Merge Join  (cost=339026.99..9576678.30 rows=284223403 width=389) (actual time=14996.710..14996.749 rows=17 loops=1)
                                           Merge Cond: (((a.context -> 'mac_address'::text)) = ((s.context -> 'mac_address'::text)))
                                           ->  Sort  (cost=15038.32..15136.00 rows=39072 width=255) (actual time=23.556..23.557 rows=1 loops=1)
                                                 Sort Key: ((a.context -> 'mac_address'::text))
                                                 Sort Method: quicksort  Memory: 25kB
                                                 ->  Hash Join  (cost=471.88..12058.33 rows=39072 width=255) (actual time=13.482..23.548 rows=1 loops=1)
                                                       Hash Cond: ((a.context -> 'ip_address'::text) = (d.context -> 'ip_address'::text))
                                                       ->  Seq Scan on arps__arps a  (cost=0.00..8132.39 rows=46239 width=157) (actual time=0.007..11.191 rows=46259 loops=1)
                                                       ->  Hash  (cost=469.77..469.77 rows=169 width=98) (actual time=0.035..0.035 rows=1 loops=1)
                                                             Buckets: 1024  Batches: 1  Memory Usage: 1kB
                                                             ->  Bitmap Heap Scan on ipam__dns d  (cost=9.57..469.77 rows=169 width=98) (actual time=0.023..0.023 rows=1 loops=1)
                                                                   Recheck Cond: ((context -> 'hostname'::text) = 'zglast-oracle03.slac.stanford.edu'::text)
                                                                   ->  Bitmap Index Scan on ipam__dns_hostname_index  (cost=0.00..9.53 rows=169 width=0) (actual time=0.017..0.017 rows=1 loops=1)
                                                                         Index Cond: ((context -> 'hostname'::text) = 'blah'::text)
                                           ->  Sort  (cost=323988.67..327625.84 rows=1454870 width=134) (actual time=14973.118..14973.120 rows=18 loops=1)
                                                 Sort Key: ((s.context -> 'mac_address'::text))
                                                 Sort Method: external sort  Disk: 214176kB
                                                 ->  Result  (cost=0.00..175064.84 rows=1454870 width=134) (actual time=0.016..1107.604 rows=1265154 loops=1)
                                                       ->  Append  (cost=0.00..175064.84 rows=1454870 width=134) (actual time=0.013..796.578 rows=1265154 loops=1)
                                                             ->  Seq Scan on spanning_tree__neighbour s  (cost=0.00..0.00 rows=1 width=98) (actual time=0.000..0.000 rows=0 loops=1)
                                                                   Filter: ((context -> 'physical_port'::text) IS NOT NULL)
                                                             ->  Seq Scan on spanning_tree__neighbour__vlan38 s  (cost=0.00..469.32 rows=1220 width=129) (actual time=0.011..1.019 rows=823 loops=1)
                                                                   Filter: ((context -> 'physical_port'::text) IS NOT NULL)
                                                                   Rows Removed by Filter: 403
                                                             ->  Seq Scan on spanning_tree__neighbour__vlan3 s  (cost=0.00..270.20 rows=1926 width=139) (actual time=0.017..0.971 rows=1882 loops=1)
                                                                   Filter: ((context -> 'physical_port'::text) IS NOT NULL)
                                                                   Rows Removed by Filter: 54
                                                             ->  Seq Scan on spanning_tree__neighbour__vlan466 s  (cost=0.00..131.85 rows=306 width=141) (actual time=0.032..0.340 rows=276 loops=1)
                                                                   Filter: ((context -> 'physical_port'::text) IS NOT NULL)
                                                                   Rows Removed by Filter: 32
                                                             ->  Seq Scan on spanning_tree__neighbour__vlan465 s  (cost=0.00..208.57 rows=842 width=142) (actual time=0.005..0.622 rows=768 loops=1)
                                                                   Filter: ((context -> 'physical_port'::text) IS NOT NULL)
                                                                   Rows Removed by Filter: 78
                                                             ->  Seq Scan on spanning_tree__neighbour__vlan499 s  (cost=0.00..245.04 rows=481 width=142) (actual time=0.017..0.445 rows=483 loops=1)
                                                                   Filter: ((context -> 'physical_port'::text) IS NOT NULL)
                                                             ->  Seq Scan on spanning_tree__neighbour__vlan176 s  (cost=0.00..346.36 rows=2576 width=131) (actual time=0.008..1.443 rows=2051 loops=1)
                                                                   Filter: ((context -> 'physical_port'::text) IS NOT NULL)
                                                                   Rows Removed by Filter: 538

I'm a bit of a novice at reading the plan, but I think it all comes down to the fact that I have the table spanning_tree__neighbour (which I've partitioned into numerous 'vlan' child tables). As you can see, it's performing a seq scan on each of them.

So I wrote a quick and dirty bash script to create indexes for the child tables:

create index spanning_tree__neighbour__vlan1_physical_port_index ON spanning_tree__neighbour__vlan1((context->'physical_port')) WHERE ((context->'physical_port') IS NOT NULL);
create index spanning_tree__neighbour__vlan2_physical_port_index ON spanning_tree__neighbour__vlan2((context->'physical_port')) WHERE ((context->'physical_port') IS NOT NULL);
create index spanning_tree__neighbour__vlan3_physical_port_index ON spanning_tree__neighbour__vlan3((context->'physical_port')) WHERE ((context->'physical_port') IS NOT NULL);
...
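
A loop of roughly this shape produces those statements (a reconstructed sketch, not the exact script; it assumes the partitions are named spanning_tree__neighbour__vlan1 through __vlan4096, as described below):

```shell
# Sketch of the generator: emits one CREATE INDEX per vlan partition.
# The 1..4096 range is an assumption based on the partitioning scheme.
for vlan in $(seq 1 4096); do
  echo "create index spanning_tree__neighbour__vlan${vlan}_physical_port_index ON spanning_tree__neighbour__vlan${vlan}((context->'physical_port')) WHERE ((context->'physical_port') IS NOT NULL);"
done
```

The output can then be piped straight into psql.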

But after I create a hundred or so of them, any query gives:

=> explain analyze select * from hosts where hostname='blah';
WARNING:  out of shared memory
ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.
Time: 34.757 ms

Will setting max_locks_per_transaction actually help? What value should I use, given that my partitioned table has up to 4096 child tables?

Or have I read the plan wrong?

  • Your statistics seem to be off. Commented Jan 14, 2014 at 0:21
  • What's your PostgreSQL version? Always include your PostgreSQL version in questions. Wildplasser is right, too: your stats are completely out there. Run ANALYZE; and determine why autovacuum hasn't been doing auto-analyze for you. Separately, yes, if you have tons of relations that a query might touch, you must raise max_locks_per_transaction and pay the associated cost in shared memory use. Commented Jan 14, 2014 at 0:21
  • Thanks for the link to help analyze the... err, analyze! Very useful. I'm using 9.2. The explain was only a partial of the full output - I've linked to a full version. So if I'm reading it correctly, it's spending most of its time (80%) doing a sort on the data... how can I go about reducing that? Cheers. Commented Jan 14, 2014 at 5:29

1 Answer


will setting max_locks_per_transaction actually help?

No, it won't.

Not before fixing your schema and your query first anyway.

A few things pop out, some already mentioned in the comments. In no particular order:

  1. Stats are off. ANALYZE your tables and, if you determine that autovacuum doesn't have enough memory to do its job properly, increase maintenance_work_mem.

  2. Steps like Sort Method: external sort Disk: 214176kB indicate that you're sorting rows on disk. Increase work_mem accordingly.

  3. Steps like Seq Scan on spanning_tree__neighbour__vlan176 s (cost=0.00..346.36 rows=2576 width=131) (actual time=0.008..1.443 rows=2051 loops=1) followed by an Append are dubious at best.

    Look… Partition tables when you want to turn something unmanageable or impractical into something more manageable, e.g. pushing a few billion rows of old data out of the way of the couple of million that are used on a daily basis. Not to turn a couple of million rows into 4,096 puny tables averaging a pathetically small 1k rows each.

  4. The next offender is things like Filter: ((context -> 'physical_port'::text) IS NOT NULL). ARGH.

    Never, ever store things in hstore, JSON, XML or any other kind of EAV (entity-attribute-value store), if you care about the data that lands in it; in particular if it appears in a where, join or sort (!) clause. No ifs, no buts: just change your schema.

    Plus, a bunch of the fields that appear in your query could be conveniently stored using Postgres' network types instead of dumb text. Odds are they should all be indexed, too. (They wouldn't appear in the plan if they shouldn't.)

  5. You've got a step that does a GroupAggregate beneath a left join. Typically, this indicates a query like … left join (select agg_fn(…) … group by …) foo …. That's a big no-no in my experience. Pull that out of your query if you can.

    The plan is too long and unreadable to guess why it's doing that exactly, but if select * from hosts where hostname='blah'; is anything to go by, you seem to be selecting absolutely every possible thing you can access in one query.

    It's a lot cheaper, and faster, to find the select few rows that you actually want, and then run a handful of other queries to select the related data. So do so.

    If you still need to join with that aggregate subquery for some reason, be sure to look into window functions. More often than not, they'll spare you the need for gory joins, by allowing you to run the aggregate on the current set of rows directly.
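
For point 2, the 214 MB on-disk sort can be tested per session before changing anything globally (the sizes below are illustrative, not a recommendation):

```sql
-- Illustrative values; size these to your RAM and concurrency.
SET work_mem = '256MB';              -- lets the ~214 MB sort stay in memory
SET maintenance_work_mem = '256MB';  -- for manual VACUUM / ANALYZE runs
```

If the plan improves, persist the settings in postgresql.conf with values you've verified.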
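
To make point 5 concrete, here's a minimal sketch of trading the join-against-aggregate for a window function. The table and column names are hypothetical, not from your schema:

```sql
-- Hypothetical names throughout; this only illustrates the shape of the rewrite.
-- Before: aggregate in a subquery, then join back.
SELECT p.device, p.physical_port, agg.ports_on_device
FROM ports p
LEFT JOIN (
    SELECT device, count(*) AS ports_on_device
    FROM ports
    GROUP BY device
) agg ON agg.device = p.device;

-- After: same figure computed as a window over the base rows, no join.
SELECT device,
       physical_port,
       count(*) OVER (PARTITION BY device) AS ports_on_device
FROM ports;
```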

Once you've done these steps, the default max_locks_per_transaction should be fine.
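
As a sketch of what points 3 and 4 look like in DDL: the column names below mirror the hstore keys visible in the plan, but everything else, including the types chosen, is an assumption:

```sql
-- Sketch only: promote hstore keys to real, typed columns.
ALTER TABLE spanning_tree__neighbour
    ADD COLUMN device        text,
    ADD COLUMN physical_port text,
    ADD COLUMN mac_address   macaddr;  -- network type instead of dumb text

-- Backfill from the hstore; the cast fails fast on malformed MACs.
UPDATE spanning_tree__neighbour
SET device        = context -> 'device',
    physical_port = context -> 'physical_port',
    mac_address   = (context -> 'mac_address')::macaddr;

-- Plain b-tree indexes on real columns replace the expression indexes.
CREATE INDEX ON spanning_tree__neighbour (device, physical_port);
CREATE INDEX ON spanning_tree__neighbour (mac_address);
```

Note that under 9.2's inheritance-based partitioning, the ADD COLUMN and the UPDATE cascade to the children, but CREATE INDEX does not; one more reason to collapse the 4,096 partitions into far fewer tables first.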
