
To improve performance, I tuned the parameters shared_buffers and work_mem. Although I tried different combinations of these parameters, my query takes the same time. I restarted the server whenever shared_buffers was changed. My environment is:

Postgres 9.1
OS: Windows 7
RAM: 8GB

I tried shared_buffers at 1GB, 2GB and 3GB, but the query execution time changes only minimally (by milliseconds).
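For reference, this is roughly how I checked and changed the settings (the values shown here are only examples):

-- check the current values
SHOW shared_buffers;
SHOW work_mem;

-- work_mem can be changed per session for testing
SET work_mem = '64MB';

-- shared_buffers was edited in postgresql.conf (e.g. shared_buffers = 2GB)
-- and the server restarted, since it cannot be changed at runtime.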

Here is the EXPLAIN ANALYZE output of the query:

GroupAggregate  (cost=11100.99..12737.10 rows=50342 width=24) (actual time=181.789..268.733 rows=49829 loops=1)
  ->  Sort  (cost=11100.99..11226.84 rows=50342 width=24) (actual time=181.761..188.323 rows=50000 loops=1)    
        Sort Key: (date(t_proof_of_play.play_timestamp)), (date_trunc('hour'::text, t_proof_of_play.play_timestamp)), t_device_endpoint.venue_hierarchy_id, t_device_endpoint.device_config_id, t_proof_of_play.asset_id
        Sort Method: quicksort  Memory: 5443kB
        ->  Hash Join  (cost=2629.37..7169.41 rows=50342 width=24) (actual time=15.416..112.175 rows=50000 loops=1)
              Hash Cond: (t_proof_of_play.device_endpoint_id = t_device_endpoint.id)
              ->  Bitmap Heap Scan on t_proof_of_play  (cost=2628.28..6224.41 rows=50342 width=20) (actual time=15.332..29.004 rows=50000 loops=1)
                    Recheck Cond: ((is_processed = 0) AND (is_revenue = 1))
                    ->  Bitmap Index Scan on process_revenue  (cost=0.00..2615.69 rows=50342 width=0) (actual time=15.081..15.081 rows=50000 loops=1)
                          Index Cond: ((is_processed = 0) AND (is_revenue = 1))
              ->  Hash  (cost=1.04..1.04 rows=4 width=12) (actual time=0.026..0.026 rows=4 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 1kB
                    ->  Seq Scan on t_device_endpoint  (cost=0.00..1.04 rows=4 width=12) (actual time=0.009..0.015 rows=4 loops=1)
Total runtime: 276.027 ms

Tabulated version on explain.depesz.com

Suggestions are welcome. Thanks in advance.

Regards

3 Answers


As @RichardHuxton points out, you are retrieving results at about 5 microseconds per row. It's hard to complain too much about that; but that rate proves that you have a very high cache hit ratio. You might want to try adjusting the cost factors on a connection to reflect that (using the SET command), and try the query again to see if you get a better plan.

SET cpu_tuple_cost = 0.03;
SET random_page_cost = 1;
SET effective_cache_size = '4GB';
-- (your query here)
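To see whether those settings actually produce a different plan, and to confirm the cache hit rate, you can add BUFFERS to EXPLAIN ANALYZE in the same session. As a sketch, using a simplified query over the same predicate (substitute your real query):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM t_proof_of_play
WHERE is_processed = 0 AND is_revenue = 1;
-- "Buffers: shared hit=..." with little or no "read=..." means the data is already cached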

Of course, @regilero is right about shared_buffers -- on Windows it must be kept small.

Even fairly simple queries rarely take less than 2 microseconds per result row, so on a query with a join, a sort, and an aggregate you don't have a whole lot of room for improvement.


4 Comments

Setting random_page_cost to a lower value is the standard way of forcing the planner to switch from hashjoins to index-based joins. Another way is lowering work_mem.
Well, that is one effect of making the change; but typically my main reason for making it is to better model actual costs, so that the plan that the planner thinks is cheapest really is. When random costs are high it tends to be faster to sequentially read the table into a hashtable in RAM and find rows in the hashtable, while if random access through the index is really not more expensive than sequential reads it is often cheaper to just use the index.
Of course all estimates are just educated guesses, and lying to the optimiser is a great way to force it in one direction (random_page_cost=1.x is not a lie these days). I still think there is an underlying reason why the aggregate (x JOIN y) type of query shows this behaviour (there have been several similar questions here in the last few months). There might be something wrong with the cost calculation for the aggregation / sorting in the outer query.
@wildplasser: We're drifting a little off-topic because I don't think 5us per row necessarily indicates a bad choice of plan for this query; but the problem you are referencing is real, and caused by correlations between values which the planner cannot yet recognize. People have been working on it, but no proposed design for a solution has been accepted yet.

It can be difficult to pick out the important figures from a raw EXPLAIN. Luckily, a long-time community member built a useful tool to tabulate it for us - I added a link to depesz.com.

Firstly, I can't see how changing shared_buffers would have any effect on any particular query, and in any case you didn't provide the values you used. It's a general setting that shifts the overall balance of memory usage on the machine; you would tune it based on the overall workload rather than on a particular query.

The three major parts of the query are the join, the sort and the aggregate. That seems reasonable.

I don't know what aggregate you're running (you didn't give the query). Presumably you're summarising over day/hour on play_timestamp (based on the sort). Do you need both levels?
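Reading the sort keys back out of the plan, I'd guess the query has roughly this shape (the aggregate itself and the exact select list are guesses on my part):

SELECT date(p.play_timestamp)               AS play_date,
       date_trunc('hour', p.play_timestamp) AS play_hour,
       d.venue_hierarchy_id,
       d.device_config_id,
       p.asset_id,
       count(*)                             AS plays   -- or sum(...) etc.
FROM t_proof_of_play p
JOIN t_device_endpoint d ON d.id = p.device_endpoint_id
WHERE p.is_processed = 0
  AND p.is_revenue = 1
GROUP BY 1, 2, 3, 4, 5;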

For your fairly detailed sort (5 columns, two based on function calls), as long as work_mem is big enough for the sort there's not much more to do there.
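Your plan already shows the sort fitting in memory ("Sort Method: quicksort Memory: 5443kB"), so work_mem isn't the problem here. If it ever reported an external sort instead, a per-session increase would be the first thing to try, for example (the value is only illustrative):

SET work_mem = '16MB';
-- then re-run EXPLAIN ANALYZE and check the "Sort Method" line:
-- "quicksort  Memory: ..." means in-memory, "external merge  Disk: ..." means it spilled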

So - we might be able to improve the join. It's possible that a partial index might help. Something like:

CREATE INDEX idx1 ON t_proof_of_play (device_endpoint_id) WHERE is_processed=0 AND is_revenue=1;

Of course, nothing is free and you'll be paying the maintenance costs on this index whenever you update t_proof_of_play.

Even then, if you halve the join time that is only going to reduce the overall time from 276ms to about 235ms. You're processing 50,000 rows in about a quarter of a second - that's roughly 5 microseconds per row.

2 Comments

My conclusion is that the date used for aggregation and the device_endpoint used for joining are more or less orthogonal. A hash join spoils any structure in an inner query. If both dimensions are sufficiently indexed, forcing an index join in the inner query could help the outer sort/aggregation step to use available indexes, if any. (BTW: the aggregation in the OQ does almost nothing: it reduces the count from 50,000 to 49,829.)
Thank you, Richard Huxton, for the valuable insight.

To complement @Richard Huxton's answer:

As stated in the documentation, on Windows you cannot make effective use of a very large shared_buffers:

Also, on Windows, large values for shared_buffers aren't as effective. You may find better results keeping the setting relatively low and using the operating system cache more instead. The useful range for shared_buffers on Windows systems is generally from 64MB to 512MB.

So you should set this parameter to at most 512MB; PostgreSQL will not be able to make good use of more shared memory on Windows. In my opinion Windows is not the best OS for a PostgreSQL server. You could also try setting effective_cache_size to 50% or 75% of available memory (so maybe something like 2 or 4GB). This is a hint for PostgreSQL, telling it that the OS will certainly use the available RAM to cache recently used files. That way the query planner may decide that accessing data on disk is not as costly as it usually thinks.
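As a concrete sketch for the 8GB machine in the question (the values are only suggestions): shared_buffers has to be changed in postgresql.conf and needs a restart, while effective_cache_size can be tried per session first.

-- in postgresql.conf: shared_buffers = 512MB   (then restart the server)
SHOW shared_buffers;               -- confirm the value actually in effect
SET effective_cache_size = '4GB';  -- planner hint only, safe to test per session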
