I have to solve a problem for my class on query optimization in PostgreSQL.

I have to optimize the following query.

"The query determines the yearly loss in revenue if orders just with a quantity of more than the average quantity of all orders in the system would be taken and shipped to customers."

select  sum(ol_amount) / 2.0 as avg_yearly
from    orderline, (select   i_id, avg(ol_quantity) as a
            from     item, orderline
            where    i_data like '%b'
                 and ol_i_id = i_id
            group by i_id) t
where   ol_i_id = t.i_id
    and ol_quantity < t.a

Is it possible to optimize this query through indexes or some other means? (A materialized view is an option as well.)

The execution plan can be found here. Thanks.

  • The problem is the like '%b'. You can't use an index for that condition because you are telling SQL to find something that has a 'b' at the end, and you don't know how it starts. Commented Dec 10, 2014 at 20:06
  • But according to the execution plan, the sequential scan that handles this condition is very fast, so I don't think the problem is there. Commented Dec 10, 2014 at 20:18
  • Are you trying to optimize the query or answer the question with the best query? Those are two very different answers. Commented Dec 10, 2014 at 20:20
  • Well, it's very hard to tell where the problem is without any sample data and expected result. I'm just guessing here. Commented Dec 10, 2014 at 20:21
  • Also, you should post the table structure(s). Commented Dec 10, 2014 at 20:22

2 Answers

1

First, if you have to search from the end of the data, simply create an index on the reverse of the data:

create index on item (reverse(i_data));

Then query it like so:

select  sum(ol_amount) / 2.0 as avg_yearly
from    orderline, (select   i_id, avg(ol_quantity) as a
            from     item, orderline
            where    reverse(i_data) like 'b%'
                 and ol_i_id = i_id
            group by i_id) t
where   ol_i_id = t.i_id
    and ol_quantity < t.a
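
One caveat with the reverse index: unless the database uses the C locale, a plain B-tree index cannot serve a LIKE 'b%' condition, and the index needs the text_pattern_ops operator class instead. The sketch below shows that variant and a way to check whether the planner actually picks it up. The index name item_i_data_rev_idx is made up for illustration, and whether the index gets used at all depends on how many rows end in 'b'.

```sql
-- Assumption: a non-C collation, so text_pattern_ops is required for
-- LIKE prefix matching. reverse() returns text, hence text_pattern_ops.
-- item_i_data_rev_idx is a hypothetical name.
CREATE INDEX item_i_data_rev_idx
    ON item (reverse(i_data) text_pattern_ops);

-- Refresh planner statistics, including those for the indexed expression.
ANALYZE item;

-- Look for an Index Scan or Bitmap Index Scan on item_i_data_rev_idx.
-- A Seq Scan here usually means the condition matches too many rows
-- for the index to pay off.
EXPLAIN (ANALYZE, BUFFERS)
SELECT i_id
FROM   item
WHERE  reverse(i_data) LIKE 'b%';
```

If the plan still shows a sequential scan on item and that scan is already fast, the index is unlikely to be where the time goes.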

1 Comment

Also, you may have a Cartesian join (or be missing a join): you have two tables in one of the FROM clauses, yet no joining columns. This could be correct or incorrect; without seeing your data layout, there's no way of knowing.
0

Remember that an index may not speed up the query when you have to retrieve something like 30% of the table. In that case a bitmap index might help, but as far as I remember it is not available in Postgres. So think about which table to index: it may be worth indexing the big table on ol_i_id, since the join only needs to match less than 10% of the big table and the small table is loaded into RAM (I might be mistaken here, but at least in SAS a hash join means that the smaller table is loaded into RAM).
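
The index on the join column suggested above could look like the following. This is only a sketch: both index names are invented, and the wider second index is an untested assumption that might allow index-only scans, since it contains every orderline column the query reads.

```sql
-- Plain index on the join column of the big table.
-- orderline_ol_i_id_idx is a hypothetical name.
CREATE INDEX orderline_ol_i_id_idx
    ON orderline (ol_i_id);

-- Optional alternative (assumption, measure before keeping it):
-- a wider index covering all columns the query touches.
-- Index-only scans also need an up-to-date visibility map (VACUUM).
CREATE INDEX orderline_item_qty_amount_idx
    ON orderline (ol_i_id, ol_quantity, ol_amount);

-- Refresh planner statistics after creating the indexes.
ANALYZE orderline;
```

Comparing EXPLAIN (ANALYZE) output before and after is the only reliable way to tell which of the two, if either, helps on the real data.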

You may try aggregating the data before doing any joins and reusing the grouped data. I assume that you need to do everything in one query without explicitly creating any staging tables by hand. Also, I have recently been working a lot with SQL Server, so I may mix up the syntax, but give it a try. I have made many assumptions about the data and the structure of the tables, but hopefully it will work.

WITH GrOrderline AS (
  -- Pre-aggregate orderline once per (item, quantity) pair
  SELECT ol_i_id, ol_quantity, SUM(ol_amount) AS Yearly, COUNT(*) AS cnt
  FROM orderline
  GROUP BY ol_i_id, ol_quantity
),
AvgOrderline AS (
  -- Average quantity per item, weighted by the row count of each group,
  -- restricted to items whose i_data ends in 'b'
  SELECT
    o.ol_i_id, SUM(o.ol_quantity * o.cnt)::numeric / SUM(o.cnt) AS AvgQ
  FROM GrOrderline AS o
  INNER JOIN item AS i ON (o.ol_i_id = i.i_id AND RIGHT(i.i_data, 1) = 'b')
  GROUP BY o.ol_i_id
)
SELECT SUM(o.Yearly) / 2.0 AS avg_yearly
FROM GrOrderline AS o
INNER JOIN AvgOrderline AS a ON (o.ol_i_id = a.ol_i_id AND o.ol_quantity < a.AvgQ);
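
The question mentions that a materialized view is allowed. If staging objects are acceptable after all, the per-item average could be computed once and reused. What follows is a rough sketch, assuming PostgreSQL 9.3 or later; the view name item_avg_quantity is made up, and the view body is simply the subquery from the original query.

```sql
-- Precompute the average quantity per qualifying item.
-- item_avg_quantity is a hypothetical name.
CREATE MATERIALIZED VIEW item_avg_quantity AS
SELECT i_id, avg(ol_quantity) AS a
FROM   item
JOIN   orderline ON ol_i_id = i_id
WHERE  i_data LIKE '%b'
GROUP  BY i_id;

-- One row per item, so a unique index on i_id is safe and helps the join.
CREATE UNIQUE INDEX ON item_avg_quantity (i_id);

-- The outer query then only joins orderline against the small view.
SELECT sum(ol_amount) / 2.0 AS avg_yearly
FROM   orderline
JOIN   item_avg_quantity t ON ol_i_id = t.i_id
WHERE  ol_quantity < t.a;

-- The view is a snapshot; rerun this after item or orderline change.
REFRESH MATERIALIZED VIEW item_avg_quantity;
```

The trade-off is staleness: the result is only as current as the last refresh, so this fits best when orderline changes rarely relative to how often the query runs.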

1 Comment

In fact, creating an index on the ol_i_id column improved the performance of this query a lot :)
