0

So I tried to use a function I created into a query and when I execute the query it runs forever. The function finds the customer that his orders had the biggest quantity of products in it. The query returns him along with any other customer that has ordered the same amount of products.

Here is the function:

CREATE OR REPLACE FUNCTION max_orderQty ()
RETURNS bigint
AS $$
     SELECT SUM(OrderQty)
     FROM Customer,SalesOrderHeader,SalesOrderDetail
     WHERE Customer.CustomerID = SalesOrderHeader.CustomerID
     AND SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID
     GROUP BY SalesOrderHeader.CustomerID
     ORDER BY SUM(OrderQty) DESC LIMIT 1;
$$ LANGUAGE SQL;  

Here is the the query:

SELECT SUM(OrderQty),SalesOrderHeader.CustomerID
FROM Customer,SalesOrderHeader,SalesOrderDetail
WHERE Customer.CustomerID = SalesOrderHeader.CustomerID
AND SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID
GROUP BY SalesOrderHeader.CustomerID
HAVING SUM(OrderQty) = max_orderQty();

Also this works and it solves the same thing:(in 220ms)

SELECT SUM(OrderQty),SalesOrderHeader.CustomerID
FROM Customer,SalesOrderHeader,SalesOrderDetail
WHERE Customer.CustomerID = SalesOrderHeader.CustomerID
AND SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID
GROUP BY SalesOrderHeader.CustomerID
HAVING SUM(OrderQty) = (SELECT SUM(OrderQty)
                        FROM Customer,SalesOrderHeader,SalesOrderDetail
                        WHERE Customer.CustomerID = SalesOrderHeader.CustomerID
                        AND SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID
                        GROUP BY SalesOrderHeader.CustomerID
                        ORDER BY SUM(OrderQty) DESC LIMIT 1);

So when I tried to split it to function/query it just runs forever and I cannot find the reason why.And I thought using a function instead would be faster.Is that wrong?

Any help is very appreciated. Thanks in advance.

1
  • Show us the execution plan (ideally from explain analyze but as your query runs "forever" that is not an option) Commented Jan 23, 2015 at 19:51

1 Answer 1

1

There is no reason to think that a function would be faster. Your problem is that the function is being called for each row in the output -- and the calculation is being re-done each time.

You should be able to fix this by declaring the function to be STABLE; the default is VOLATILE.

You can read about these nuances in the documentation.

EDIT:

You can force it to run faster by only calling the function once:

SELECT SUM(OrderQty),SalesOrderHeader.CustomerID
FROM Customer JOIN
     SalesOrderHeader
     ON Customer.CustomerID = SalesOrderHeader.CustomerID JOIN
     SalesOrderDetail
     ON SalesOrderHeader.SalesOrderID = SalesOrderDetail.SalesOrderID CROSS JOIN
     (SELECT max_orderQty() as maxoq) x
GROUP BY SalesOrderHeader.CustomerID
HAVING SUM(OrderQty) = MAX(maxoq);
Sign up to request clarification or add additional context in comments.

3 Comments

It still runs forever with stable using it like: $$ LANGUAGE SQL STABLE; I am forced to use a function because they asked me to.And I thought it would be faster,but god was I wrong.
Thank you that worked.(in 230ms) But I can't really understand why yours calls the function only once and mine doesn't work.
@DieCriminal . . . Given that the function is declared to be stable, this might suggest that a feature is missing from the Postgres optimizer. There should be a way to get the optimizer to recognize, in your version, that the function only needs to be called once. But SQL optimizers are complex and are not perfect.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.