
I have a table with close to 1 million rows that I need to query. I need to stack rank packages by the number of products they include from a given list of product IDs.

SELECT count(productID) AS commonProducts, packageID
FROM supply
WHERE productID IN (2,3,4,5,6,7,8,9,10) 
GROUP BY packageID 
ORDER BY commonProducts DESC
LIMIT 10

The query works fine, but I would like to improve upon it. I tried a multi-column index on productID and packageID, but it seemed to seek more rows than just having a separate index for each of the columns.
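
For readers following along, here is a minimal sketch of what such a multi-column index could look like. It assumes the table and column names from the query above; the index name idx_product_package is hypothetical, and the exact index the asker tried is not shown in the post.

```sql
-- Hypothetical index name. productID comes first so the IN (...) filter
-- can use it; adding packageID lets the query be answered from the index alone.
ALTER TABLE supply ADD INDEX idx_product_package (productID, packageID);

-- Re-check the plan after adding the index.
EXPLAIN
SELECT COUNT(productID) AS commonProducts, packageID
FROM supply
WHERE productID IN (2,3,4,5,6,7,8,9,10)
GROUP BY packageID
ORDER BY commonProducts DESC
LIMIT 10;
```

If the optimizer picks this index, the Extra column should include "Using index". The temporary table and filesort would likely remain, since the grouping is on packageID and the sort is on an aggregate.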

MySQL Explain

select_type: SIMPLE
table: supply
type: range
possible_keys: supplyID
key: supplyID
key_len: 3
ref: null
rows: 996
extra: Using where; Using temporary; Using filesort

My main concern is that the query is using a temporary table and a filesort. How could I go about optimizing this query? I presume that the biggest issue is count() and the ORDER BY on the results of count().

  • Do an "EXPLAIN" before "SELECT" to see how MySQL is processing the query for more insight. Commented Aug 16, 2011 at 16:44
  • @IOInterrupt What does the supplyID index look like? Commented Aug 16, 2011 at 17:03
  • @IOInterrupt Can one packageID have two productIDs with the same value? If so, is that a common situation? Commented Aug 16, 2011 at 17:37
  • The supplyID index is just an INDEX on the supplyID column. A packageID will never have the same productID more than once. However the productID could be associated with many packageID's. Commented Aug 16, 2011 at 18:08
  • Can you give us the DDL for creating your supply table? Commented Aug 16, 2011 at 19:19

3 Answers


You can remove the temp table using a Dependent Subquery:

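SELECT *
FROM (
    -- Count, per package, how many of the listed products it contains.
    -- productID is not selected here: it is neither grouped nor aggregated,
    -- so its value would be indeterminate (and rejected under ONLY_FULL_GROUP_BY).
    SELECT COUNT(s.productID) AS commonProducts, s.packageID
    FROM supply AS s
    WHERE EXISTS (
        -- Dependent subquery, correlated with the outer row on productID.
        SELECT 1
        FROM supply AS innerS
        WHERE innerS.productID IN (2,3,4,5,6,7,8,9,10)
          AND s.productID = innerS.productID
    )
    GROUP BY s.packageID
) AS t
ORDER BY t.commonProducts DESC
LIMIT 10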

The inner query is correlated with the outer query and preserves the index. You'll find that any query that sorts on commonProducts, including the one above, will use a filesort, because the result of count() is computed per group and cannot come from an index. But fear not: filesort is just MySQL's name for a sort that cannot be read off an index. MySQL can choose to use an efficient in-memory sort, and whether you do it now or as a merge sort on the way to an indexed temporary table, you have to pay for that sorting somewhere. This case is fairly cheap, though, because of the LIMIT: MySQL only needs the top 10 rows and does not have to fully sort the entire list of commonProducts.

Update

If this query is going to be run all the time, I would recommend (without getting too fancy) setting triggers on the supply table to update a legitimate table that tracks counters like this one.
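
A rough sketch of that idea follows. It only works under a strong assumption: the product list is fixed ahead of time (here the 2 to 10 list from the question). The table name package_common_counts, the trigger names, and the INT column types are all hypothetical, since the DDL for supply was never posted.

```sql
-- Hypothetical summary table: one row per package with a precomputed count.
CREATE TABLE package_common_counts (
    packageID      INT NOT NULL PRIMARY KEY,
    commonProducts INT NOT NULL DEFAULT 0,
    KEY idx_commonProducts (commonProducts)
);

-- One-time backfill from the existing rows (run before creating the triggers).
INSERT INTO package_common_counts (packageID, commonProducts)
SELECT packageID, SUM(IF(productID IN (2,3,4,5,6,7,8,9,10), 1, 0))
FROM supply
GROUP BY packageID;

-- Keep the counter current when a product is added to a package.
CREATE TRIGGER supply_after_insert
AFTER INSERT ON supply
FOR EACH ROW
    INSERT INTO package_common_counts (packageID, commonProducts)
    VALUES (NEW.packageID, IF(NEW.productID IN (2,3,4,5,6,7,8,9,10), 1, 0))
    ON DUPLICATE KEY UPDATE commonProducts = commonProducts + VALUES(commonProducts);

-- Keep the counter current when a product is removed from a package.
CREATE TRIGGER supply_after_delete
AFTER DELETE ON supply
FOR EACH ROW
    UPDATE package_common_counts
    SET commonProducts = commonProducts - IF(OLD.productID IN (2,3,4,5,6,7,8,9,10), 1, 0)
    WHERE packageID = OLD.packageID;

-- The ranking query then reads the precomputed counts.
SELECT packageID, commonProducts
FROM package_common_counts
ORDER BY commonProducts DESC
LIMIT 10;
```

With the index on commonProducts, the final query should be able to read the top rows in index order instead of sorting. Updates that change productID or packageID on an existing supply row are not handled in this sketch and would need an AFTER UPDATE trigger as well.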


Creating a temporary result set:

SELECT TMP.*
FROM (
    SELECT count(productID) AS commonProducts, packageID
    FROM supply
    WHERE productID IN (2,3,4,5,6,7,8,9,10)
    GROUP BY packageID
) AS TMP
ORDER BY commonProducts DESC
LIMIT 10

1 Comment

That seems to have the same issue. In fact, according to EXPLAIN, it increases the number of rows examined.

Perhaps it's not the most elegant way and I cannot guarantee it will be faster because everything depends on your particular data. But in some cases this gives much better results:

SELECT count(*) AS commonProducts, packageID
FROM (
    SELECT packageID FROM supply WHERE productID = 2
    UNION ALL
    SELECT packageID FROM supply WHERE productID = 3
    UNION ALL
    .
    .
    .
    SELECT packageID FROM supply WHERE productID = 10
) AS t
GROUP BY packageID
ORDER BY commonProducts DESC
LIMIT 10

1 Comment

Thank you for the suggestion, but the example I gave in the question is not how it would normally be run. Typically the productIDs can be very far apart and "random".
