
I have quite a massive query that I want to optimize. It consists of one base table and five LEFT JOINs. The query takes 0.3428 sec to complete (Results: 4,340 total, Query took 0.3428 sec). I am working with about 10,000 entries, and that number will definitely grow.

Now the query by itself is not the problem; the IN clauses are the biggest problem.

  1. I have two IN clauses.
  2. Both are in the WHERE clause.
  3. For this specific page load both contain a large number of IDs: 3,344 entries, e.g. (99, 1, 5, 8458, ...).
  4. Both IN clauses contain the same set of 3,344 IDs, e.g. ((cf.catid IN ( 99, 1, 5, 8458, ... ) AND cf.cid=c.id) OR p.category IN ( 99, 1, 5, 8458, ... )).

The query looks like this:

SELECT 
    p.id, c.id AS pCid, c.name AS cName, p.name, p.seo, 
    p.description AS pDescription, cd.description,
    p.category, p.archive, cf.catid, cf.pid, p.order_nr, 
    c.order_nr AS cOrder, c.seo AS cSeo, cat.name AS catName, 
    cat.order_id, pr.price, pr.sale_price, pr.sale_expiry,
    IF( pr.sale_price > 0, pr.sale_price, pr.price ) AS `oPrices`,
    pr.member_price, p.`set`, p.get_the_look,
    c.from_text_price, c.thumb, c.code AS colour_code, 
    p.code AS product_code, p.supplier_part_number, 
    p.oem_part_number, p.make, p.model, p.year, p.sub_model
FROM 
    products p
    LEFT JOIN category_featured cf ON p.id=cf.pid
    LEFT JOIN colours c ON c.pid=p.id
    LEFT JOIN colour_descriptions cd ON c.id=cd.colour_id
    LEFT JOIN category cat ON cat.id=p.category
    LEFT JOIN pricing pr ON pr.cid=c.id
WHERE 
    (
        (cf.catid IN ( .. 3344 ID entries .. ) AND cf.cid=c.id) OR p.category IN ( .. 3344 ID entries .. )
    )
    AND p.archive='0'
    AND p.status='1'
    AND c.status='1'
    AND c.archive='0'
    AND cat.status IN (1,2)
GROUP BY `c`.`id`
ORDER BY `oPrices` DESC

Is there a better way to check for specific IDs in a table than the IN clause, or perhaps a different kind of check altogether?

Speed is the main issue here; I want to achieve the best performance possible.

Here is what I have done so far and how some things are set up:

  • I created indexes for those tables (only the INT columns used in this query are indexed); a rough sketch of what those definitions look like follows after this list.
  • Some tables are MyISAM and some are InnoDB (other tables that are not used in the query have relations with a few tables that are in the query, so those had to be InnoDB).
  • No relations (foreign keys) exist between the tables in the query.
  • I run the query from PHP using MySQLi.
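
To illustrate the first point, the single-column indexes are along these lines (the index names here are just examples, not my exact definitions):

CREATE INDEX idx_cf_catid   ON category_featured (catid);
CREATE INDEX idx_cf_pid     ON category_featured (pid);
CREATE INDEX idx_c_pid      ON colours (pid);
CREATE INDEX idx_p_category ON products (category);
CREATE INDEX idx_pr_cid     ON pricing (cid);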

Thanks

UPDATE:

I noticed why the query is so slow: the new column I compute with the IF statement (oPrices), combined with ORDER BY oPrices DESC, is what makes the query slow. Once I remove it, the query takes only 0.00009 sec, which is amazing! But then I won't get correctly ordered data, and even if I do the ordering in PHP I will have to write a new pagination function, which is not ideal.
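
One workaround I am considering (untested; the column and index names below are made up) is to store the effective price in a real column and index it, so the ORDER BY can use an index instead of sorting a computed expression:

-- Hypothetical precomputed column (the type is an assumption):
ALTER TABLE pricing ADD COLUMN effective_price DECIMAL(10,2);

-- One-off backfill; it would have to be kept in sync whenever
-- price or sale_price changes:
UPDATE pricing
SET effective_price = IF(sale_price > 0, sale_price, price);

CREATE INDEX idx_pr_effective_price ON pricing (effective_price);

-- The main query could then drop the IF() and end with:
-- ORDER BY pr.effective_price DESC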

  • Do you have composite indexes on the tables that are mentioned multiple times in your WHERE clause, e.g. products? Commented Sep 19, 2012 at 16:05
  • From where do these long lists of IDs come? Commented Sep 19, 2012 at 16:06
  • Hi Alain, yes, I have created indexes for all the table columns used in the WHERE clause, because all of them are INT. The IDs come from the page request that was made, so I know them from other functions. Commented Sep 19, 2012 at 16:14
  • I had asked about composite indexes. Assuming that there isn't a user selecting 3,000 items from a drop-down, if the values are in the database, can you recreate the list inside the database? Much better that way. Commented Sep 19, 2012 at 16:26
  • No, we are not using composite indexes; they make everything even slower. The problem is that I cannot create a list of those IDs in the database. From the looks of things, the best way is to insert them into a temporary table as @davek said, but as I haven't used temporary tables before I am not sure if I'll be able to use them correctly. Commented Sep 19, 2012 at 16:46

1 Answer


IN can make a query very difficult to optimize as the index may not be used (you can verify this by using EXPLAIN). An alternative approach would be to load these IDs into a temporary table and then perform a JOIN.
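
For example (a trimmed-down sketch based on the query in the question), prefixing the statement with EXPLAIN shows whether the IN lists prevent index usage:

-- The key and rows columns of the output show which index (if any)
-- is chosen for each table and how many rows MySQL expects to scan.
EXPLAIN
SELECT p.id, c.id AS pCid
FROM products p
    LEFT JOIN category_featured cf ON p.id = cf.pid
    LEFT JOIN colours c ON c.pid = p.id
WHERE (cf.catid IN (99, 1, 5, 8458) AND cf.cid = c.id)
   OR p.category IN (99, 1, 5, 8458);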

From this link:

http://explainextended.com/2009/08/18/passing-parameters-in-mysql-in-list-vs-temporary-table/

We see that for a large list of parameters, passing them in a temporary table is much faster than passing them as a constant list, while for small lists performance is almost the same.

Using a temporary table is the best way to pass large arrays of parameters in MySQL.
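
A minimal sketch of what that could look like here (the temporary table name and the join aliases are assumptions, not taken from the question):

-- Load the IDs once per connection; the PRIMARY KEY keeps them unique
-- and gives the join an index to use.
CREATE TEMPORARY TABLE tmp_ids (id INT NOT NULL, PRIMARY KEY (id));

-- One multi-row INSERT (built in PHP) instead of a 3,344-element IN list:
INSERT INTO tmp_ids (id) VALUES (99), (1), (5), (8458) /* , ... */;

-- Rewrite the two IN checks as joins against the temporary table:
SELECT p.id, c.id AS pCid, c.name AS cName, p.name
FROM products p
    LEFT JOIN category_featured cf ON p.id = cf.pid
    LEFT JOIN colours c ON c.pid = p.id
    LEFT JOIN tmp_ids ti_cf ON ti_cf.id = cf.catid AND cf.cid = c.id
    LEFT JOIN tmp_ids ti_p  ON ti_p.id = p.category
WHERE ti_cf.id IS NOT NULL OR ti_p.id IS NOT NULL;

-- The temporary table is visible only to the current connection and is
-- dropped automatically when the connection closes.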


1 Comment

Hi @davek, I haven't used MySQL temporary tables before, but from a quick look on Google it looks like it requires running extra queries just to create the temporary table and insert the data into it beforehand. Is that correct?
