1

I have a MySQL database with a fairly large table where the products are. Each of them has its own id and categoryId field where there is a category id belongs to this product. Now I have a query that pulls out products from given categories such as:

SELECT * FROM products WHERE categoryId IN ( 1, 2, 3, 4, 5, 34, 6, 7, 8, 9, 10, 11, 12 )

Of course, come a WHERE clause and ORDER BY sort but not in this thing. Let's say that these products is 250k and the visits are over 100k per day. Under such conditions in the table slow_log registered weight of these queries with large generation time.

Do you have any ideas how to optimize the given problem?

Table engine is MyISAM.

1
  • You do have an index on categoryId, right? Commented Jun 13, 2014 at 20:30

1 Answer 1

2

Index on categoryId won't help in this case, IN (...) queries will produce sequence scan instead of index lookup anyway.

I would consider first redesigning the system to get rid of multiple category select and if it is not appropriate, caching query results.

For example, you can create a helper table items_category_groups(hash, item_id) and after client query on multiple categories hash their combined ids and lookup this table. If not found, make an expensive query and fill this table. If found, make a cheap query joining these tables. Other caching tools like memcached will work too.

Sign up to request clarification or add additional context in comments.

7 Comments

good idea but this query is used to pulls out product for given category and its subcategories (whole branch of this category) so it is imposible
Then caching is the only way I see, although the cache size will be large (250k * count of category combinations), so I would store them in DB, not in memcache or something.
If you just want to pull out products from ONE category (and its subcategories), you can create a helper table (category_id, item_id) and fill it with all category_id-item_id pairs including subcategories. This table won't be very large (say, if you have category nesing level of 4, it will contain not more than 1m*number of categories two-int rows, it's not a big deal). Then you will fetch products with a single fast query using indexes.
but this is the same. I will also must use IN on the helper table
No. For example, you have category 1 with subcategory 2 and 2 has a subcategory 3. Item 1 belongs to category 3. Then you will have 3 rows for one item in this table: 1 1 | 1 2 | 1 3 And if you want to select all items from, say, category 2, you write, roughly: SELECT items.* FROM items i, item_categories ic WHERE i.id=ic.item_id AND ic.category_id=2; And it will be equal to SELECT * FROM items WHERE category_id IN (2, 3); Of course, if you want to select items from two not related categories, this won't work, but in case of nested categories it will work just fine.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.