Optimizing MySQL LIKE Query with Pattern Matching for Large Dataset (20M+ records)

I'm struggling with performance issues in MySQL while searching a large table containing over 20 million records. My current query, which uses the LIKE operator, is timing out:

SELECT * FROM my_table WHERE columnName LIKE '%key%';

Current Setup:

  • Database: MySQL
  • Table Size: ~20 million records
  • Search Pattern Requirements:
    • '%keyword%' (contains)
    • 'keyword%' (starts with)
    • '%keyword' (ends with)
  • Search keywords can include:
    • Numbers
    • Alphabets
    • Special characters (- and _)

What I've tried:

  1. Implemented FULLTEXT search:
ALTER TABLE my_table ADD FULLTEXT(columnName);
SELECT * FROM my_table WHERE MATCH(columnName) AGAINST('*keyword*' IN BOOLEAN MODE);
  2. Configured the ngram parser:
ngram_token_size=3
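
For reference, a minimal sketch of how an ngram index is usually declared, since the ALTER TABLE statement above does not name a parser. The index name ft_ngram is hypothetical, and the sketch assumes ngram_token_size=3 is already set in the server configuration, because it is a startup option that cannot be changed at runtime.

```sql
-- Assumption: ngram_token_size=3 is set under [mysqld] in the server config.
-- The index must be rebuilt if this value is changed later.

-- Attach the ngram parser explicitly; without WITH PARSER, the default
-- word-based parser is used and ngram_token_size has no effect.
-- (ft_ngram is a hypothetical index name.)
ALTER TABLE my_table
  ADD FULLTEXT INDEX ft_ngram (columnName) WITH PARSER ngram;

-- With the ngram parser, the server splits the search term into n-grams
-- itself, so the term is passed without * wildcards around it.
SELECT *
FROM my_table
WHERE MATCH(columnName) AGAINST('keyword' IN BOOLEAN MODE);
```

Keywords shorter than the token size, and keywords containing characters that boolean mode treats as operators (such as -), may still behave differently from LIKE, so the results are worth comparing on a sample before relying on them.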

Issues Faced:

  • FULLTEXT search doesn't return accurate results for all keyword patterns
  • Regular LIKE queries are too slow
  • Accuracy must be maintained while improving performance with FULLTEXT search as well
  • ngram_token_size=3 slows the query down for a few keywords

Questions:

  1. How can I tune FULLTEXT SEARCH to achieve accurate results similar to LIKE '%keyword%'?
  2. Are there any alternative approaches or indexing strategies for this use case?
  3. What would be the optimal configuration for the ngram parser to handle all these patterns?

Any help on optimizing this search while maintaining accuracy would be greatly appreciated.

  • Your question is almost a duplicate of one I answered here: stackoverflow.com/a/49810142/20860. The short answer is you can't use MySQL's FULLTEXT indexing to satisfy all your requirements. You may be able to do it using some other product, for example Elastic Search. Read my answer for details. Commented Apr 13 at 18:28
  • Is there anything else in the 20+ million records which you can use to reduce the number of records the LIKE comparison needs to be applied to? Something like a date, if you only need to look in records of the last 5 years. Or anything like that? Maybe use multiple columns? Basically anything that would prevent a full search on all records, which will indeed be slow. Anything in the search pattern that could help? Are you looking for whole words, for instance? Commented Apr 14 at 8:38
  • Pattern matching is for finding a pattern in any text data. Fulltext search, on the other hand, assumes that you have proper human text, like in a book or on most websites, that consists of words. As these techniques serve different purposes, you have to choose the one that fits yours. We do not know what your text or search keywords look like, so it is difficult for us to suggest a solution. Commented Apr 14 at 9:06
  • Thanks @BillKarwin !! Can't switch to Elastic Search etc. due to unavoidable decisions. Need to find some solution in MySQL only. Commented Apr 14 at 10:21
  • Any clarification on the "unavoidable decisions"? What are your restrictions? Can you add tables in MySQL? Can you work on the code that uses the result? What language is that code in? Clearly more information is needed for us to really help you. Commented Apr 14 at 10:32
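
Following up on the comments about avoiding a scan of all records: two of the three patterns in the question can be served by ordinary B-tree indexes inside MySQL. The sketch below is only an illustration under assumptions. It assumes columnName is a VARCHAR(255) and that MySQL 5.7 or later is available for generated columns. The names idx_col, columnName_rev, and idx_col_rev are hypothetical.

```sql
-- 'keyword%': a regular B-tree index can serve a prefix LIKE.
-- (idx_col is a hypothetical index name.)
ALTER TABLE my_table ADD INDEX idx_col (columnName);

SELECT * FROM my_table WHERE columnName LIKE 'keyword%';

-- '%keyword': store the reversed value in a generated column and turn the
-- suffix search into a prefix search on that column.
-- Assumption: columnName fits in VARCHAR(255).
ALTER TABLE my_table
  ADD COLUMN columnName_rev VARCHAR(255) AS (REVERSE(columnName)) STORED,
  ADD INDEX idx_col_rev (columnName_rev);

SELECT * FROM my_table
WHERE columnName_rev LIKE CONCAT(REVERSE('keyword'), '%');

-- _ is a single-character wildcard in LIKE, so it has to be escaped when the
-- keyword contains a literal underscore (- needs no escaping).
SELECT * FROM my_table WHERE columnName LIKE 'my\_key%';
```

Neither index helps with '%keyword%', which remains the hard case. When combining the reversed column with escaping, the escape characters would need to be added after reversing the keyword, not before.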
