For the application we are developing, we need our searches to be accent insensitive and case insensitive, and to match partial words. For example, given the product name "La Niña" in our collection, each of the following searches should return the entry:
- La Niña
- niña
- nina
- nin
- La nin
So far I have tried two approaches, each with apparent limitations, based on testing and some research:
- regex search
- supports case insensitive and partial searches
- does not support accents, so niña != nina
- text search
- supports case insensitive, accents and partial phrases
- does not support partial words
Example regex search, as we have used:
function escapeRegExp(text) {
return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}
const escapedStr = escapeRegExp(searchTerm);
await Product.find({ name: new RegExp(`${escapedStr}`, 'i') });
Example text search, as we have used:
// On the schema
storeSchema.index({ name: 'text' });
// Searching:
await Product.find({ $text: { $search: searchTerm } })
  .collation({ locale: 'en', strength: 1 });
BTW, we have set the schemas in question to use collation strength level 1.
Some approaches I am considering, if MongoDB doesn't provide a solution:
- a shadow name field (not sure of the right term), storing the name with the accents removed
- a separate full text search engine
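To make the shadow field idea concrete, here is a rough sketch of what I have in mind. The names `foldAccents`, `buildShadowQuery` and `nameFolded` are hypothetical, and it assumes that stripping Unicode combining marks after NFD normalization is good enough for our product names:

```javascript
// Hypothetical helper: lower-case the text and strip accents by decomposing
// characters (NFD) and removing the combining marks (U+0300 to U+036F).
function foldAccents(text) {
  return text
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .toLowerCase();
}

// Same escaping helper as in the regex search example.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build the filter to run against the shadow field. `nameFolded` is an
// assumed field name that would hold foldAccents(name) for each product.
function buildShadowQuery(searchTerm) {
  const folded = foldAccents(searchTerm);
  return { nameFolded: new RegExp(escapeRegExp(folded)) };
}

// Check the searches from the question against the folded product name.
const stored = foldAccents('La Niña');
for (const term of ['La Niña', 'niña', 'nina', 'nin', 'La nin']) {
  console.log(term, buildShadowQuery(term).nameFolded.test(stored));
}
```

The filter would then be passed to `Product.find(...)`, and the shadow field would have to be kept in sync whenever the name changes, for example in a Mongoose `pre('save')` hook. The obvious downside is that an unanchored regex like this cannot make efficient use of an index.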
Can anyone help here?
Note: we are using mongoose 5.9.5, with node 12.16.2 and mongodb 4.3.8, running in mongo cloud.