I am trying to create a mongo find query that takes a string input to use when searching for mongo documents, but I am unable to find a syntax that will give me all three requirements:

The document should be found if and only if the match is:

  • a "contains" match: the input string is contained within (or equal to) the field value.

  • case insensitive: it matches when the only difference is uppercase vs. lowercase letters.

  • diacritic sensitive: the input does NOT match letters with different diacritics, i.e. o is treated as different from ö.

Suppose I have these documents in my collection:

[
  {
    _id: <some object id>,
    title: 'home',
  },
  {
    _id: <some other object id>,
    title: 'HoMe',
  },
  {
    _id: <some other object id>,
    title: 'AllTheWayHome.',
  },
  {
    _id: <some other object id>,
    title: 'höme',
  }
]

The correct implementation for my project should return all the documents above except for the last one (since the diacritic makes it not match).
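
To make the expected behaviour concrete, here is a minimal plain-JavaScript sketch of the matching rule, applied in memory to the four sample titles. The helper name `matchesTitle` is hypothetical, and this is not a MongoDB query; it only illustrates which documents should come back for the input "home".

```javascript
// Hypothetical helper: true when `searchText` is contained in `title`,
// ignoring case but NOT ignoring diacritics.
function matchesTitle(title, searchText) {
  // Normalize to NFC so composed and decomposed forms of a letter compare equal.
  const haystack = title.normalize('NFC').toLowerCase();
  const needle = searchText.normalize('NFC').toLowerCase();
  return haystack.includes(needle);
}

const titles = ['home', 'HoMe', 'AllTheWayHome.', 'höme'];
const matched = titles.filter((title) => matchesTitle(title, 'home'));
console.log(matched); // [ 'home', 'HoMe', 'AllTheWayHome.' ]
```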

Here's what I've tried:

1. With "RegExp"

When creating a new "RegExp" object and using that as the query object, I am able to do a "contains" search, and the "i" on the end makes it case insensitive.

// "Contains" match; the 'i' flag makes it case insensitive.
const query = { title: new RegExp(`.*${searchText}.*`, 'i') };

return collection.find(query).toArray();

^ The issue with this approach is that I haven't found a way to make it diacritic sensitive.
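
One thing worth checking with this approach: `searchText` goes into the pattern unescaped, so input such as "home." or "a+b" is interpreted as regex syntax. Below is a minimal sketch that escapes the input first; `escapeRegExp` and `buildContainsQuery` are hypothetical names. The check at the end runs the pattern in plain JavaScript only, where `/home/i` does not match "höme"; I am assuming, not asserting, that the server-side regex engine treats the diacritic the same way.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex,
// so the user's input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a "contains", case-insensitive query on `title`.
// An unanchored pattern already matches anywhere in the string, so no
// leading or trailing `.*` is needed.
function buildContainsQuery(searchText) {
  return { title: new RegExp(escapeRegExp(searchText), 'i') };
}

const query = buildContainsQuery('home');
const titles = ['home', 'HoMe', 'AllTheWayHome.', 'höme'];
console.log(titles.filter((title) => query.title.test(title)));
// [ 'home', 'HoMe', 'AllTheWayHome.' ]
```

The resulting object can be passed to `collection.find(...)` in the same way as the query above.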

2. Using a "Regex Literal"

Interestingly, if I try this it never matches any documents (I would expect it to work the same as "new RegExp").

const query = { title: /.*${searchText}*./i }

return collection.find(query).toArray();

^ The result does not change with the input: I get the same outcome for every possible value of searchText.
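
The difference from approach 1 can be seen in plain JavaScript: `${...}` placeholders are only substituted inside template literals (backtick strings), not inside regex literals. In a regex literal the characters `$`, `{` and `}` are part of the pattern itself, so the pattern never sees the value of `searchText`. The sketch below uses only standard JavaScript and no MongoDB calls.

```javascript
const searchText = 'home';

// Regex literal: the placeholder text stays in the pattern verbatim.
const fromLiteral = /.*${searchText}*./i;
console.log(fromLiteral.source); // .*${searchText}*.

// RegExp constructor fed by a template literal: the value is substituted.
const fromTemplate = new RegExp(`.*${searchText}.*`, 'i');
console.log(fromTemplate.source); // .*home.*

// `$` asserts end of input, yet the literal text "{searchTex" must follow it,
// so the first pattern cannot match any title.
console.log(fromLiteral.test('home'));  // false
console.log(fromTemplate.test('HoMe')); // true
```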

3. Using "$text" and "$search"

After looking through the Mongo docs I found this handy "$text" syntax that could be used so I tried a query like this:

const query = {
  $text: {
    $search: searchText,
    $caseSensitive: false,
    $diacriticSensitive: true
  }
}

return collection.find(query).toArray();

^ This one, however, doesn't seem to do the "contains" search that I'm looking for (e.g. "home" does not match "AllTheWayHome." when it should).

After reading through some answers on this question, I found a highly upvoted comment saying that the "$text, $search" syntax CANNOT do "contains" searches, period (is this still true?):

no, in fact the text operator does not allow executing "contains", so it will only return exact word matches; the only option currently, as of 3.0, is to use regex, i.e. db.users.find( { username: /son/i } ); this one looks up every user containing "son" (case-insensitive)

Note: I have added a text index on the "title" field, but I am still getting the exact same results (it only matches exact strings and ignores diacritics).
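
The whole-word behaviour can be pictured with a rough plain-JavaScript illustration. This is a deliberately simplified stand-in, not MongoDB's actual tokenizer (which, among other things, also applies language-specific stemming and stop words); `tokenize` and `wordMatch` are hypothetical names. The point is only that a word-based index sees "AllTheWayHome." as a single term, so the term "home" is never present in it.

```javascript
// Simplified stand-in for a word-based index: lowercase the text and split it
// on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// Whole-word match: the search term must equal one of the tokens.
function wordMatch(title, term) {
  return tokenize(title).includes(term.toLowerCase());
}

console.log(tokenize('AllTheWayHome.'));            // [ 'allthewayhome' ]
console.log(wordMatch('AllTheWayHome.', 'home'));   // false
console.log(wordMatch('All the way home', 'home')); // true
```

Under this simplified model, a title written with spaces matches "home" while the run-together title does not.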


So, in conclusion I am totally baffled and would appreciate any advice on how to achieve the goal I'm going for here. 🙏

Thanks!

1 Answer

I tried the third approach and it is working for me.

The only thing I think you may have missed is adding a text index on the title field.

From the MongoDB $text docs:

$text performs a text search on the content of the fields indexed with a text index.

Add the index using the command below, then try running the query again.

db.collection.createIndex( { title: "text" } )

EDIT:

  • These are the documents I tried with.

(screenshot not available)

  • For testing purposes I inserted the records into a student collection. Here is a screenshot of the indexes on this collection.

(screenshot not available)

  • And here is the query and result set. It included "All the way home" and excluded "höme".

(screenshot not available)


8 Comments

Can you please share a working example? I have already added the index on this field and am still getting the results I described above. When I try to add your code I get an error saying the index already exists.
I have updated the question a bit. Can you confirm that it does a "contains" search and that the input "home" finds the document with title "AllTheWayHome."?
Normally I share working code by creating a Mongo playground, but this kind of query is not supported there, so I couldn't share it.
Yeah, I tried without spaces and it's not working. Partial/fuzzy search is not yet supported by MongoDB. There is a relevant improvement request for partial word matching that you can watch/upvote in the MongoDB issue tracker: jira.mongodb.org/browse/SERVER-15090
You are welcome, @Jim