4

I have the following kind of json in my database:

{ 
    "_id" : "519817e508a16b447c00020e", "keyword" : "Just an example query", 
    "results" : 
    {
        "1" : {"base_domain" : "example1.com", "href" : "http://www.example1.com/"},
        "2" : { "base_domain" : "example2.com", "href" : "http://www.example2.com/"},
        "3" : { "base_domain" : "example3.com", "href" : "http://www.example3.com/"},
        "4" : { "base_domain" : "example4.com", "href" : "http://www.example4.com/"},
        "5" : { "base_domain" : "example5.com", "href" : "http://www.example5.com/"},
        "6" : { "base_domain" : "example6.com", "href" : "http://www.example6.com/"},
        "7" : { "base_domain" : "example7.com", "href" : "http://www.example7.com/"},
        "8" : { "base_domain" : "example8.com", "href" : "http://www.example8.com/"},
        "9" : { "base_domain" : "example9.com", "href" : "http://www.example9.com/"},
        "10" : { "base_domain" : "example10.com", "href" : "http://www.example10.com/"}
    } 
}

My goal is to have results for the following query:

> db.ranking.find({ $text: { $search: "http://www.example9.com"}})

It works when I create an index on all text fields

> db.ranking.ensureIndex({ "$**": "text" }))

But not when I create an index only over "results" field:

> db.ranking.ensureIndex( {"results" : "text"} )

Why?

1 Answer 1

4

The problem is that "results" is not a field, it's a sub-document. The syntax for creating an index on text fields for MongoDB requires either the notation for all fields, "$*", which you are using correctly, or a list of all text fields:

Create a text Index

You can create a text index on the field or fields whose value is a string or an array of string elements. When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**).

http://docs.mongodb.org/manual/tutorial/create-text-index-on-multiple-fields/

In your case that would look like:

db.ranking.ensureIndex(
                           {
                             "keyword": "text",
                             "results.1.href": "text",
                             "results.1.href": "text",
                             "results.2.href": "text",
                             "results.3.href": "text",
                             "results.4.href": "text",
                             "results.5.href": "text",
                             "results.6.href": "text",
                             "results.7.href": "text",
                             "results.8.href": "text",
                             "results.9.href": "text",
                             "results.10.href": "text"
                           }
                       )
Sign up to request clarification or add additional context in comments.

5 Comments

Presumably this could be made a lot more efficient if results were an array.
Very interesting! Indexing the first href also worked for the rest of them: db.ranking.ensureIndex({"results.1.href": "text"}) Now, is there a way to retrieve directly the key and the value without looping through all the dictionary? @JohnBarça Yes, I wish it were an array!
@JohnBarça You mentioned using an array. Do you mean changing the db structure for this: { "_id" : "519817e508a16b447c00020e", "keyword" : "Just an example query", "results" : [{"base_domain" : "example1.com", "href" : "http://www.example1.com/"},{"base_domain" : "example2.com", "href" : "http://www.example2.com/"},...}] The index would be hashed or text too?
@JohnBarça Any additional hints?
I just asked another question since I ran into an Index key pattern too large error with code 67 because I have 100 results and not 10!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.