2

I am running a quite ordinary MongoDB query, nothing really complex or special and I am wondering if the time it takes (> 1 sec) is normal or if there's something wrong with my indexes.

I provided an index for this specific query and explain() also tells me it is used, but it makes a full scan of the collection each time and slows down the whole webpage by > 1 sec.

The query:

db.tog_artikel.find({"art_filter":{"$exists":false},"$where":"this._id == this.art_hauptartikelnr"})

Explained:

> db.tog_artikel.find({"art_filter":{"$exists":false},"$where":"this._id == this.art_hauptartikelnr"}).explain()
{
    "cursor" : "BtreeCursor art_filter_1_art_hauptartikelnr_1",
    "nscanned" : 21306,
    "nscannedObjects" : 21306,
    "n" : 21306,
    "millis" : 1180,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "art_filter" : [
            [
                null,
                null
            ]
        ],
        "art_hauptartikelnr" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ]
    }
}

The index:

{
   "v": 1,
   "key": {
     "art_filter": 1,
     "art_hauptartikelnr": 1 
  },
   "ns": "togshop.tog_artikel",
   "background": true,
   "name": "art_filter_1_art_hauptartikelnr_1" 
}

Why is the full collection scanned every time? Why is isMultiKey false and how can I optimize this query/index?

Environment is a standalone server, MongoDB 2.0.1, 64-Bit Linux, accessed from PHP w/ php-mongo 1.2.6

1 Answer 1

8

Why is the full collection scanned every time?

It is not. It is going through the index:

"cursor" : "BtreeCursor art_filter_1_art_hauptartikelnr_1",

This means that the index "art_filter_1_art_hauptartikelnr_1" is used to satisfy the $exists condition.

If that filter is not very selective (i.e. there are many records that satisfy it), the query will still take a lot of time.

Why is isMultiKey false

It is false because no multikey index was used. A multikey index is an index that includes fields with arrays as values. It has nothing to do with the index being composite (i.e. having multiple fields).

 $where":"this._id == this.art_hauptartikelnr"

Your second condition is a Javascript expression, and an index cannot be used here (since the query analyzer does not understand what you are doing). Even if it did, you would need an index that also included _id.

how can I optimize this query/index?

Denormalize your data to have a new field "idIsHauptArtikelNr" with values true or false. Create an index on (art_filter, idIsHauptArtikelNr) , and replace your query with

  { art_filter :{ $exists :false}, idIsHauptArtikelNr : true }
Sign up to request clarification or add additional context in comments.

4 Comments

I'll add to that that although $exists does use an index these days it still has to walk most of the b-tree to satisfy it. In other words, you managed to use two of the worst operations ($exists and $where) in one query in terms of performance.
@RemonvanVliet: I was about to mention that, but then I thought a $exists:false should be relatively efficient on a non-sparse index (while a $exists:true does have to walk most of the tree). YMMV depending on data skew.
Very true. $exists:false is the more efficient approach due to the early out in tree walks. Regardless it's generally better to avoid $exists if you can and $where pretty much always ;)
Thanks a lot for your helpful reactions! :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.