
I have a collection of more than 40K documents, each containing an array of sub-documents. My aggregate query takes about 300 seconds. I have tried optimizing it using compound as well as multikey indexes, which brings it down to about 180 seconds.

I still need to reduce the query execution time further.

Here is a sample document from my collection:

{
    "_id" : ObjectId("545b32cc7e9b99112e7ddd97"),
    "grp_id" : 654,
    "user_id" : 2,
    "mod_on" : ISODate("2014-11-06T08:35:40.857Z"),
    "crtd_on" : ISODate("2014-11-06T08:35:24.791Z"),
    "uploadTp" : 0,
    "tp" : 1,
    "status" : 3,
    "id_url" : [
     {"mid":"xyz12793"},
     {"mid":"xyz12794"},
     {"mid":"xyz12795"},
     {"mid":"xyz12796"}
    ],
    "incl" : 1,
    "total_cnt" : 25,
    "succ_cnt" : 25,
    "fail_cnt" : 0
}

And the following is my query:

db.member_id_transactions.aggregate([
    { $match: { id_url: { $elemMatch: { mid: "xyz12794" } } } },
    { $unwind: "$id_url" },
    { $match: { grp_id: 654, "id_url.mid": "xyz12794" } }
])

Has anyone faced the same issue?

Here's the output of the aggregate query run with the explain option:

{
    "result" : [ 
        {
            "_id" : ObjectId("546342467e6d1f4951b56285"),
            "grp_id" : 685,
            "user_id" : 2,
            "mod_on" : ISODate("2014-11-12T11:24:01.336Z"),
            "crtd_on" : ISODate("2014-11-12T11:19:34.682Z"),
            "uploadTp" : 1,
            "tp" : 1,
            "status" : 3,
            "id_url" : [
            {"mid":"xyz12793"},
            {"mid":"xyz12794"},
            {"mid":"xyz12795"},
            {"mid":"xyz12796"}
            ],
            "incl" : 1,
            "__v" : 0,
            "total_cnt" : 21406,
            "succ_cnt" : 21402,
            "fail_cnt" : 4
        }
    ],
    "ok" : 1,
    "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0),
        "electionId" : ObjectId("545c8d37ab9cc679383a1b1b")
    }
}

1 Answer


One way to reduce the number of records being filtered in the later stages is to include the grp_id field in the first $match stage.

db.member_id_transactions.aggregate([
    { $match: { "grp_id": 654, "id_url.mid": "xyz12794" } },
    { $unwind: "$id_url" },
    { $match: { "id_url.mid": "xyz12794" } }
])

See how the performance is now; adding grp_id to the index should improve response time further.
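For example, a compound multikey index covering both fields of the first $match stage could look like this (a sketch in the mongo shell; the field order, grp_id before id_url.mid, is an assumption that fits this equality-only filter):

```js
// Compound multikey index on the two equality-filtered fields,
// so the first $match stage can be served from the index.
db.member_id_transactions.createIndex({ "grp_id": 1, "id_url.mid": 1 })
```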

The above aggregation query, though it works, is unnecessary. Since you are not altering the structure of the document, and you expect only one element in the array to match the filter condition, you could just use a simple find with a projection.

db.member_id_transactions.find(
    { "id_url.mid": "xyz12794", "grp_id": 654 },
    { "_id": 0, "grp_id": 1, "id_url": { $elemMatch: { "mid": "xyz12794" } },
      "user_id": 1, "mod_on": 1, "crtd_on": 1, "uploadTp": 1,
      "tp": 1, "status": 1, "incl": 1, "total_cnt": 1,
      "succ_cnt": 1, "fail_cnt": 1 }
)
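The $elemMatch projection keeps only the first array element that matches the condition. Its effect on the sample document can be sketched in plain JavaScript (projectElemMatch is a hypothetical helper for illustration, not part of the MongoDB API):

```javascript
// Sketch of $elemMatch projection semantics: keep only the FIRST
// element of the named array field that matches every key in cond.
function projectElemMatch(doc, arrayField, cond) {
    const match = doc[arrayField].find(el =>
        Object.keys(cond).every(k => el[k] === cond[k]));
    return { ...doc, [arrayField]: match ? [match] : [] };
}

const doc = {
    grp_id: 654,
    id_url: [
        { mid: "xyz12793" },
        { mid: "xyz12794" },
        { mid: "xyz12795" }
    ]
};

const projected = projectElemMatch(doc, "id_url", { mid: "xyz12794" });
console.log(projected.id_url); // [ { mid: 'xyz12794' } ]
```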

3 Comments

Agreed. Just to add. Quite particularly when you even "think" you need to use $unwind you "should" try to filter down the documents to your "minimal" matching set first, but that should be a general query rule. Clarifying the second statement, you don't need aggregate and therefore $unwind when you only expect "one" element in the array to match the filter condition. $elemMatch is also unnecessary here, as "id_url.mid": "xyz12794" will do. Same principle, in only "one" field being tested.
Nice explanation. I have updated my answer based on your comments (removed the unnecessary $elemMatch operator).
Thanks BatScream for the solution. Tried the find query and it works like a charm. Thanks a ton for saving my day!
