I've got a collection "accounts" which contains documents similar to this structure:

{
    "email" : "[email protected]",
    "groups" : [
        {
            "name" : "group1",
            "contacts" : [
                { "localId" : "c1", "address" : "some address 1" },
                { "localId" : "c2", "address" : "some address 2" },
                { "localId" : "c3", "address" : "some address 3" }
            ]
        },
        {
            "name" : "group2",
            "contacts" : [
                { "localId" : "c1", "address" : "some address 1" },
                { "localId" : "c3", "address" : "some address 3" }
            ]
        }
    ]
}

Via

q = { "email" : "[email protected]", "groups" : { $elemMatch: { "name" : "group1" } } }
p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1" } } }
db.accounts.find( q, p ).pretty()

I successfully get just the group I'm interested in from the specified account.

Question: How can I get a limited list of "contacts" within a certain "group" of a specified "account"? Let's suppose I've got the following arguments:

  • account: email - "[email protected]"
  • group: name - "group1"
  • contact: array of localIds - [ "c1", "c3", "Not existing id" ]

Given these arguments I'd like to have the following result:

{
    "groups" : [
        {
            "name" : "group1", (might be omitted)
            "contacts" : [
                { "localId" : "c1", "address" : "some address 1" },
                { "localId" : "c3", "address" : "some address 3" }
            ]
        }
    ]
}

I don't need anything else apart from the resulting contacts.

Approaches

All queries try to fetch just one matching contact instead of a list of matching contacts, for the sake of simplicity. I've tried the following queries without any success:

p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1", "contacts" : { $elemMatch: { "localId" : "c1" } } } } }
p = { "groups.name" : 0, "groups" : { $elemMatch: { "name" : "group1", "contacts.localId" : "c1" } } }
not working: returns whole array or nothing depending on localId


p = { "groups.$" : { $elemMatch: { "localId" : "c1" } } }
error: {
    "$err" : "Can't canonicalize query: BadValue Cannot use $elemMatch projection on a nested field.",
    "code" : 17287
}


p = { "groups.contacts" : { $elemMatch: { "localId" : "c1" } } }
error: {
    "$err" : "Can't canonicalize query: BadValue Cannot use $elemMatch projection on a nested field.",
    "code" : 17287
}

Any help is appreciated!

1 Comment

Another for the viewers. This is how you "ask" here. Show what you have tried and give specific errors. Good way to ask. Commented Mar 11, 2015 at 9:08

2 Answers

2017 Update

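Such a well-put question deserves a modern response. The sort of array filtering requested can be done in modern MongoDB releases with just $match and $addFields pipeline stages, much like the query and projection of the original plain query operation. The $filter operator needs MongoDB 3.2 or later and $addFields needs 3.4; on 3.2 a $project stage can be used in its place.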

db.accounts.aggregate([
  { "$match": {
    "email" : "[email protected]",
    "groups": {
      "$elemMatch": { 
        "name": "group1",
        "contacts.localId": { "$in": [ "c1","c3", null ] }
      }
    }
  }},
  { "$addFields": {
    "groups": {
      "$filter": {
        "input": {
          "$map": {
            "input": "$groups",
            "as": "g",
            "in": {
              "name": "$$g.name",
              "contacts": {
                "$filter": {
                  "input": "$$g.contacts",
                  "as": "c",
                  "cond": {
                    "$or": [
                      { "$eq": [ "$$c.localId", "c1" ] },
                      { "$eq": [ "$$c.localId", "c3" ] }
                    ]
                  } 
                }
              }
            }
          }
        },
        "as": "g",
        "cond": {
          "$and": [
            { "$eq": [ "$$g.name", "group1" ] },
            { "$gt": [ { "$size": "$$g.contacts" }, 0 ] }
          ]
        }
      }
    }
  }}
])

This uses the $filter and $map operators to return only the array elements that meet the conditions, and it performs far better than using $unwind. Since the pipeline stages effectively mirror the "query" and "project" parts of a .find() operation, the performance here is basically on par with such an operation.
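For an arbitrary list of localIds, the hard-coded $or of $eq tests can probably be swapped for the aggregation form of $in, which needs MongoDB 3.4 or later (the same requirement as $addFields). The following is only a sketch under that assumption: the helper name buildContactsPipeline and the placeholder email are made up for illustration and are not part of the original answer, and the null entry is left out as suggested in the comments.

```javascript
// Hypothetical helper (illustration only): builds the two-stage pipeline
// for any email, group name, and list of contact localIds.
function buildContactsPipeline(email, groupName, localIds) {
  return [
    // Stage 1: select the account holding the group with at least one wanted contact
    { $match: {
      email: email,
      groups: {
        $elemMatch: { name: groupName, "contacts.localId": { $in: localIds } }
      }
    }},
    // Stage 2: rewrite "groups", keeping only the matching group and contacts
    { $addFields: {
      groups: {
        $filter: {
          input: {
            $map: {
              input: "$groups",
              as: "g",
              in: {
                name: "$$g.name",
                contacts: {
                  $filter: {
                    input: "$$g.contacts",
                    as: "c",
                    // Aggregation $in (3.4+) stands in for the $or of $eq tests
                    cond: { $in: [ "$$c.localId", localIds ] }
                  }
                }
              }
            }
          },
          as: "g",
          cond: {
            $and: [
              { $eq: [ "$$g.name", groupName ] },
              { $gt: [ { $size: "$$g.contacts" }, 0 ] }
            ]
          }
        }
      }
    }}
  ];
}

// Placeholder email; substitute the real account address.
const pipeline = buildContactsPipeline(
  "account@example.com", "group1", [ "c1", "c3", "Not existing id" ]
);
```

In the mongo shell the result would then be passed along as db.accounts.aggregate(pipeline). Ids that match no contact simply contribute nothing to the filtered array.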

Note that when the intention is to work "across documents", bringing details together from "multiple" documents rather than "one", this usually requires some type of $unwind operation so that the array items become accessible for "grouping".


This is basically the approach:

db.accounts.aggregate([
    // Match the documents by query
    { "$match": {
        "email" : "[email protected]",
        "groups.name": "group1",
        "groups.contacts.localId": { "$in": [ "c1","c3", null ] },
    }},

    // De-normalize nested array
    { "$unwind": "$groups" },
    { "$unwind": "$groups.contacts" },

    // Filter the actual array elements as desired
    { "$match": {
        "groups.name": "group1",
        "groups.contacts.localId": { "$in": [ "c1","c3", null ] },
    }},

    // Group the intermediate result.
    { "$group": {
        "_id": { "email": "$email", "name": "$groups.name" },
        "contacts": { "$push": "$groups.contacts" }
    }},

    // Group the final result
    { "$group": {
        "_id": "$_id.email",
        "groups": { "$push": {
            "name": "$_id.name",
            "contacts": "$contacts" 
        }}
    }}
])

This is "array filtering" on more than a single match, which the basic projection capabilities of .find() cannot do.

You have "nested" arrays, therefore you need to process $unwind twice, along with the other operations.
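To see why two $unwind stages are needed, it may help to walk through the stages in plain JavaScript. This is only an emulation of the idea, not how the server implements it: the unwind and setPath helpers are hypothetical stand-ins, and the email is a placeholder.

```javascript
// Sample document from the question (placeholder email).
const account = {
  email: "account@example.com",
  groups: [
    { name: "group1", contacts: [
      { localId: "c1", address: "some address 1" },
      { localId: "c2", address: "some address 2" },
      { localId: "c3", address: "some address 3" }
    ]},
    { name: "group2", contacts: [
      { localId: "c1", address: "some address 1" },
      { localId: "c3", address: "some address 3" }
    ]}
  ]
};

// Return a copy of doc with the value at the given key path replaced.
function setPath(doc, keys, value) {
  if (keys.length === 0) return value;
  const [head, ...rest] = keys;
  return { ...doc, [head]: setPath(doc[head], rest, value) };
}

// Stand-in for $unwind: emit one copy of each document per array element.
function unwind(docs, path) {
  const keys = path.split(".");
  return docs.flatMap(doc => {
    const arr = keys.reduce((value, key) => value[key], doc);
    return arr.map(item => setPath(doc, keys, item));
  });
}

const wanted = [ "c1", "c3", "Not existing id" ];

// Two unwinds flatten the nested arrays: 3 + 2 contacts give 5 flat documents.
const unwound = unwind(unwind([account], "groups"), "groups.contacts");

// Stand-in for the second $match: each flat document now holds one contact.
const matched = unwound.filter(d =>
  d.groups.name === "group1" && wanted.includes(d.groups.contacts.localId));

// Stand-in for the two $group stages: collect contacts per (email, group name).
const byGroup = new Map();
for (const d of matched) {
  const key = JSON.stringify([d.email, d.groups.name]);
  if (!byGroup.has(key)) byGroup.set(key, { name: d.groups.name, contacts: [] });
  byGroup.get(key).contacts.push(d.groups.contacts);
}
const result = { _id: account.email, groups: [...byGroup.values()] };
```

After a single unwind, groups.contacts would still be an array, so the second $match could only keep or drop whole groups; the second unwind is what makes each contact individually filterable.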

5 Comments

I'm just curious about the 3rd argument within the first stage's matching criteria ("groups.contacts.localId": { "$in": [ "c1","c3", null ] }). This seems to be applied to all my groups within a certain account, thus checking group contacts I'm not interested in. There will always be at least one matching contact within the given group due to my business case. Based upon this condition, I would omit the 3rd argument, in stage 1, in favor of increased performance. Or am I wrong?
@cbopp Well, if you consider the criteria you asked for, then this is an exact representation. The null match occurs when the field is not actually present or actually contains that value in some form, single value or array. 99.999% of the time you don't want to ask for a null match and just accept that "supplied values" are all you need. I would strongly suggest you follow that and just use the distinct values without null.
Okay, thanks a lot. I'll mark your answer as the correct answer, since it is working as intended with a nice explanation. Everyone else ain't matching the email before unwinding, which probably results in unwinding my whole collection :'(
@cbopp There is a bit more wrong with the other responses than just that. I tried to let them understand that point, but sometimes people just don't want to listen. Anyhow, glad you see the road ahead and took something away from this. That's the point.
@Neil Lunn Could you please write the same query in Java? It would be more useful. Thanks in advance.

You could use the $unwind operator of the aggregation framework. For example:

db.accounts.aggregate([
    { $unwind: '$groups' },
    { $unwind: '$groups.contacts' },
    { $match: {
        email: '[email protected]',
        'groups.name': 'group1',
        'groups.contacts.localId': { $in: ['c1', 'c3', 'whatever'] }
    }}
]);

Should give the following result:

{ "_id" : ObjectId("5500103e706342bc096e2e14"), "email" : "[email protected]", "groups" : { "name" : "group1", "contacts" : { "localId" : "c1", "address" : "some address 1" } } }
{ "_id" : ObjectId("5500103e706342bc096e2e14"), "email" : "[email protected]", "groups" : { "name" : "group1", "contacts" : { "localId" : "c3", "address" : "some address 3" } } }

If you want only one object, you can then use the $group operator.

1 Comment

That probably should have been a comment at best. You don't actually explain how to get to the result.
