I'm trying to count the distinct values of multiple fields with a single MongoDB aggregation query.

So here's my data:

{
    "car_type": "suv",
    "color": "red",
    "num_doors": 4
},
{
    "car_type": "hatchback",
    "color": "blue",
    "num_doors": 4
},
{
    "car_type": "wagon",
    "color": "red",
    "num_doors": 4
}

I want a distinct count of each field:

distinct_count_car_type=3
distinct_count_color=2
distinct_count_num_doors=1

I was able to group on multiple fields and then do a distinct count, but that only gave me a count for the first field, not for all of them. The data set is also large.

2 Answers

Running the following aggregate pipeline should give you the desired result:

db.collection.aggregate([
    {
        "$group": {
            "_id": null,
            "distinct_car_types": { "$addToSet": "$car_type" },
            "distinct_colors": { "$addToSet": "$color" },
            "distinct_num_doors": { "$addToSet": "$num_doors" }
        }
    },
    {
        "$project": {
            "distinct_count_car_type": { "$size": "$distinct_car_types" },
            "distinct_count_color": { "$size": "$distinct_colors" },
            "distinct_count_num_doors": { "$size": "$distinct_num_doors" }
        }
    }
])
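
If the field names are known only at run time, the same two stages can be generated rather than typed out. The following is a minimal sketch, not part of the original answer: `buildDistinctCountPipeline` is a hypothetical helper name, and it assumes plain top-level field names (no dots). It also sets `_id: 0` in the projection, as suggested in the comments.

```javascript
// Hypothetical helper (not a MongoDB API): builds the $group and $project
// stages shown above for an arbitrary list of top-level field names.
function buildDistinctCountPipeline(fields) {
  const group = { _id: null };   // a single group covering the whole collection
  const project = { _id: 0 };    // drop the null _id from the output
  for (const f of fields) {
    // Collect the unique values of each field into its own set...
    group["distinct_" + f] = { $addToSet: "$" + f };
    // ...then report only the size of that set.
    project["distinct_count_" + f] = { $size: "$distinct_" + f };
  }
  return [{ $group: group }, { $project: project }];
}

const pipeline = buildDistinctCountPipeline(["car_type", "color", "num_doors"]);
// In the mongo shell this would be run as: db.collection.aggregate(pipeline)
```

Note that each `$addToSet` array lives inside the single grouped document, so fields with a very large number of distinct values may hit the document size limit.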

3 Comments

nice, and you can also add _id: 0 in projection ;).
This works. Thanks! It slows things down a little, but not too much. I might need to ask a new question, but what if those objects had a dynamic number of fields (meaning I wouldn't be able to hard-code the field names ahead of time)? Is there a way to get distinct counts on any number of fields in my map? How might I do this?
@Deckard see below.

You're looking for the power of ... $objectToArray!

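db.foo.aggregate([
  // Turn each document into an array of {k: <field name>, v: <value>} pairs
  {$project: {x: {$objectToArray: "$$CURRENT"}}},
  // One output document per field/value pair
  {$unwind: "$x"},
  // Ignore _id, which is unique per document anyway
  {$match: {"x.k": {$ne: "_id"}}},
  // Collect the unique values seen for each field name
  {$group: {_id: "$x.k", y: {$addToSet: "$x.v"}}},
  // Add the number of unique values next to the set
  {$addFields: {size: {$size: "$y"}}}
]);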

This will yield:

{ "_id" : "num_doors", "y" : [ 4 ], "size" : 1 }
{ "_id" : "color", "y" : [ "blue", "red" ], "size" : 2 }
{
    "_id" : "car_type",
    "y" : [
        "wagon",
        "hatchback",
        "suv"
    ],
    "size" : 3
}

You can $project or $addFields as you see fit to include or exclude the set of unique values or the size.
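
For example, to return only the counts and drop the arrays of unique values, the last stage can be a `$project` instead of `$addFields`. This is a sketch of one such variation; the output names `field` and `distinct_count` are illustrative choices, not anything required by MongoDB.

```javascript
// Same pipeline as above, but the final stage keeps only the field name
// and the number of unique values (output names are illustrative).
const pipeline = [
  { $project: { x: { $objectToArray: "$$CURRENT" } } },
  { $unwind: "$x" },
  { $match: { "x.k": { $ne: "_id" } } },
  { $group: { _id: "$x.k", y: { $addToSet: "$x.v" } } },
  { $project: { _id: 0, field: "$_id", distinct_count: { $size: "$y" } } }
];
// In the mongo shell this would be run as: db.foo.aggregate(pipeline)
```

Each result document should then contain just a field name and its distinct count, which keeps the output small when the sets themselves are large.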

6 Comments

Unfortunately I'm using MongoDB 3.2, and according to the manual I think $objectToArray was introduced in 3.4.
Correct. I'd recommend the upgrade to 3.4.4.
So I upgraded and tried your solution. The issue I now have is that the dynamic fields I wanted a distinct count on are actually in an array, because they came from a $lookup on another collection. The collection at the top of the question ends up as an array in my results because of that $lookup. So when I try to do an $objectToArray on the field I get "$objectToArray requires a document input, found: array". I also tried $arrayToObject first, hoping to then call $objectToArray on the result, and I get "$arrayToObject requires an object keys of 'key' and 'v'. Found incorrect number of keys:5".
At a high level I see $map in concert with $objectToArray achieving your goal. Perhaps you should repost another question with this high level change?
I posted the new question here stackoverflow.com/questions/46591045/…