
I have the following records:

{ "_id" : ObjectId("55889370ba09474fd178d8b8"), "url" : "http://stackoverflow.com/questions/ask"} 
{ "_id" : ObjectId("55889370ba09474fd178d8b4"), "url" : "http://stackoverflow.com"}
{ "_id" : ObjectId("55889370ba09474fd178d8b2"), "url" : "http://espn.com"}

I want to run an aggregation that counts the records for each site, grouped by the site's root. In other words, the first two records should fall into the same group because they share the same root.

I created a user-defined function to transform the url into its root. My idea was to use that function to first project the records (changing the url field) and then group by the url. The problem is that user-defined functions apparently can't be used in aggregations. They can be used in where clauses in a projection, but projections with where clauses can't be used in an aggregation.

Is there any way I can do the aggregation I need?

EDIT:

To make the example more concrete: if I grouped by the root website and counted the records, I would expect to get something like:

{ "_id" : "http://stackoverflow.com", "count" : 2}
{ "_id" : "http://espn.com", "count" : 1}
  • You are essentially looking for a $project filter using $regex, but the aggregation framework doesn't currently have this functionality. There is an open JIRA ticket for it: SERVER-11947. Commented Jun 29, 2015 at 15:13
  • So there is no way to do it currently? No other alternative? Commented Jun 29, 2015 at 15:15
  • Doesn't this fit stackoverflow.com/a/16252753/4573999 ? Commented Jun 29, 2015 at 15:17
  • An alternative is to use Map-Reduce Commented Jun 29, 2015 at 15:17
  • @chridam Yeah, Map-Reduce might be the only way but I was looking for something simpler. Commented Jun 29, 2015 at 15:26

2 Answers


Try using a regex when aggregating. I think you may be able to skip the user-defined function for that purpose.

This question makes use of it for example.

In your particular case a workaround is described here. Not sure if that is what you want.

Otherwise I'm afraid you'd have to map-reduce it.
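
For illustration, the map-reduce route might look roughly like the sketch below. It is only a sketch under assumptions: the collection name `records`, the names `mapFn`, `reduceFn` and `simulateMapReduce`, and the regex used to cut a URL down to its root are all hypothetical and not taken from the question. The small in-memory driver only stands in for the server so the logic can be tried outside MongoDB.

```javascript
// Map step: in the mongo shell, `this` is the current document and emit() is built in.
function mapFn() {
    // Keep scheme and host only, e.g. "http://stackoverflow.com" (assumed URL shape).
    var m = /^[a-z]+:\/\/[^\/]+/i.exec(this.url);
    emit(m ? m[0] : this.url, 1);
}

// Reduce step: add up the 1s emitted for a single root.
function reduceFn(key, values) {
    var total = 0;
    values.forEach(function (v) { total += v; });
    return total;
}

// Shell usage would be roughly (collection name "records" is an assumption):
//   db.records.mapReduce(mapFn, reduceFn, { out: { inline: 1 } })

// In-memory stand-in for the server, only so the sketch runs without MongoDB.
function simulateMapReduce(docs) {
    var buckets = {};
    globalThis.emit = function (key, value) {
        (buckets[key] = buckets[key] || []).push(value);
    };
    docs.forEach(function (doc) { mapFn.call(doc); });
    return Object.keys(buckets).map(function (key) {
        return { _id: key, value: reduceFn(key, buckets[key]) };
    });
}

// The three URLs from the question.
var docs = [
    { url: "http://stackoverflow.com/questions/ask" },
    { url: "http://stackoverflow.com" },
    { url: "http://espn.com" }
];
var counts = simulateMapReduce(docs);
```

With the question's three records, `counts` should hold one entry per root, with the two stackoverflow.com URLs collapsed into a single key.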


2 Comments

Isn't the regex used in that case in the match clause? I don't want to filter any records. I just want to transform a field during the projection so that they can fall under the same key during the grouping.
True, but I thought maybe this workaround was of interest: stackoverflow.com/a/17493547/1566187 Can you confirm? Otherwise simply use map-reduce I'd say.

Here is a simple solution. Example data is:

> db.test.find()
{ "_id" : ObjectId("559178703535798edab41c36"), "text" : "aaaasfadf" }
{ "_id" : ObjectId("559178743535798edab41c37"), "text" : "bfasdfasdf" }
{ "_id" : ObjectId("559178783535798edab41c38"), "text" : "aasdfsdf" }
{ "_id" : ObjectId("5591787b3535798edab41c39"), "text" : "asdf" }
{ "_id" : ObjectId("5591787e3535798edab41c3a"), "text" : "csfd" }

I want to group items by the first letter of the string (this is where you would place your function that extracts the base of the URL):

db.test.group({
    $keyf : function(doc){
        return {
            key : doc.text.substring(0,1) // extract URL base here
        }
    },
    $reduce : function(curr, result){
        result.count++
    },
    initial : {
        count: 0
    }
})

The result is:

[
    {
        "key" : "a",
        "count" : 3
    },
    {
        "key" : "b",
        "count" : 1
    },
    {
        "key" : "c",
        "count" : 1
    }
]
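
Applied to the URLs from the question, the key function could be sketched as follows. This is a hedged illustration, not a tested recipe: `rootKey` and `groupInMemory` are hypothetical names, the split-based root extraction assumes every url has the form scheme://host/..., and `groupInMemory` is only a small stand-in that mimics the keyf / reduce / initial contract so the logic can be checked outside the shell.

```javascript
// Hypothetical key function: keep "scheme://host" and drop the rest of the path.
// Assumes every url looks like "http://host/..." (not validated here).
function rootKey(doc) {
    return { key: doc.url.split("/").slice(0, 3).join("/") };
}

// Same reducer as in the answer: count one per document.
function countReduce(curr, result) {
    result.count++;
}

// In-memory stand-in that mimics the keyf / reduce / initial contract of group().
function groupInMemory(docs, keyf, reduce, initial) {
    var groups = {};
    docs.forEach(function (doc) {
        var key = keyf(doc);
        var id = JSON.stringify(key);
        if (!groups[id]) {
            // Start each group from the key fields plus a copy of the initial values.
            groups[id] = Object.assign({}, key, initial);
        }
        reduce(doc, groups[id]);
    });
    return Object.keys(groups).map(function (id) { return groups[id]; });
}

// The three URLs from the question.
var urls = [
    { url: "http://stackoverflow.com/questions/ask" },
    { url: "http://stackoverflow.com" },
    { url: "http://espn.com" }
];
var grouped = groupInMemory(urls, rootKey, countReduce, { count: 0 });
```

In the shell, `rootKey` would take the place of the `$keyf` function above, while the reduce function and `initial` stay as they are.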
