I'm developing a monitoring system for our company's production system, so the data I'll be storing is time-series data. I picked MongoDB for this after reviewing several other databases. Events from the production system will arrive continuously, but I intend to aggregate them into one document per 10-minute interval. Eventually, documents in the collection will look like this:

{
   _id: '04/25/2015 13:00',
   event1_count : 130,
   event2_count : 50,
   event3_count : 200
},

{
    _id: '04/25/2015 13:10',
    event1_count : 230,
    event2_count : 20,
    event3_count : 400
}

The document with _id: '04/25/2015 13:00' holds the counts of all the events that arrived between 04/25/2015 13:00 and 04/25/2015 13:10.

Ultimately, I'll want to run different reports on the data, for example a count of events within the last 20 minutes. The result I would like to get for the event counts in the last 20 minutes is:

{
event1_count : 360,
event2_count : 70,
event3_count : 600
}

My question - is there a way to aggregate multiple fields from different documents, in one query?

BTW - it's important for me to keep the data at a 10 minute interval, because other reports will need that time resolution.

  • In my question you can see 2 documents. Each document has several fields: event1_count, event2_count, etc. I'd like to sum event1_count from both documents, and the same goes for event2_count, so the result should be the sum of event1_count, the sum of event2_count, and so on. I'm looking for a way to do that in one query, assuming I'll have plenty of other events and not only 3 as in my example. Commented Apr 25, 2015 at 16:12

1 Answer

Yes, this is possible. Assuming your collection stores documents in the structure shown above, you could add another field, say date, that holds the _id timestamp as an ISODate rather than a string, so that you can aggregate using the date operators. To do the conversion, iterate over the cursor with mongo's forEach() method and update each document with the $set operator (each single-document update is atomic, the loop as a whole is not):

db.collection.find().forEach(function (doc){
    var dateObject = new Date(doc._id);    
    db.collection.update({_id: doc._id}, { $set: { date: dateObject } });               
});

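The above adds an extra date field to each document, holding an ISODate representation of the _id string. Note that new Date() typically interprets a string in this format in the shell's local time zone, so the values shown below assume the shell runs in UTC.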
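If the shell's time zone is not UTC, one option is to parse the _id string explicitly instead of relying on new Date(). A minimal sketch, assuming every _id follows the MM/DD/YYYY HH:mm layout shown in the question and is meant as UTC; parseBucketId is a hypothetical helper name, not part of MongoDB:

```javascript
// Hypothetical helper: parse an _id such as '04/25/2015 13:00' as a UTC Date.
// Assumes the 'MM/DD/YYYY HH:mm' layout from the question.
function parseBucketId(id) {
    var match = /^(\d{2})\/(\d{2})\/(\d{4}) (\d{2}):(\d{2})$/.exec(id);
    if (!match) {
        throw new Error("Unexpected _id format: " + id);
    }
    // Date.UTC takes a zero-based month, hence the "- 1".
    return new Date(Date.UTC(
        Number(match[3]), Number(match[1]) - 1, Number(match[2]),
        Number(match[4]), Number(match[5])
    ));
}
```

In the forEach() loop above, new Date(doc._id) could then be swapped for parseBucketId(doc._id).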

Suppose you now have the following sample documents in your collection after the update above:

/* 0 */
{
    "_id" : "04/25/2015 13:00",
    "event1_count" : 130,
    "event2_count" : 50,
    "event3_count" : 200,
    "date" : ISODate("2015-04-25T13:00:00.000Z")
}

/* 1 */
{
    "_id" : "04/25/2015 13:10",
    "event1_count" : 230,
    "event2_count" : 20,
    "event3_count" : 400,
    "date" : ISODate("2015-04-25T13:10:00.000Z")
}

/* 2 */
{
    "_id" : "04/25/2015 13:20",
    "event1_count" : 240,
    "event2_count" : 30,
    "event3_count" : 350,
    "date" : ISODate("2015-04-25T13:20:00.000Z")
}

/* 3 */
{
    "_id" : "04/25/2015 13:30",
    "event1_count" : 180,
    "event2_count" : 60,
    "event3_count" : 500,
    "date" : ISODate("2015-04-25T13:30:00.000Z")
}

The following aggregation pipeline groups the documents into fixed 20-minute buckets and sums each event field per bucket, which gives the desired result:

var interval = 20,
    pipeline = [
    { 
        "$group": {
            "_id": {
                "year": { "$year": "$date" },
                "dayOfYear": { "$dayOfYear": "$date" },
                "hour": { "$hour": "$date" }, // keeps buckets from different hours apart
                "interval": {
                    "$subtract": [ 
                        { "$minute": "$date" },
                        { "$mod": [{ "$minute": "$date" }, interval ] }
                    ]
                }
            },
            "event1_count": { "$sum": "$event1_count" },
            "event2_count": { "$sum": "$event2_count" },
            "event3_count": { "$sum": "$event3_count" }
        }
    },
    {
        "$project": {
            "_id": 0,
            "event1_count": 1,
            "event2_count": 1,
            "event3_count": 1
        }
    }
];

db.collection.aggregate(pipeline);
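
With many event types, listing every accumulator by hand gets tedious. A minimal sketch that builds an equivalent $group stage from a list of field names; buildGroupStage is a hypothetical helper, the field list is assumed to be known up front, and the group key here includes the hour so that buckets from different hours stay separate:

```javascript
// Hypothetical helper: build a $group stage that sums every listed field
// per fixed bucket of `interval` minutes.
function buildGroupStage(fieldNames, interval) {
    var group = {
        "_id": {
            "year": { "$year": "$date" },
            "dayOfYear": { "$dayOfYear": "$date" },
            "hour": { "$hour": "$date" },
            "interval": {
                "$subtract": [
                    { "$minute": "$date" },
                    { "$mod": [{ "$minute": "$date" }, interval] }
                ]
            }
        }
    };
    // One $sum accumulator per event field, e.g. { "$sum": "$event1_count" }.
    fieldNames.forEach(function (name) {
        group[name] = { "$sum": "$" + name };
    });
    return { "$group": group };
}

var stage = buildGroupStage(["event1_count", "event2_count", "event3_count"], 20);
```

The resulting stage can then be passed as the first element of the pipeline array given to db.collection.aggregate().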

Output:

/* 0 */
{
    "result" : [ 
        {
            "event1_count" : 420,
            "event2_count" : 90,
            "event3_count" : 850
        }, 
        {
            "event1_count" : 360,
            "event2_count" : 70,
            "event3_count" : 600
        }
    ],
    "ok" : 1
}
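
Note that the pipeline above produces fixed 20-minute buckets. For a report over the last 20 minutes relative to a given time, a sketch could filter with $match first and then sum everything into a single group. lastMinutesPipeline is a hypothetical helper, and it assumes the date field added earlier marks the start of each 10-minute document:

```javascript
// Hypothetical helper: pipeline that sums the listed fields over all
// documents whose `date` falls within the `minutes` minutes before `now`.
function lastMinutesPipeline(now, minutes, fieldNames) {
    var since = new Date(now.getTime() - minutes * 60 * 1000);
    var group = { "_id": null };   // _id: null puts all matches in one group
    var project = { "_id": 0 };    // hide the null _id in the output
    fieldNames.forEach(function (name) {
        group[name] = { "$sum": "$" + name };
        project[name] = 1;
    });
    return [
        { "$match": { "date": { "$gte": since, "$lt": now } } },
        { "$group": group },
        { "$project": project }
    ];
}

var lastTwenty = lastMinutesPipeline(
    new Date("2015-04-25T13:20:00Z"), 20,
    ["event1_count", "event2_count", "event3_count"]
);
```

With the sample documents, this window matches the 13:00 and 13:10 documents, whose sums are the 360, 70 and 600 asked for in the question. In a live report, new Date() would take the place of the fixed timestamp.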

1 Comment

@assafm No worries :-)
