0

I am wondering if i can use sort before grouping in this type of query because i want the sort stage to use the index on samples.timestamp1

mydb1.mongodbbucketnocpu.aggregate(
  [
      {
          "$match": {
              "samples.timestamp1": {"$gte": datetime.strptime("2010-01-01 00:00:00", "%Y-%m-%d %H:%M:%S"),
                        "$lte": datetime.strptime("2015-01-01 01:55:00", "%Y-%m-%d %H:%M:%S")},

          }
      },
{ "$unwind": "$samples" },

      {
          "$group": {

              "_id": {"$dateToString": {"format": "%Y-%m-%d %H", "date": "$samples.timestamp1"}},
              "max_id13": {
                  "$max": "$samples.id13"
              }
          }
      },
      {
          "$project": {
              "_id": 0,
              "day": "$_id",
              "max_id13": 1
          }
      },
      {"$sort": {"hour": -1}},
     { "$limit": 5}


    ]
)

My data contains about 96k documents that contain 12 subdocuments each.

{
    "_id" : ObjectId("607f185f2a477a621641cded"),
    "nsamples" : 12,
    "samples" : [
        {
            "id1" : 3758,
            "id6" : 2,
            "id7" : -79.09,
            "id8" : 35.97,
            "id9" : 5.5,
            "id10" : 0,
            "id11" : -99999,
            "id12" : 0,
            "id13" : -9999,
            "c14" : "U",
            "id15" : 0,
            "id16" : 99,
            "id17" : 0,
            "id18" : -99,
            "id19" : -9999,
            "id20" : 33,
            "id21" : 0,
            "id22" : -99,
            "id23" : 0,
            "timestamp1" : ISODate("2010-01-01T00:00:00Z"),
            "timestamp2" : ISODate("2009-12-31T19:05:00Z")
        },
        {
            "id1" : 3758,
            "id6" : 2,
        .
        .
        .

When i tried using {"$sort": {"samples.timestamp1": -1}}, before group stage my output was thisSort exceeded memory limit of 104857600 bytes Is ti possible to use sort before group?What do you think i should do to optimize my query?

8
  • Can you provide some sample data? Commented May 13, 2021 at 20:01
  • If you want the items sorted within the documents generated by $group, sort before. If you want the generated documents in a particular order, sort after. Commented May 13, 2021 at 20:10
  • @WernfriedDomscheit Sure! Commented May 13, 2021 at 20:33
  • @joe Yes i want the generated documents in particular order but i also want to optimize the query so the sort can use the index Commented May 13, 2021 at 20:37
  • 1
    There is no way for an index to help with the documents that are dynamically generated by a group stage. Commented May 13, 2021 at 20:48

1 Answer 1

1

Looks like you are only interested in id13 and timestamp1, so as first stage I would suggest this:

db.mongodbbucketnocpu.aggregate([
  {$match: ... },
  {
    $set: {
      samples: {
        $map: {
          input: "$samples",
          in: { id13: "$$this.id13", timestamp1: "$$this.timestamp1" }
        }
      }
    }
  },
  { $unwind: "$samples" },
  ...
])
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.