To be honest, I know SQL well, but I'm fairly new to MongoDB and NoSQL, so I'm a bit lost. I have built a pipeline that works fine. The goal is to group by day and mindmapId, count the number of unique users who viewed each mind map, sum the watch time, and save the result into a collection so that I can query it later.

Here's a sample of the data.

MindMap

{
  "_id": "Yg5uGI3Iy0",
  "data": {
    "id": "root",
    "topic": "Main topic",
    "expanded": true
  },
  "theme": "orange",
  "_p_author": "_User$zqPzSKD7EM",
  "_created_at": {
    "$date": {
      "$numberLong": "1658497264836"
    }
  },
  "_updated_at": {
    "$date": {
      "$numberLong": "1661334292749"
    }
  }
}

MindmapView

{
  "_id": "qWR6HVIcvT",
  "startViewDate": {
    "$date": {
      "$numberLong": "1658669095261"
    }
  },
  "_p_user": "_User$VnrxG9gABO",
  "_p_mindmap": "MindMap$Yg5uGI3Iy0",
  "_created_at": {
    "$date": {
      "$numberLong": "1658669095274"
    }
  },
  "_updated_at": {
    "$date": {
      "$numberLong": "1658669095274"
    }
  }
}

Pipeline

[{
 $group: {
  _id: {
   day: {
    $dateToString: {
     format: '%Y-%m-%d',
     date: '$startViewDate'
    }
   },
   mindmapId: {
    $substr: [
     '$_p_mindmap',
     8,
     -1
    ]
   }
  },
  watchTime: {
   $sum: {
    $dateDiff: {
     startDate: '$_created_at',
     endDate: '$_updated_at',
     unit: 'second'
    }
   }
  },
  uniqueCount: {
   $addToSet: '$_p_user'
  }
 }
}, {
 $project: {
  _id: 1,
  total: {
   $size: '$uniqueCount'
  },
  watchTime: {
   $sum: '$watchTime'
  }
 }
}]

Pipeline results

[{
  "_id": {
    "day": "2022-08-01",
    "mindmapId": "oGCQDQmaNK"
  },
  "total": 1,
  "watchTime": 7
},{
  "_id": {
    "day": "2022-08-11",
    "mindmapId": "7YlZ6FPwiD"
  },
  "total": 1,
  "watchTime": 21
},{
  "_id": {
    "day": "2022-08-15",
    "mindmapId": "7YlZ6FPwiD"
  },
  "total": 1,
  "watchTime": 13
},{
  "_id": {
    "day": "2022-07-25",
    "mindmapId": "7YlZ6FPwiD"
  },
  "total": 1,
  "watchTime": 3
},{
  "_id": {
    "day": "2022-08-01",
    "mindmapId": "YXa8omyChc"
  },
  "total": 2,
  "watchTime": 1306837
},{
  "_id": {
    "day": "2022-07-25",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 7
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60
},{
  "_id": {
    "day": "2022-08-06",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 0
},{
  "_id": {
    "day": "2022-08-11",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 69
},{
  "_id": {
    "day": "2022-08-10",
    "mindmapId": "oGCQDQmaNK"
  },
  "total": 1,
  "watchTime": 4
},{
  "_id": {
    "day": "2022-08-15",
    "mindmapId": "Yg5uGI3Iy0"
  },
  "total": 1,
  "watchTime": 9
},
...
]

However, to query this data faster I need to include the mind map author in the result collection. So the goal becomes: group by day and mindmapId, count the unique users who viewed it, sum the watch time, fetch the mind map author, and save everything into a collection.

To do that I need to use $lookup, but the result is messy and the lookup acts like a full join in SQL. I tried many combinations before posting this.

Here's the main thing I have tried:

[{
 $group: {
  _id: {
   day: {
    $dateToString: {
     format: '%Y-%m-%d',
     date: '$startViewDate'
    }
   },
   mindmapId: {
    $substr: [
     '$_p_mindmap',
     8,
     -1
    ]
   }
  },
  watchTime: {
   $sum: {
    $dateDiff: {
     startDate: '$_created_at',
     endDate: '$_updated_at',
     unit: 'second'
    }
   }
  },
  uniqueCount: {
   $addToSet: '$_p_user'
  }
 }
}, {
 $lookup: {
  from: 'MindMap',
  localField: '_objectId',
  foreignField: '_id.mindmapId',
  as: 'tempMindmapPointer'
 }
}, {
 $unwind: '$tempMindmapPointer'
}, {
 $match: {
  'tempMindmapPointer._id': '_id.mindmapId'
 }
}, {
 $project: {
  _id: 1,
  total: {
   $size: '$uniqueCount'
  },
  watchTime: {
   $sum: '$watchTime'
  },
  author: {
   $substr: [
    '$tempMindmapPointer._p_author',
    6,
    -1
   ]
  }
 }
}]

The $match doesn't work here: with it I get no results. If I remove the $match, the lookup acts like a full join of the user list with the mindmap id list, which I don't want:

[{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "zqPzSKD7EM"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "zqPzSKD7EM"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "zqPzSKD7EM"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "VnrxG9gABO"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "zqPzSKD7EM"
},{
  "_id": {
    "day": "2022-08-17",
    "mindmapId": "YXa8omyChc"
  },
  "total": 1,
  "watchTime": 60,
  "author": "x6kNvG2O0X"
},...
]

I have tried swapping the localField: '_objectId' and foreignField: '_id.mindmapId' values. I have also tried doing the lookup first and grouping by _id {day, mindmapId, authorId}, but I have never been able to get that to work.

What can I do to make this query work? I'm sure it has something to do with $match and $lookup.

1 Answer

If I understand you correctly (you didn't include the expected result), the simple option is:

db.MindmapView.aggregate([
  {$group: {
      _id: {
        day: {$dateToString: {format: "%Y-%m-%d", date: "$startViewDate"}},
        mindmapId: {$substr: ["$_p_mindmap", 8, -1]}
      },
      watchTime: {
        $sum: {
          $dateDiff: {startDate: "$_created_at", endDate: "$_updated_at", unit: "second"}
        }
      },
      uniqueCount: {$addToSet: "$_p_user"}
    }
  },
  {$project: {_id: 1, total: {$size: "$uniqueCount"}, watchTime: 1}},
  {$lookup: {
      from: "MindMap",
      localField: "_id.mindmapId",
      foreignField: "_id",
      as: "author"
    }
  },
  {$set: {author: {$first: "$author._p_author"}}}
])

See how it works on the playground example.
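
As for why the attempt in the question behaved like a full join: as far as I can tell, it comes down to how an equality $lookup treats missing fields. The grouped documents have no _objectId field, and MindMap documents have a string _id, so the path _id.mindmapId does not exist there either. $lookup treats a missing field as null on both sides, so null matches null and every MindMap document gets attached to every group. The $match then compared tempMindmapPointer._id with the literal string '_id.mindmapId' (a field-to-field comparison would need $expr), so nothing passed. Below is a rough plain-JavaScript sketch of that matching rule. getPath and lookupEq are names I made up for illustration, the second MindMap document is invented, and the model ignores arrays and other details of the real stage:

```javascript
// Resolve a dotted path such as "_id.mindmapId"; a missing path yields null,
// mirroring how an equality $lookup treats missing fields.
function getPath(doc, path) {
  let value = doc;
  for (const key of path.split(".")) {
    if (value === null || typeof value !== "object" || !(key in value)) {
      return null;
    }
    value = value[key];
  }
  return value === undefined ? null : value;
}

// Hypothetical helper: simplified model of
// {$lookup: {from, localField, foreignField, as}}.
function lookupEq(docs, from, localField, foreignField, as) {
  return docs.map((doc) => ({
    ...doc,
    [as]: from.filter(
      (other) => getPath(other, foreignField) === getPath(doc, localField)
    ),
  }));
}

// Sample data shaped like the documents in the question
// (the second MindMap document is invented for the example).
const mindMaps = [
  { _id: "Yg5uGI3Iy0", _p_author: "_User$zqPzSKD7EM" },
  { _id: "YXa8omyChc", _p_author: "_User$VnrxG9gABO" },
];
const grouped = [
  { _id: { day: "2022-08-15", mindmapId: "Yg5uGI3Iy0" }, total: 1, watchTime: 9 },
];

// Fields from the question: both paths are missing, so null === null for every pair.
const crossJoin = lookupEq(grouped, mindMaps, "_objectId", "_id.mindmapId", "tmp");
console.log(crossJoin[0].tmp.length); // 2: every MindMap document is attached

// Fields from this answer: only the MindMap with the same id matches.
const joined = lookupEq(grouped, mindMaps, "_id.mindmapId", "_id", "author");
console.log(joined[0].author.map((m) => m._p_author)); // [ '_User$zqPzSKD7EM' ]
```

With the localField and foreignField pointing at fields that really exist on each side, the same rule gives exactly one match per group.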

There is another option that may be a little more efficient: use $lookup with a pipeline to bring back only the author from the MindMap collection, instead of fetching the entire document and filtering it afterwards. In this case the $lookup stage will be:

  {
    $lookup: {
      from: "MindMap",
      let: {id: "$_id.mindmapId"},
      pipeline: [
        {$match: {$expr: {$eq: ["$$id", "$_id"]}}},
        {$project: {_p_author: 1, _id: 0}}
      ],
      as: "author"
    }
  }
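
Since the question also mentions saving the result into a collection, here is a sketch of how the whole thing might look with the pipeline-style $lookup and a final $merge stage. The target collection name MindmapDailyStats is made up, the prefix stripping on the author copies the $substr from the question, and I have not run this against your data, so treat it as a starting point rather than a finished solution:

```javascript
// Aggregation pipeline meant to run on the MindmapView collection.
const pipeline = [
  // 1. One document per (day, mindmapId), as in the question.
  {$group: {
      _id: {
        day: {$dateToString: {format: "%Y-%m-%d", date: "$startViewDate"}},
        mindmapId: {$substr: ["$_p_mindmap", 8, -1]}
      },
      watchTime: {
        $sum: {
          $dateDiff: {startDate: "$_created_at", endDate: "$_updated_at", unit: "second"}
        }
      },
      uniqueCount: {$addToSet: "$_p_user"}
    }
  },
  // 2. Fetch only _p_author from the matching MindMap document.
  {$lookup: {
      from: "MindMap",
      let: {id: "$_id.mindmapId"},
      pipeline: [
        {$match: {$expr: {$eq: ["$$id", "$_id"]}}},
        {$project: {_p_author: 1, _id: 0}}
      ],
      as: "author"
    }
  },
  // 3. Shape the output; drop the "_User$" prefix (6 characters) from the pointer.
  {$project: {
      _id: 1,
      total: {$size: "$uniqueCount"},
      watchTime: 1,
      author: {$substr: [{$first: "$author._p_author"}, 6, -1]}
    }
  },
  // 4. Upsert into the target collection (the name is an assumption).
  {$merge: {
      into: "MindmapDailyStats",
      on: "_id",
      whenMatched: "replace",
      whenNotMatched: "insert"
    }
  }
];

// In mongosh this would be run as: db.MindmapView.aggregate(pipeline)
```

Because the _id of each result is the {day, mindmapId} pair, running the pipeline again should replace the existing rows for those days instead of duplicating them.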

2 Comments

Thanks for your help. I don't really understand how it works, but it works well. You saved my day :) What I don't understand is how this {$lookup: {from: "MindMap", localField: "_id.mindmapId", foreignField: "_id", as: "author"}} can find the correct author. When I saw it, I was convinced it would take the same author for every mindmap. When I tried the exact same lookup, I got about 10 authors per mindmap. Is that because the $project was after the $lookup?
The playground example can help you understand, since you can see the results of each step. Before the $lookup, each document in the pipeline has an _id.mindmapId that was created during the $group step. Each document in MindMap has one _id. The $lookup matches them and brings the matching document from MindMap into the relevant document in the pipeline. Since each document in MindMap has only one _p_author, we get only one author on each document of the pipeline.
