I have a large timeseries collection filled with millions of documents called "Events" that have the following structure
{
"_id": {"$oid": "123456"},
"class": "...",
"event_data": {...},
"ts": {"$date": "2025-01-01T01:00:00.000Z"}
}
With indexes on the ts field both ascending and descending. I am trying to run a query to pull the documents between two timestamps using Java. Like so:
ZonedDateTime lowerBound = ...;
ZonedDateTime upperBound = ...;
var query = Query.query(new Criteria().andOperator(
Criteria.where("ts").gte(lowerbound.toInstant().toEpochMilli()),
Criteria.where("ts").lt(upperbound.toInstant().toEpochMilli()),
)
var result = mongoTemplate.find(query, Events.class)
This query works... eventually after like 20 minutes since it scans the entire collections not using the indexes. When I use IntelliJ's debugger I can see the query getting formatted as follows:
Query: { "$and" : [{ "ts": { "$gte" : 1733852133000}}. { "ts" : { "$lt": 1733853933000}} ] }
Which I translate into the MongoDB console's code as such:
db.events.find({
"$and": [
{ "ts": { "$gte": 1733852133000}},
{ "ts": { "$lt": 1733853933000}},
]
})
Running this mimics what is happening in the Java code exactly, it technically works but is scanning the entire collection, which I can further see if a do a .explain() and only see the COLLSCAN stage.
However, if I write the following snippet, the code executes in under a couple seconds:
db.events.find({
"$and": [
{ "ts": { "$gte": new Date("2025-01-01T01:00:00Z)}},
{ "ts": { "$lt": new Date("2025-01-02T01:00:00Z)}},
]
})
Which also matches up when I run .explain() since I can see the FETCH and IXSCAN stages.
What is the underlying difference between these two queries? How can I get my Java code to translate into the second version and actually make use of the indexing we have in place?
Some things I have tried:
- Using
InstantorDateinstead ofLongas the values, these just have the same issue of doing aCOLSCAN - Adding a
hint("ts_1")to force the usage of an index. When running this in the MongoDB console it errors out withhint provided does not correspond to an existing indexwhich is completely false since I can see the index when running.getIndexes(). Running it in Java code seems to have the same issue as before where it does aCOLLSCAN
Edit 3/21/25
Here are some more details of what is going on. First just to show the indexes do exists:
db.events.getIndexes();
// returns
{
"key": {
"ts": 1
},
"name": "ts_1",
"v": 2
},
{
"key": {
"ts": -1
},
"name": "ts_-1",
"v": 2
}
Then the stages from .explain() on the working query (I have stripped out some details):
db.events.find({
"$and": [
{ "ts": { "$gte": new Date("2024-12-10T17:35:33Z")} },
{ "ts": { "$lt": new Date("2024-12-10T18:05:33Z")} }
],
}, {}, {}).explain();
// returns
{
...
"stages": [
{
"$cursor": {
"queryPlanner": {
"namespace": "database.system.buckets.events",
"indexFilterSet": false,
"parsedQuery": {
"$and": [
{
"_id": {
"$lt": {"$oid": "675882ed0000000000000000"}
}
},
{
"_id": {
"$gte": {"$oid": "67572a650000000000000000"}
}
},
{
"control.max.ts": {
"$_internalExprGte": {"$date": "2024-12-10T17:35:33.000Z"}
}
},
{
"control.min.ts": {
"$_internalExprGte": {"$date": "2024-12-09T17:35:33.000Z"}
}
},
{
"control.max.ts": {
"$_internalExprLt": {"$date": "2024-12-11T18:05:33.000Z"}
}
},
{
"control.min.ts": {
"$_internalExprLt": {"$date": "2024-12-10T18:05:33.000Z"}
}
}
]
},
...
"winningPlan": {
"stage": "FETCH",
"filter": {
"$and": [
{
"_id": {
"$lt": {"$oid": "675882ed0000000000000000"}
}
},
{
"_id": {
"$gte": {"$oid": "67572a650000000000000000"}
}
}
]
},
"inputStage": {
"stage": "IXSCAN",
"keyPattern": {
"control.min.ts": 1,
"control.max.ts": 1
},
"indexName": "ts_1",
"isMultiKey": false,
"multiKeyPaths": {
"control.min.ts": [],
"control.max.ts": []
},
"isUnique": false,
"isSparse": false,
"isPartial": false,
"indexVersion": 2,
"direction": "forward",
"indexBounds": {
"control.min.ts": ["[new Date(1733765733000), new Date(1733853933000))"],
"control.max.ts": ["[new Date(1733852133000), new Date(1733940333000))"]
}
}
}
}
And the stages from .explain() on the non-working query (also stripped out some details):
db.events.find({
$and: [
{ ts: { $gte: 1733852133000 } },
{ ts: { $lt: 1733853933000 } }
],
}, {}, {})
// returns
{
...
"stages": [
{
"$cursor": {
"queryPlanner": {
"namespace": "db.system.buckets.events",
"indexFilterSet": false,
"parsedQuery": {
},
"queryHash": "5F5FC979",
"planCacheKey": "5F5FC979",
"maxIndexedOrSolutionsReached": false,
"maxIndexedAndSolutionsReached": false,
"maxScansToExplodeReached": false,
"winningPlan": {
"stage": "COLLSCAN",
"direction": "forward"
},
"rejectedPlans": []
}
}
},
{
"$_internalUnpackBucket": {
"exclude": [],
"timeField": "ts",
"metaField": "data",
"bucketMaxSpanSeconds": 86400,
"assumeNoMixedSchemaData": true,
"eventFilter": {
"$and": [
{
"ts": {
"$gte": 1733852133000
}
},
{
"ts": {
"$lt": 1733853933000
}
}
]
}
}
}
]
}
Finally here is the Java Query object's value which is how I am deriving the above MongoDB query:
Query: { "$and" : [{ "ts" : { "$gte" : 1733852133000}}, { "ts" : { "$lt" : 1733853933000}}]}, Fields: {}, Sort: {}
Even if I used $date instead of the epoch milli for the filters the results are the same.
From these explain comments we can see the large difference between the two where the working one with new Date() is adding additional filtering on the _id column I am assuming this has something to do with how time series collections work but am not sure. In addition to that the filters get broken down and there is the IXSCAN input stage before the FETCH.
I have found another question on the MongoDB Forum that is extremely similar but with no real answer on to what is going on.. link
explain()outputs (perhaps liking to a gist or pastebin if needed)?hint()to the suggested query which did at least include one index but there's a ton more going on that I can show in theexplain()output if I can get to it.{ "$date": "2025-01-01T01:00:00Z" }is the Extended-JSON notation for Dates, just like in your example doc structure above. So it should have used the index - but as per your discussion comment, it doesn't. :-/.explain()output