I have a very simple collection with no indexes except _id. Each document contains a single array field, mies. The collection currently holds around 100 million documents, and I can see the following anomaly in the profiler:

{
  "op": "update",
  "ns": "DB.links",
  "command": {
    "q": {
      "_id": "f1b54f37-7f92-4e75-9ec6-5329349ce792_eb370c8a-6f33-4989-aa59-a26e1c9df46c"
    },
    "u": {
      "$push": {
        "mies": {
          "$each": [
            {
              "$oid": "5e39d07bec34b8000e7f86b7"
            }
          ]
        }
      }
    },
    "multi": true,
    "upsert": true
  },
  "keysExamined": 0,
  "docsExamined": 0,
  "nMatched": 0,
  "nModified": 0,
  "upsert": true,
  "keysInserted": 1,
  "numYield": 0,
  "locks": {
    "Global": {
      "acquireCount": {
        "r": 2,
        "w": 2
      }
    },
    "Database": {
      "acquireCount": {
        "w": 2
      },
      "acquireWaitCount": {
        "w": 1
      },
      "timeAcquiringMicros": {
        "w": 19486143
      }
    },
    "Collection": {
      "acquireCount": {
        "w": 1
      }
    },
    "oplog": {
      "acquireCount": {
        "w": 1
      }
    }
  },
  "millis": 19490,
  "planSummary": "IDHACK",
  "execStats": {
    "stage": "UPDATE",
    "nReturned": 0,
    "executionTimeMillisEstimate": 0,
    "works": 2,
    "advanced": 0,
    "needTime": 1,
    "needYield": 0,
    "saveState": 0,
    "restoreState": 0,
    "isEOF": 1,
    "invalidates": 0,
    "nMatched": 0,
    "nWouldModify": 0,
    "nInvalidateSkips": 0,
    "wouldInsert": true,
    "fastmodinsert": false,
    "inputStage": {
      "stage": "IDHACK",
      "nReturned": 0,
      "executionTimeMillisEstimate": 0,
      "works": 1,
      "advanced": 0,
      "needTime": 0,
      "needYield": 0,
      "saveState": 0,
      "restoreState": 0,
      "isEOF": 1,
      "invalidates": 0,
      "keysExamined": 0,
      "docsExamined": 0
    }
  }
}

As you can see, a simple upsert with a single $push to an array took 19 seconds. I believe most of the time was spent here:

      "timeAcquiringMicros": {
        "w": 19486143
      }

What should I check? How can I improve the performance?
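For reference, this is roughly what the profiled command looks like when issued from the mongo shell (the _id and ObjectId values are the ones from the profile entry above; the exact driver call used by the application may differ):

// multi: true + upsert: true in the profile corresponds to updateMany(..., { upsert: true })
db.links.updateMany(
  { _id: "f1b54f37-7f92-4e75-9ec6-5329349ce792_eb370c8a-6f33-4989-aa59-a26e1c9df46c" },
  { $push: { mies: { $each: [ ObjectId("5e39d07bec34b8000e7f86b7") ] } } },
  { upsert: true }
)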

3 Comments
  • Since you're only querying on _id, which is indexed by default, the query itself is fine; what's concerning is the 100 million docs. Why not consider sharding, or archiving old records if they aren't needed for most reads? Commented Feb 4, 2020 at 20:39
  • Yes, but due to business logic the data has to be accumulated for a period of time. I can see that the index size for hot collections exceeds MongoDB's cache size. Is that the reason for the slow lock acquisition? (A quick way to check this is sketched after these comments.) Commented Feb 4, 2020 at 20:42
  • I would say at least separate that collection from the others and maintain it alone in a different cluster, if possible (if you don't want to shard and want all docs on the same cluster). That way you can size the cluster (RAM + disk) appropriately for that collection and also get high I/O. Check this documentation :: docs.mongodb.com/manual/tutorial/ensure-indexes-fit-ram Commented Feb 4, 2020 at 20:53
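To check the index-size-versus-cache concern from the comments above, a rough sketch in the mongo shell (this assumes the WiredTiger storage engine; the field names are the standard collection stats and serverStatus output):

// Total size of all indexes on the collection, in bytes
db.links.stats().totalIndexSize

// Configured WiredTiger cache size and current usage, in bytes
var cache = db.serverStatus().wiredTiger.cache
print(cache["maximum bytes configured"])
print(cache["bytes currently in the cache"])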

2 Answers

2

While this doesn't directly provide a solution to your problem, I didn't see anyone else mention this as a potential cause. One of the reasons your operation is slow is how MongoDB handles this operation, specifically with regard to MongoDB's replication design. If you don't care for the explanation, just skip to the bottom.

Quoting the replication documentation:

Every operation that modifies the data on the primary is stored in a special capped collection called oplog (short for operations log). The oplog is asynchronously synchronized to all the secondaries which replay the operations to eventually be consistent with the primary.

For the oplog to work, its operations need to be idempotent. Since $push is not, MongoDB converts the operation into a $set, replacing the entire array.

The bottom line is that whenever you push something to an array, the entire array is replaced as far as replication is concerned.
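If you want to verify this on your own deployment, you can inspect the most recent oplog entry for the collection. A sketch (this assumes a replica set; the exact shape of the recorded update varies by MongoDB version):

// The oplog is a capped collection in the "local" database on each replica set member
var oplog = db.getSiblingDB("local").oplog.rs
oplog.find({ ns: "DB.links", op: "u" }).sort({ $natural: -1 }).limit(1).pretty()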


0

You cannot

MongoDB uses a B-tree algorithm for index operations such as insert, search, and delete.

Algorithm   Average     Worst case
Space       O(n)        O(n)
Search      O(log n)    O(log n)
Insert      O(log n)    O(log n)
Delete      O(log n)    O(log n)

MongoDB needs to walk down the ordered index to find a value. Since _id is the unique default index, MongoDB needs 27 iterations to find a document in the worst case (2^27 for 134,217,728 documents [~19 sec], 2^28 for 268,435,456 [~21 sec], etc.).
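For the arithmetic behind the 27-iteration figure, a quick check in the shell (the point is only that lookup depth grows logarithmically with the document count):

// ceil(log2(n)) comparisons in the worst case for an ordered index lookup
Math.ceil(Math.log2(100 * 1000 * 1000))  // 27, since 2^26 = 67,108,864 < 1e8 <= 2^27 = 134,217,728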

You can improve the search operation by creating a compound index, but that will penalize inserts, as MongoDB then needs to update both the _id index and the compound index.

4 Comments

Which compound index are you talking about? And how is the search related to the 19-second lock operation?
Imagine you have only 100 documents and MongoDB needs 1 sec for each iteration. If you don't have an index, MongoDB scans 1, 2, 3, ... 100 (100 secs), also known as O(n). But with a B-tree you need only about 7 tries (7 secs) (discussed here). Compound indexes are discussed here.
I know what a B-tree is, and what a compound index is. Unfortunately, that has nothing to do with the problem.
@silent-box I've answered your main question, "How can I improve the performance?", and explained why it's not possible.
