UPDATE: the real goal of this question is to find a FAST solution. I'm new to MongoDB, so I assumed a single query would be the fastest option; however, any fast solution is okay.

Basically, I'm looking for a solution to the following problem. Since there are hundreds of such operations at any given moment, the operation needs to be FAST.

So I have a curves collection with documents like this:

{
    curveId: 12,
    values: [1, 2, 12, 7, ...]
}

Suppose I'd like to set the curve value at index 3, resulting in:

{
    curveId: 12,
    values: [1, 2, 12, new_value, ...]
}

If there's no document with a matching curveId, a new curve should be created:

{
    curveId: 12,
    values: [null, null, null, new_value]
}

So I wrote this upsert query:

db.curves.update({
    curveId: 12
},{
    $set: { "values.3": new_value }
},{
    upsert: true
})

This works when there's a matched document. However, if there's no match, it will create a new document like this:

{
    curveId: 12,
    values: { '3': new_value }
}

Here values is an object rather than an array, which is not what I was expecting.

I've googled for quite some time but haven't found a solution yet. Is it even possible to solve the problem with one query?

Thank you.

  • Does this answer your question? MongoDB: How do I update a single subelement in an array, referenced by the index within the array? Commented May 25, 2024 at 10:42
  • @Eric No, that question is to 'upsert an array element of an existing document', not the same scenario. Commented May 25, 2024 at 10:48
  • @cmgchess In this case the result will be fine: MongoDB automatically inserts null for the other elements before the specified index. So the point is that the values array has to exist before $set for it to work properly. Commented May 25, 2024 at 10:58
  • mongoplayground.net/p/W5uwDXo3_3M This doesn't solve that case, unfortunately. Might have to improve the logic with more conditions. Commented May 25, 2024 at 11:05
  • mongoplayground.net/p/aargpYQ5xWU maybe. Notice the places where I have used n and n+1; n is the index. Commented May 25, 2024 at 11:21

2 Answers

This method uses $concatArrays and $slice when the values array exists and is at least the minimum size needed. When either of those conditions is not met, it uses $zip & $map to fill in the missing values. So it has the benefit of not using $map when the requirements are met.

  • Btw, $map isn't a "slow" operation, at least not in this case or the one discussed in the comments.
  • Document retrieval for the update is much, much slower in comparison.
  • I have posted a separate answer which uses $map every time, like in cmgchess's comment, but with $concatArrays & $slice.
  • If the nano-performance gains of not using $map are necessary, then you should use multiple conditions with $switch, each branch individually optimising one of the cases mentioned below (exists, size ok, size not ok, null, does not exist, etc.)

This Update Pipeline handles these cases:

[
  { "_id": "curveId exists and is larger", "curveId": 12, "values": [1, 2, 3, 7, 8, 9] },
  { "_id": "curveId exists and values is exact", "curveId": 12, "values": [1, 2, 3, 4] },
  { "_id": "curveId exists but values is short by 1", "curveId": 12, "values": [1, 2, 3] },
  { "_id": "curveId exists but values is short by 2", "curveId": 12, "values": [1, 2] },
  { "_id": "curveId exists but values is empty", "curveId": 12, "values": [] },
  { "_id": "curveId exists but values is null", "curveId": 12, "values": null },
  { "_id": "curveId exists but values does not exist", "curveId": 12 },
  { "_id": "curveId 99 doesn't exist" }
]
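
As a rough mental model of what the update should produce for the sample documents above, here is a plain-JavaScript sketch. It is not MongoDB code, and setCurveValue is a hypothetical helper name used only for illustration; it assumes values is an array, null, or missing.

```javascript
// Plain-JavaScript model of the intended result; this is not MongoDB code.
// setCurveValue is a hypothetical name used only for illustration.
function setCurveValue(values, idx, newVal) {
  const longEnough = Array.isArray(values) && idx <= values.length;
  // Fast path: the array already reaches idx, so it is used as-is.
  // Otherwise copy the existing elements (or start empty) and pad with null.
  const base = longEnough ? values : [...(values ?? [])];
  while (base.length < idx) base.push(null);
  // Build a new array with newVal at position idx; the input is not mutated.
  return [...base.slice(0, idx), newVal, ...base.slice(idx + 1)];
}

console.log(setCurveValue([1, 2, 3, 7, 8, 9], 3, 1000)); // values is larger
console.log(setCurveValue([1, 2], 3, 1000));             // values is short by 2
console.log(setCurveValue(undefined, 3, 1000));          // values does not exist
```

With idx 3 and a new value of 1000, these three calls give [1, 2, 3, 1000, 8, 9], [1, 2, null, 1000], and [null, null, null, 1000] respectively.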

Update pipeline with aggregation expressions:

db.collection.update({ curveId: 12 },
[
  {
    $set: {
      values: {
        $let: {
          vars: {
            // index & new value to set
            idx: 3,
            new_val: 1000
          },
          in: {
            $cond: {
              if: {
                $and: [
                  { $isArray: "$values" },
                  { $lte: [ "$$idx", { $size: "$values" }] }  // "lte" is correct here
                ]
              },
              then: {
                $concatArrays: [
                  { $slice: ["$values", "$$idx"] },
                  ["$$new_val"],
                  {
                    $slice: [
                      "$values",
                      { $add: ["$$idx", 1] },
                      { $add: [{ $size: "$values" }, 1] }
                    ]
                  }
                ]
              },
              else: {
                $let: {
                  vars: {
                    vals_nulls: {
                      $map: {
                        input: {
                          $zip: {
                            inputs: [
                              { $ifNull: ["$values", []] },
                              { $range: [0, "$$idx"] }
                            ],
                            useLongestLength: true
                          }
                        },
                        in: { $first: "$$this" }
                      }
                    }
                  },
                  in: {
                    $concatArrays: [
                      { $slice: ["$$vals_nulls", "$$idx"] },
                      ["$$new_val"],
                      {
                        $slice: [
                          "$$vals_nulls",
                          { $add: ["$$idx", 1 ] },
                          { $add: [{ $size: "$$vals_nulls" }, 1] }
                        ]
                      }
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
  }
],
{ upsert: true, multi: true }
)

Note:

  • The new value and the index to set only need to be put in the first $set -> values -> $let part
    • All other references to them are made via variables.
    • Use your language's let parameter if available instead of this; mongoose & playground don't have it
  • multi: true is only needed to demo all the update scenarios in one execution
  • Use curveId: 99 (anything not 12) in the query to see the upsert behaviour when the document does not exist.
  • Notice the repetition which occurs in the else part with $values & $$vals_nulls.
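
If your shell or driver does pass a let option through to the update command (my understanding is that this needs MongoDB 5.0 or later, so treat that as an assumption), the two notes above could be combined roughly like this. The names buildSetAtIndexUpdate, replaceAt and paddedValues are hypothetical helpers, not part of any API, and the sketch inlines the padded array instead of storing it in a $let variable, so it is evaluated more than once.

```javascript
// Sketch only: builds the arguments for an update call, it does not run one.
// Assumption: the update command accepts `let`, exposing $$idx and $$new_val.

// "arrayExpr with the element at $$idx replaced by $$new_val".
function replaceAt(arrayExpr) {
  return {
    $concatArrays: [
      { $slice: [arrayExpr, "$$idx"] },
      ["$$new_val"],
      {
        $slice: [
          arrayExpr,
          { $add: ["$$idx", 1] },
          { $add: [{ $size: arrayExpr }, 1] }, // always positive, so idx=0 is fine
        ],
      },
    ],
  };
}

// "$values" (or [] when null/missing) padded with nulls up to length $$idx.
const paddedValues = {
  $map: {
    input: {
      $zip: {
        inputs: [{ $ifNull: ["$values", []] }, { $range: [0, "$$idx"] }],
        useLongestLength: true,
      },
    },
    in: { $first: "$$this" },
  },
};

function buildSetAtIndexUpdate(curveId, idx, newVal) {
  const pipeline = [
    {
      $set: {
        values: {
          $cond: {
            if: {
              $and: [
                { $isArray: "$values" },
                { $lte: ["$$idx", { $size: "$values" }] },
              ],
            },
            then: replaceAt("$values"),
            else: replaceAt(paddedValues),
          },
        },
      },
    },
  ];
  // idx and newVal appear only here, as command-level variables.
  const options = { upsert: true, let: { idx, new_val: newVal } };
  return { filter: { curveId }, pipeline, options };
}

// Possible usage in a shell, assuming a `curves` collection:
// const { filter, pipeline, options } = buildSetAtIndexUpdate(12, 3, 1000);
// db.curves.updateOne(filter, pipeline, options);
```

The helper function removes the textual repetition between the then and else branches, and the index and value live in one place; whether let is available in your environment needs checking.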

Mongo Playground

Mongo Playground with index=0

3 Comments

The script works except when idx is 0, in which case it raises the exception: [Third argument to $slice must be positive: 0]. Regardless of that issue, I wrote a script to test the performance of this aggregation against the 'find and upsert' method, and the result seems very promising: with curveId indexed, the aggregation is much faster than find and upsert (roughly double the speed), and it's even much faster than a single upsert query. Does this suggest that, in general, aggregation is better than a simple update/upsert query?
I've also updated this one with the fix for index=0.
I have to correct my earlier statement that aggregation is faster than a single update, which was wrong. The two tests ran a different number of times, hence the wrong results.

This alternative uses $map every time, with $concatArrays and $slice. It has the advantage of being concise and much cleaner than my previous solution. And there's no real performance impact from using $map when document retrieval has already occurred.

As before, an array of "values or nulls" is built first: it keeps the existing elements of values and pads with null up to the length required by the given index, or consists only of nulls when values is missing. This is done with $zip and $first: "$$this", so each element is either the original value or null where it was missing.

db.collection.update({ curveId: 12 },
[
  {
    $set: {
      values: {
        $let: {
          vars: {
            // index & new value to set
            idx: 3,
            new_val: 1000,
            vals_nulls: {
              $map: {
                input: {
                  $zip: {
                    inputs: [
                      { $ifNull: ["$values", []] },
                      { $range: [0, 3] }  // repeat the index here :-( vars in the same $let can't reference each other
                    ],
                    useLongestLength: true
                  }
                },
                in: { $first: "$$this" }
              }
            }
          },
          in: {
            $concatArrays: [
              { $slice: ["$$vals_nulls", "$$idx"] },
              ["$$new_val"],
              {
                $slice: [
                  "$$vals_nulls",
                  { $add: ["$$idx", 1] },
                  { $add: [{ $size: "$$vals_nulls" }, 1] }
                ]
              }
            ]
          }
        }
      }
    }
  }
],
{ upsert: true, multi: true }
)
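
To make the padding step easier to follow, here is a plain-JavaScript model of it (not MongoDB code). zipLongest, valsOrNulls and setAt are hypothetical names, and rangeEnd stands in for the literal that has to be repeated inside $range. The second call shows what happens when that literal is not kept in sync with idx.

```javascript
// Plain-JavaScript model of the pipeline's logic; this is not MongoDB code.

// Models $zip with useLongestLength: true and no defaults:
// slots missing from the shorter array become null.
function zipLongest(a, b) {
  const len = Math.max(a.length, b.length);
  return Array.from({ length: len }, (_, i) => [
    i < a.length ? a[i] : null,
    i < b.length ? b[i] : null,
  ]);
}

// Models the vals_nulls variable.
function valsOrNulls(values, rangeEnd) {
  const range = Array.from({ length: rangeEnd }, (_, i) => i); // $range: [0, rangeEnd]
  return zipLongest(values ?? [], range).map((pair) => pair[0]); // $first: "$$this"
}

// Models the $concatArrays / $slice part of the `in` expression.
function setAt(values, idx, newVal, rangeEnd = idx) {
  const padded = valsOrNulls(values, rangeEnd);
  return [...padded.slice(0, idx), newVal, ...padded.slice(idx + 1)];
}

console.log(setAt([1, 2], 3, 1000));     // idx and $range end both 3
console.log(setAt([1, 2], 50, 1000, 3)); // idx changed to 50, $range end left at 3
```

Both calls print [1, 2, null, 1000]: in the second one the new value lands at position 3 instead of 50 because the array was only padded to length 3, so both occurrences of the index have to be changed together.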

Mongo Playground

Mongo Playground with index=0

7 Comments

This script works only when idx falls within the current array range. For an index beyond it, say 50, it just appends the new value at the end of the array, without inserting null for the missing elements.
It does work correctly. See the result for "curveId exists but values is short by 2" - with index 3 and input "values": [1, 2]; result is "values": [1, 2, null, 1000]. Here's an example with index 50: mongoplayground.net/p/gNJk0LV-25z Perhaps you forgot to update the index in the second usage where it says // repeat index here
I'm sorry, I missed the 2nd idx, it must be set in 2 places. Yes you're right, it's correct when the 2 idx are set properly.
Actually, both aggregations fail with index=0. I've updated this one to work correctly with idx=0. The only change is the last line { $add: [{ $size: "$$vals_nulls" }, 1] }. Will update the other. Which do you mean by "aggregation 1" and "aggregation 2" ? The other answer has a $cond separating the parts, this answer doesn't use $cond. So let's call it "aggregation without $cond" & "aggregation with $cond" :-)
Oh, I made a terrible mistake in the tests: Aggregate-without-cond ran only 10k times while the other 3 tests ran 20k times, so no wonder it was much faster. The fixed test results: Find+Update: 22.3s, Aggregate-with-cond: 13.2s, Aggregate-without-cond: 12.6s, Update-only: 10.6s. The results seem reasonable. The test script: pastebin.com/720w9hNL