Use MongoDB field value to fill in separate blank field

Question

Following the guidance here, I am using Mongo 4.4 to try and find all documents where field_A is null and fill field_A with the value of field_B.

I tried:

db.collection.updateMany({field_A:{$nin:[null,""]}},[{"$set": {field_A : "$field_B" }}])

This returns:

{ "acknowledged" : true, "matchedCount" : 176394, "modifiedCount" : 9952 }

I don't understand why most of them aren't modifying . . .?

Despite the warnings against it, I also tried running this without the square brackets in the second half of the query:

db.collection.updateMany({field_A:{$nin:[null,""]}},{"$set": {field_A : "$field_B" }})

But that just replaces field_A with the literal phrase "$field_B".

Can anyone explain what I'm doing wrong? Thanks!

UPDATE: Per @prasad_'s suggestion, I tried:

db.collection.updateMany({field_A:null},[{"$set": {field_A : "$field_B" }}])

This worked fine. All matched documents updated.

However, I still have documents where field_A is not truly null but rather "". I tried:

db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])

And I got:

{ "acknowledged" : true, "matchedCount" : 2815, "modifiedCount" : 0 }

I just can't figure out why the "" won't update to the value of field_B.

See stackoverflow.com/questions/68255363/or-with-if-and-in-mongodb/… — Wernfried Domscheit
– Wernfried Domscheit, Commented Dec 6, 2022 at 20:40
Two questions. 1) If you're trying to find all documents where field_A is null, then why are you filtering using $nin? 2) When a document matches but is not modified, that suggests that the update would not result in any changes to the document. Do most of the documents that match the query predicate already have the same values for field_A and field_B? One additional thing - the meaning of square brackets is something I outline here — user20042973
– user20042973, Commented Dec 6, 2022 at 21:08
@user20042973 1) I found that just searching only for null in field_A somehow didn't get all of the documents where field_A was actually null. So maybe they have blank spaces in them or something rather than being truly nullI(?). 2) No, by definition I'm just trying to find the documents where field_A is empty and fill it with field_B. The matched count of 176394 is just a small subset of the total documents, the rest of which have content in field_A. — jbat
– jbat, Commented Dec 6, 2022 at 22:20
@WernfriedDomscheit I saw this answer, but as far as I can tell it just fills in field_A with a static phrase, "missing," when I need it to fill in from field_B in the relevant document. Please let me know if I misunderstood what it says! — jbat
– jbat, Commented Dec 6, 2022 at 22:24
{field_A:{$nin:\[null,""\]}},\[{"$set": {field_A : "$field_B" }}\] - The update query requires a condition and an update. The condition would be { filedA: null }. Further, you can use the Updates With Aggregation Pipeline feature available with MongoDB version 4.2 or higher. — prasad_
– prasad_, Commented Dec 7, 2022 at 2:02

user20042973 · Accepted Answer · 2022-12-07 15:15:48Z

There was a lot of discussion in the comments that helped clear up the confusion between the $nin and $in operators, the latter seemingly being what is desired in this situation. There's a separate piece of confusion about matching versus modifying that I mentioned here and will expand on in this answer.

I just can't figure out why the "" won't update to the value of field_B.

In short, the value of field_B appears to be "" for those 2815 matching documents. Since that is also already the value of field_A there is nothing for the database to change.

The output you got from the database is:

{ "acknowledged" : true, "matchedCount" : 2815, "modifiedCount" : 0 }

Referring to the documentation, the matchedCount field represents the following:

The number of documents selected for update. If the update operation results in no change to the document, e.g. $set expression updates the value to the current value, nMatched can be greater than nModified.

And the modifiedCount field is:

The number of existing documents updated. If the update/replacement operation results in no change to the document, such as setting the value of the field to its current value, nModified can be less than nMatched.

Both of those definitions already state that the database will avoid modifying the document (and incrementing the counter) if the update would not change the document from its current state.

Consider a collection with the following three documents:

> db.collection.find()
[
  { _id: 1, field_A: 123, field_B: 456 },
  { _id: 2, field_A: '', field_B: 'ABC' },
  { _id: 3, field_A: '', field_B: '' }
]

When we execute your update (in the mongosh shell), we receive the following response:

> db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 2,
  modifiedCount: 1,
  upsertedCount: 0
}

Here we have matchedCount: 2. There were 2 documents that matched the query predicates of {field_A: ""}. That would be the documents with _id:2 and _id:3.

But we only have modifiedCount: 1. Only one of the documents (id:2) needed to change. The other document (id:3) already had the same value for field_A and field_B (an empty string) so applying the update wouldn't change the document. After the operation completes, the documents now look as follows with the second one having been modified:

> db.collection.find()
[
  { _id: 1, field_A: 123, field_B: 456 },
  { _id: 2, field_A: 'ABC', field_B: 'ABC' },
  { _id: 3, field_A: '', field_B: '' }
]

If we run the update again, we get:

> db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 0,
  upsertedCount: 0
}

This time the second document no longer matched (since we updated field_A to no longer be an empty string). However, the third document still matches but still does not get modified because field_A and field_B are both empty strings. Here are some related commands that help confirm that:

> //Count the number of documents where both `field_A` and `field_B` are empty strings:
> db.collection.countDocuments({field_A:'', field_B:''})
1
> //Change the filter of the update to exclude documents where `field_B` is an empty string:
> db.collection.updateMany({field_A:"", field_B:{$ne:""}},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 0,
  modifiedCount: 0,
  upsertedCount: 0
}

I would expect the count above to return 2815 in your environment. If you do not expect field_B to contain an empty string for those documents then you may need to take additional actions to correct that.

this was EXACTLY my problem. My data is extremely dense and hard to quickly look at and see all the fields, so I wasn't catching this. Given that I completely messed up $nin and $in, I just assumed I was continuing to have problems understanding MongoDB syntax, rather than problems with the quality of the underlying data. Many thanks -- and thanks to everyone who jumped in to help previously!

Collectives™ on Stack Overflow

Use MongoDB field value to fill in separate blank field

1 Answer 1

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related