1

Is there a way to remove all null or empty-string values from an object? We have an aggregation that creates an object with empty fields and empty objects wherever a value is null. We want to remove all null properties and empty objects and rebuild the object, in order to keep the data as small as possible.

For example, in the following object only 'test' and 'more-nested-data' should be kept; the rest can be removed:


{
    "test": "some",
    "test2": {
    },
    "test3": {
        "some-key": {
        },
        "some-other-key": {
            "more-nested-data": true,
            "more-nested-emtpy": null
        }
    }
}

which should become:

{
    "test": "some",
    "test3": {
        "some-other-key": {
            "more-nested-data": true
        }
    }
}

I have tried a lot. I suspect something could be done with $objectToArray, but I have not found a solution yet. The aggregation needs to remove null properties and empty objects recursively (or down to a defined number of levels).

3
  • Some answers await. Commented Feb 7, 2022 at 16:38
  • Solved it by doing: mongodb.com/community/forums/t/… Commented May 17, 2022 at 10:55
  • ? The solution there is for top-level fields only. Your request was to trim out any null field at any level of depth. You need recursion for that; see answer below. Commented May 17, 2022 at 11:06

2 Answers

1

Use the $function operator, available since MongoDB 4.4 (released in 2020), to do this recursively, as you note. Given this input, which is a slightly expanded version of the one supplied in the question:

var dd = {
    "test": "some",
    "test2": { },
    "test3": {
        "some-key": { },
        "some-other-key": {
            "more-nested-data": true,
            "more-nested-emtpy": null,
            "emptyArr": [],
            "notEmptyArr": [
                "XXX",
                null,
                {"corn":"dog"},
                {"bad":null},
                {"other": {zip:null, empty:[], zap:"notNull"}}
            ]
        }
    }
};
db.foo.insertOne(dd);

then this pipeline:

db.foo.aggregate([
    {$replaceRoot: {newRoot: {$function: {
        body: function(obj) {
            var process = function(holder, spot, value) {
                var remove_it = false;
                // test for arrays FIRST since [] instanceof Object is true!
                if(Array.isArray(value)) {
                    // walk BACKWARDS due to potential splice() later
                    // that will change the length...
                    for(var jj = value.length - 1; jj >= 0; jj--) {
                        process(value, jj, value[jj]);
                    }
                    if(0 == value.length) {
                        remove_it = true;
                    }

                } else if(value instanceof Object) {
                    walkObj(value);
                    if(0 == Object.keys(value).length) {
                        remove_it = true;
                    }

                } else {
                    if(null == value) {
                        remove_it = true;
                    }
                }

                if(remove_it) {
                    if(Array.isArray(holder)) {
                        holder.splice(spot,1); // snip out the val
                    } else if(holder instanceof Object) {
                        delete holder[spot];
                    }
                }
            };

            var walkObj = function(obj) {
                Object.keys(obj).forEach(function(k) {
                    process(obj, k, obj[k]);
                });
            };

            walkObj(obj); // entry point!
            return obj;
        },
        args: [ "$$CURRENT" ],
        lang: "js"
    }}
    }}
]);

produces this result:

{
    "_id" : 0,
    "test" : "some",
    "test3" : {
        "some-other-key" : {
            "more-nested-data" : true,
            "notEmptyArr" : [
                "XXX",
                {
                    "corn" : "dog"
                },
                {
                    "other" : {
                        "zap" : "notNull"
                    }
                }
            ]
        }
    }
}
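The question also mentions empty strings, which the function above leaves in place. In addition, in plain JavaScript a Date is `instanceof Object` and has no enumerable own keys, so an emptiness test based on Object.keys() would classify it as an empty object. The following is a minimal sketch of one possible variation, not part of the pipeline above: it returns a pruned copy instead of editing its input, treats "" like null, and passes Date values through untouched. The names prune and isEmptyValue are hypothetical, and how other BSON types (ObjectId, Decimal128 and so on) appear inside $function is an open assumption that should be checked before use.

```javascript
// Hypothetical helper: true for values that should be dropped.
function isEmptyValue(v) {
    return v === null || v === undefined || v === "";
}

// Returns a pruned copy of `value`, or undefined if nothing is left of it.
function prune(value) {
    if (isEmptyValue(value)) {
        return undefined;
    }
    // Test for arrays before the generic object check.
    if (Array.isArray(value)) {
        var outArr = [];
        value.forEach(function(item) {
            var kept = prune(item);
            if (kept !== undefined) {
                outArr.push(kept);
            }
        });
        return outArr.length > 0 ? outArr : undefined;
    }
    // A Date has no enumerable own keys, so keep it as-is
    // rather than letting it look like an empty object.
    if (value instanceof Date) {
        return value;
    }
    if (value instanceof Object) {
        var outObj = {};
        Object.keys(value).forEach(function(k) {
            var kept = prune(value[k]);
            if (kept !== undefined) {
                outObj[k] = kept;
            }
        });
        return Object.keys(outObj).length > 0 ? outObj : undefined;
    }
    return value; // numbers, booleans, non-empty strings
}

// Illustrative sample, not the document from the question.
var sample = {
    "test": "some",
    "blank": "",
    "when": new Date(0),
    "test3": { "some-key": {}, "list": [null, {"bad": null}, "XXX"] }
};
var pruned = prune(sample);
console.log(JSON.stringify(pruned));
// prints {"test":"some","when":"1970-01-01T00:00:00.000Z","test3":{"list":["XXX"]}}
```

To use something like this in $function, both functions would have to be defined inside the body function, and the top-level call would presumably need a fallback such as `prune(obj) || {}`, since $replaceRoot expects a document even when every field has been removed.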

A convenient way to debug such complex functions is to declare them as variables outside of the pipeline and run data through them, simulating the documents (objects) coming out of the database, e.g.:

var ff = function(obj) {
    var process = function(holder, spot, value) {
        var remove_it = false;
        // test for arrays FIRST since [] instanceof Object is true!
        if(Array.isArray(value)) {
...
printjson(ff(dd));  // use the same doc as above

You can put print statements and other debugging aids into the code; when you are done, remove them and call the pipeline to process the real data as follows:

db.foo.aggregate([
    {$replaceRoot: {newRoot: {$function: {
        body: ff,  // substitute here!
        args: [ "$$CURRENT" ],
        lang: "js"
      }}
    }}
]);
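
One thing to watch when testing this way: the function edits its argument in place, so calling ff(dd) in the shell also changes dd, which can make repeated runs confusing. A possible workaround, sketched below, is to pass in a deep copy and compare the outcome with an expected document. The helper names deepCopy and checkCase are hypothetical, the JSON round trip only suits JSON-like test data (it would turn Date values into strings), and stripNulls is a deliberately tiny stand-in used for illustration, not the function from this answer.

```javascript
// Deep copy via a JSON round trip; adequate for JSON-like test documents only.
function deepCopy(doc) {
    return JSON.parse(JSON.stringify(doc));
}

// Runs fn on a copy of input and reports whether the result matches expected.
// The comparison is textual, so key order in `expected` matters.
function checkCase(fn, input, expected) {
    var actual = fn(deepCopy(input));
    return JSON.stringify(actual) === JSON.stringify(expected);
}

// Stand-in for the real function: removes null top-level fields in place.
var stripNulls = function(obj) {
    Object.keys(obj).forEach(function(k) {
        if (obj[k] === null) {
            delete obj[k];
        }
    });
    return obj;
};

var input = { a: 1, b: null };
var ok = checkCase(stripNulls, input, { a: 1 });
console.log(ok, JSON.stringify(input));
// prints true {"a":1,"b":null}
```

Because the copy absorbs the changes, the original test document stays intact and can be reused for the next run.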

0

Sounds like the $unwind operator would help. Check out its documentation at https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/
