3

note: I'm using Mongodb 4 and I must use aggregation, because this is a step of a bigger aggregation

Problem

How to find in a collection documents that contains fields that starts with value from another field in same document ?

Let's start with this collection:

db.regextest.insert([
{"first":"Pizza", "second" : "Pizza"},
{"first":"Pizza", "second" : "not pizza"},
{"first":"Pizza", "second" : "not pizza"}
])

and an example query for exact match:

db.regextest.aggregate([
{
    $match :  { $expr: { $eq: [ "$first" ,"$second" ] }   }    }
])

I will get a single document

{
    "_id" : ObjectId("5c49d44329ea754dc48b5ace"),
    "first" : "Pizza",    "second" : "Pizza"
}

And this is good.

But how to do the same, but with startsWith ? My plan was to use regex but seems that is not supported in aggregation so far.

With a find and a custom javascript function works fine:

db.regextest.find().forEach(
    function(obj){
        if (obj.first.startsWith(obj.second)){
            print(obj);
        }
    }
)

And returns correctly:

{
    "_id" : ObjectId("5c49d44329ea754dc48b5ace"),
    "first" : "Pizza",
    "second" : "Pizza"
}

How it's possible to get same result with aggregation framework ?

One idea is to use existing aggregation framework pipeline, out to a temp colletion and then run the find above, to get match I'm looking for. This seems to be a workaround, I hope someone have a better idea.

Edit: here the solution

db.regextest.aggregate([{
        $project : {
            "first" : 1,
            "second" : 1,
            fieldExists : {
                $indexOfBytes : ['$first', '$second' , 0]
            }
        }
    }, {
        $match : {
            fieldExists : {
                $gt : -1
            }
        }
    }
]);
5
  • 2
    You can use $indexOfCP Commented Jan 25, 2019 at 12:32
  • thanks for suggestions! I've found my answer here: stackoverflow.com/questions/3305561/… Commented Jan 28, 2019 at 12:47
  • 1
    @AsyaKamsky: Could you be so kind and explain your solution a bit, please? Why is it better? Commented Jul 24, 2019 at 8:56
  • 2
    @B--rian sure - it uses aggregation expression (via $expr) referring to as "f1" and "f2" field names (called in the question "first" and "second") to check for equality between two strings: "$f2" and substring of f1 from position 0 which is size of f2 string). So if I have f1: "pizza is good" and f2: "pizza" then substring of of f1 from position 0 that's 5 characters is 'pizza' and it's a match. Come to think of it, I don't think this is a duplicate of marked question, I'll see about getting it unduped and adding the answer. Commented Jul 28, 2019 at 15:51
  • 2
    as far as why it's better - the request was for a $match and it does a $match in a single stage without adding any superfluous fields to the document. Not to mention the fact that $project loses the rest of the fields except _id. Commented Jul 28, 2019 at 15:55

1 Answer 1

1

The simplest way is to use $expr, first available in 3.6 like this:

{
  $match: {
    $expr: {
      $eq: [
        '$second',
        {
          $substr: ['$first', 0, { $strLenCP: '$second' }]
        }
      ]
    }
  }
}

This compares the string in field second with the first N characters of first where N is the length of second string. If they are equal, then first starts with second.

4.2 adds support for $regex in aggregation expressions, but starts with is much simpler and doesn't need regular expressions.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.