
I want to code a little app where I can store the incoming request url, named reqUrl below, and check if it already exists by using the compareUrls function.

It returns true if both URLs are in the same domain, for example compareUrls("stackoverflow.com", "http://www.stackoverflow.com"), and false otherwise. This is used to avoid adding duplicate URLs.

I am trying to use that function inside a MongoDB query like this:

app.get("/:reqUrl", function(req, res)
{
    var reqUrl = req.params.reqUrl;

    MongoClient.connect(Url, function(err, db)
     { 
       if (err) throw err;  
       db.collection("mydb").find({$where: function() {

         if (compareUrls(reqUrl, this.url)) //if true, simply return the url
         {
            return this.url;
         } else { //if not existing insert it into the database
            db.collection("mydb").insert({"url":reqUrl});
         };          

     }}).toArray();

//Code continues below

Now the problem is that, because of scoping, the reqUrl variable is not recognized, and I don't know of any workaround. Even when I use local variables with compareUrls, I get back the whole collection. I thought about retrieving all results into an array by simply calling .find and checking reqUrl against each item, but that would be far from efficient.

Please note that I am very new to MongoDB.

Any feedback would be appreciated, thanks.

  • Where is reqUrl initialised? Commented Jun 5, 2017 at 23:57
  • Outside of MongoClient.connect() stuff, I'll edit my question Commented Jun 6, 2017 at 0:02
  • You can't do database operations like an insert inside of a $where function. Commented Jun 6, 2017 at 0:15
  • I am out of luck then :/ I'll try something else and see if I can cook up an answer Commented Jun 6, 2017 at 0:23

1 Answer

The bottom line here is that you cannot perform other database operations inside the logic of a $where clause. Nor should you, since it is completely unnecessary: what you are doing is already supported by the standard operators and methods.

What you really want here is .findOneAndUpdate(). You do not need $where for the sort of match condition you are doing, which is simply to check a value. The "query" portion that selects the document is really just a $regex search condition.

As for the "insert" part, that is what "upserts" are for. When the data is not "found", the "upsert" creates/inserts the new document in the collection; when it is found, it "updates". You can tune that in this case with the $setOnInsert modifier so that a "found" document is not actually modified, and the data is only touched on "insertion":

db.collection("mydb").findOneAndUpdate(
  { "url": new RegExp(reqUrl) },
  { "$setOnInsert": { "url": reqUrl } },
  { "upsert": true, "returnOriginal": false },
  function(err, doc) {
    // deal with result here
  }
)

Of course, the $regex usage here is just a basic "is this string present in the property's string" condition. There are more advanced regular expressions specific to "domain matching", such as those in the existing answers here: Regex to match simple domain

But the basic logic remains the same: a "regular expression" does the match condition and then you simply "upsert".
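One thing to watch with the basic pattern is that a raw URL passed to new RegExp() treats "." as "any character". A minimal sketch of a stricter domain pattern follows; the helper names escapeRegExp, bareDomain and domainPattern are hypothetical, and the normalisation rules (drop the scheme, a leading "www.", any port and path) are an assumption about what "same domain" means here.

```javascript
// Hypothetical helper: escape regex metacharacters so that "." in a
// domain is matched literally rather than as "any character".
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: reduce a URL-like string to its bare host by
// dropping the scheme, path/query/fragment, port and a leading "www.".
function bareDomain(input) {
  return String(input)
    .trim()
    .toLowerCase()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//, "") // scheme such as "http://"
    .replace(/[\/?#].*$/, "")                // path, query, fragment
    .replace(/:\d+$/, "")                    // port
    .replace(/^www\./, "");
}

// Build a pattern matching stored URLs on the same domain, with or
// without a scheme and "www." prefix.
function domainPattern(reqUrl) {
  var domain = escapeRegExp(bareDomain(reqUrl));
  return new RegExp(
    "^(?:[a-z][a-z0-9+.-]*://)?(?:www\\.)?" + domain + "(?:[/?#:]|$)",
    "i"
  );
}

// The result could be used as the filter, e.g. { "url": domainPattern(reqUrl) }
```

Subdomains other than "www." are deliberately not treated as the same domain in this sketch; adjust the pattern if they should be.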

That said, there is nothing actually stopping you from using a $where clause for the match condition; the supplied function can either call a server function or be included inline. It's just that the actual operation remains an "upsert" instead of trying to call a database method "within" that function:

db.collection("mydb").findOneAndUpdate(
  { "$where": function() { return compareUrls(reqUrl, this.url); }  },
  { "$setOnInsert": { "url": reqUrl } },
  { "upsert": true, "returnOriginal": false },
  function(err, doc) {
    // deal with result here
  }
)

Just make sure that under $where the server function or inline expression actually returns a boolean true/false, since that is how $where operates.
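To illustrate the boolean requirement, here is a hypothetical stand-in for compareUrls (the asker's real implementation is not shown, so the normalisation rules are an assumption). As far as I understand, a $where function is evaluated on the MongoDB server, so this sketch keeps everything inline and uses only plain string operations.

```javascript
// Hypothetical stand-in for the asker's compareUrls; the original
// implementation is not shown in the question.
function compareUrls(a, b) {
  // Normalise inline so the function has no outside dependencies.
  var strip = function (u) {
    return String(u)
      .toLowerCase()
      .replace(/^[a-z][a-z0-9+.-]*:\/\//, "") // scheme such as "http://"
      .replace(/^www\./, "")                   // leading "www."
      .replace(/[\/?#:].*$/, "");              // port, path, query, fragment
  };
  // The comparison yields a strict boolean, never a document field
  // or undefined.
  return strip(a) === strip(b);
}
```

Contrast this with the function in the question, which returns this.url on a match and nothing otherwise.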

Also note the usage of "returnOriginal": false here, as the default behavior of the .findOneAndUpdate() method is to return the "original" document before modification. In some cases this would be desired, but the most common usage is to return the document in its modified state.

Of course if you do not need the document in response at all, then .updateOne() will suffice as a method, and reduces the overhead of returning the document content "over the wire".
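As a rough sketch of that .updateOne() variant, the arguments could be built as below. The helper names buildUpsertArgs and wasInserted are hypothetical; upsertedCount is, to my knowledge, the field the Node.js driver reports on the update result, so check it against your driver version.

```javascript
// Hypothetical helper: build the three arguments for an upsert that only
// writes when no matching document exists.
function buildUpsertArgs(pattern, reqUrl) {
  return [
    { url: pattern },                  // filter: RegExp or exact string
    { $setOnInsert: { url: reqUrl } }, // touched only on insertion
    { upsert: true }                   // insert when nothing matches
  ];
}

// Hypothetical helper: true when the upsert created a new document.
function wasInserted(result) {
  return result.upsertedCount === 1;
}

// Usage sketch (assumes an open `db` handle as in the question):
// var args = buildUpsertArgs(new RegExp("stackoverflow\\.com"), reqUrl);
// db.collection("mydb").updateOne(args[0], args[1], args[2], function (err, result) {
//   if (err) throw err;
//   console.log(wasInserted(result) ? "inserted" : "already present");
// });
```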


5 Comments

Thanks a lot for your detailed answer. I thought before about going the regex route, but I absolutely wanted to use the compareUrls function. Too bad that it is difficult to use custom functions inside MongoDB queries.
@Valilutzik It's very much by design. And not "too bad" at all. I cannot imagine there is anything happening in your function that cannot be done in a regular expression. In fact, to be "really optimal" you probably should be storing the "domain" only as a property if that is your test of uniqueness; then this becomes a simple "equality" match and the most performant option. But what you apparently really need to wrap your head around is understanding that JavaScript Evaluation === BAD in terms of performance and for a whole host of other reasons. Use the native operators.
Okay, got it :)
@Valilutzik Ran off to lunch in the middle of that, but what I also meant to say is that there is nothing actually stopping you using a $where clause for the "selection" logic, it's just that you probably should not if you can avoid it. The real issue is that your presumption on using .insert() as a database method here is incorrect. Example added in the answer with explanation. Which should be useful information you do not seem to know.
I actually kept thinking about your previous comment about design and performance and things are starting to make sense now. I'll reflect on your updated answer. Thanks again a ton for your help
