10

After I found that the common/latest Javascript implementations are using String Interning for perfomance boost (Do common JavaScript implementations use string interning?), I thought === for strings would get the constant O(1) time. So I gave a wrong answer to this question:

JavaScript string equality performance comparison

Since according to the OP of that question it is O(N), doubling the string input doubles the time the equality needs. He didn't provide any jsPerf so more investigation is needed,

So my scenario using string interning would be:

var str1 = "stringwithmillionchars"; //stored in address 51242

var str2 = "stringwithmillionchars"; //stored in address 12313

The "stringwithmillionchars" would be stored once let's say in address 201012 of memory and both str1 and str2 would be "pointing" to this address 201012. This address could then be determined with some kind of hashing to map to specific locations in memory.

So when doing

"stringwithmillionchars" === "stringwithmillionchars"

would look like

getContentOfAddress(51242)===getContentOfAddress(12313)

or 201012 === 201012

which would take O(1)/constant time

JSPerfs/Performance updates:

JSPerf seems to show constant time even if the string is 16 times longer?? Please have a look:

http://jsperf.com/eqaulity-is-constant-time

Probably the strings are too small on the above: This probably show linear time (thanks to sergioFC) the strings are built with a loop. I tried without functions - still linear time / I changed it a bit http://jsfiddle.net/f8yf3c7d/3/ .

According to https://www.dropbox.com/s/8ty3hev1b109qjj/compare.html?dl=0 (12MB file that sergioFC made) when you have a string and you already have assigned the value in quotes no matter how big the t1 and t2 are (e.g 5930496 chars), it is taking it 0-1ms/instant time.

It seems that when you build a string using a for loop or a function then the string is not interned. So interning happens only when you directly assign a string with quotes like var str = "test";

15
  • 1
    I think its because === operator compares memory addresses only when comparing objects (similar to Java). But "something" its not an object, its type is the builtin string. The same as comparing numbers, var a=2; var b=2;, if you do a===b you are not comparing objects nor memory adresses. Commented Oct 24, 2014 at 14:26
  • 1
    I know you can do var str = new String("test"); but I don't know the implications there either.. Commented Oct 24, 2014 at 14:57
  • 1
    Even doing so typeof str would be 'string', not object. Commented Oct 24, 2014 at 15:09
  • 2
    I have removed the fiddle in order to not crush more browsers. I think they are too small. Important: i have just testd that a comparaison of two equals strings with 5930496chars constructed using var s1='...';var s2='...'; tooks 0ms, while the comparaison of the same string constructed char by char tooks 20ms. Commented Oct 24, 2014 at 15:40
  • 1
    Don't know, I'm not familiar with interning. The strings are so long that the file is 12MB. I'm gonna upload it to Dropbox and update this comment with the link. Commented Oct 24, 2014 at 15:47

3 Answers 3

13

Based on all the Performance Tests (see original post) for strings a and b the operation a === b takes:

  • constant time O(1) if the strings are interned. From the examples it seems that interning only happens with directly assigned strings like var str = "test"; and not if you build it with concatenation using for-loops or functions.

  • linear time O(N) since in all the other cases the length of the two strings is compared first. If it is equal then we have character by character comparison. Else of course they are not equal. N is the length of the string.

Sign up to request clarification or add additional context in comments.

Comments

3

According to the ECMAScript 5.1 Specification's Strict Equal Comparison algorithm, even if the type of Objects being compared is String, all the characters are checked to see if they are equal.

  1. If Type(x) is String, then return true if x and y are exactly the same sequence of characters (same length and same characters in corresponding positions); otherwise, return false.

Interning is strictly an implementation thingy, to boost performance. The language standard doesn't impose any rules in that regard. So, its up to the implementers of the specification to intern strings or not.

13 Comments

Thanks. Could you think of a reason why this is a better choice than handling strings as objects two? Is there such an overhead?
I fail to see how this forces the engine to actually make the full string comparison. If two variables point to the same string kept somewhere in memory, then they will satisfy this requirement.
@MichailMichailidis Interning is strictly an implementation thingy, to boost performance. The language standard doesn't impose any rules in that regard.
Well we are talking about the latest/common implementations- Is it actually being used? Can someone more familiar with jsperf make one please? Case1: "test"==="test", Case2: "testtest"==="testtest" (double the length) Case3: strings are different but same length and Case 4: different and same length.
@thefourtheye No, sorry. I did not refer to interning at all. Your point is that "the implementations still have to check all the characters" and I don't agree that it follows from the section you quoted. You can achieve this "same length/same chars" without full comparison.
|
1

First of all, it would be nice to see a JSPerf test which demonstrates the claim that doubling the string size doubles the execution time.

Next, let's take that as granted. Here's my (unproven, unchecked, and probably unrelated to reality) theory.

Compairing two memory addresses is fast, no matter how much data is references. But you have to INTERN this strings first. If you have in your code

var a = "1234";
var b = "1234";

Then the engine first has to understand that these two strings are the same and can point to the same address. So at least once these strings has to be compared fully. So basically here are the following options:

  • The engine compares and interns strings directly when parsing the code. In this case equals strings should get the same address.
  • The engine may say "these strings are two big, I don't want to intern them" and has two copies.
  • The engine may intern these strings later.

In the two latter cases string comparison will influence the test results. In the last case - even if the strings are finally interned.

But as I wrote, a wild theory, for theory's sage. I'd first like to see some JSPerf.

1 Comment

I agree with you - The original poster of the other question said it and I took it for granted - I will soon prepare a JSPerf to elaborate more on the numbers if he doesn't.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.