7

I am looking for the PHP equivalent of Java's

 "SomeString".hashCode();

function. The hash code I am looking for should be the same one that is used for indexing hash maps in PHP. I hope you can help me :)

EDIT:

Okay, I found the function I was searching for. It is written in C and is not available in PHP itself, but thanks for your help!

ulong zend_inline_hash_func(char *arKey, uint nKeyLength)
{
        ulong $h = 5381;
        char *arEnd = arKey + nKeyLength;

        while (arKey < arEnd) {
                $h += ($h << 5);
                $h += (ulong) *arKey++;
        }
        return $h;
}
1
  • Why are all the answers below using 31? Isn't $h += ($h << 5) equivalent to $h *= 33? Commented Dec 13, 2021 at 12:17
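On the question in this comment: yes, `$h += ($h << 5)` multiplies by 33, which is the constant in the Zend function quoted above. Java's String.hashCode multiplies by 31, and that is why the answers use 31. A small check of both shift forms, with `step33` and `step31` as made-up helper names and inputs small enough that no overflow occurs:

```php
<?php
// One accumulation step of each hash, written with a shift.
// Hypothetical helpers for illustration; small inputs only, so no overflow.
function step33(int $h): int { return $h + ($h << 5); }  // as in zend_inline_hash_func
function step31(int $h): int { return ($h << 5) - $h; }  // shift form of Java's multiplier

var_dump(step33(5381) === 33 * 5381); // bool(true)
var_dump(step31(5381) === 31 * 5381); // bool(true)
```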

6 Answers

9

Arkh and the GitHub solution referenced by guiguoz are heading in the right direction, but both fail to take into account that PHP converts an integer to a float as soon as it exceeds PHP_INT_MAX (2^63 - 1 on 64-bit builds, 2^31 - 1 on 32-bit builds). The Java function is calculated with fixed-width 32-bit signed integers and relies on 32-bit arithmetic overflow (wrap-around) to keep the value a 32-bit signed integer.
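The promotion to float can be observed directly. A minimal sketch that uses only built-in constants and type checks:

```php
<?php
// Integer overflow in PHP does not wrap around; the result becomes a float.
$max = PHP_INT_MAX;            // largest integer on this build
var_dump(is_int($max));        // bool(true)
var_dump(is_float($max + 1));  // bool(true): no wrap-around as in Java
var_dump(is_float($max * 31)); // bool(true)
```

Once the running hash is a float, its low-order bits are no longer exact, so it cannot match Java's result.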

In PHP, you will need to manually perform that arithmetic overflow each time the $hash is updated:

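// Wrap $v into the signed 32-bit range, mimicking Java's int overflow.
// Relies on 4294967296 (2^32) being an integer, i.e. on a 64-bit PHP build.
function overflow32($v)
{
    $v = $v % 4294967296;
    if ($v > 2147483647) return $v - 4294967296;
    elseif ($v < -2147483648) return $v + 4294967296;
    else return $v;
}

function hashCode( $s )
{
    $h = 0;
    $len = strlen($s);
    for($i = 0; $i < $len; $i++)
    {
        // ord() reads one byte at a time, so this matches Java for ASCII input.
        $h = overflow32(31 * $h + ord($s[$i]));
    }

    return $h;
}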

(edit: corrected %v typo)


4 Comments

The overflow32 method is wrong (%v instead of $v, and it divides by 0 on a 32-bit machine). Instead, the $h = line should read: $h = (int)(31 * $h + ord($s[$i])) & 0xffffffff;
@xryl669, your line returns the wrong value for hashCode("153193cc3139f12e"): it returns 3369976574 instead of -924990722.
This is still not applicable for 32-bit systems.
Is it possible to replace overflow32() with & 0x7FFFFFFF? Or would there be other issues?
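
On the last question: the two numbers quoted in the comment above are the same 32-bit pattern, read once as unsigned and once as signed (3369976574 - 2^32 = -924990722). Masking with 0x7FFFFFFF would clear the sign bit rather than reinterpret it, so every negative Java hash would come out different. A sketch, assuming a 64-bit PHP build; `toSigned32` is a hypothetical helper and not part of any answer here:

```php
<?php
// Hypothetical helper: reinterpret the low 32 bits of $v as a signed int.
// Assumes a 64-bit PHP build, where 0xFFFFFFFF is an integer literal.
function toSigned32(int $v): int
{
    $v &= 0xFFFFFFFF;            // keep the low 32 bits (0 .. 2^32 - 1)
    return $v >= 0x80000000      // is bit 31, the sign bit, set?
        ? $v - 0x100000000       // yes: map into -2^31 .. -1
        : $v;                    // no: already a valid non-negative int
}

var_dump(toSigned32(3369976574));   // int(-924990722)
var_dump(3369976574 & 0x7FFFFFFF);  // int(1222492926), not the Java value
```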
3

There is no such method available in PHP, so you will have to implement it yourself. Wikipedia gives the algorithm used by java.lang.String.hashCode(), which I think is the one used for strings, so here is a quick PHP version of it:

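<?php
function getStringHashCode($string){
  $hash = 0;
  $stringLength = strlen($string);
  for($i = 0; $i < $stringLength; $i++){
    // ord() gives the byte value; multiplying by the character itself would not.
    // No 32-bit wrap-around is applied here, so larger values diverge from Java.
    $hash = 31 * $hash + ord($string[$i]);
  }
  return $hash;
}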

2 Comments

Thanks for this code, but I need exactly the same hash that is used internally for building hash maps.
This function is wrong; gist.github.com/andreyknupp/5061911 has a correct implementation, but it still yields different hashes if the string has spaces in it.
1

spl_object_hash is probably the closest to what you want, but despite the name it does not really return a hash of the passed-in value, merely an internal unique identifier. I don't know whether it is the hash actually used under the hood for arrays etc.


1

Here are my two cents on implementing Java's hashCode in PHP:

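/**
 * Simulates Java's hashCode function:
 * hashes a string to a 32-bit value.
 * @param string $str the string to hash
 * @return int the hash as an unsigned 32-bit integer (on 64-bit PHP)
 */
function hashCode($str) {
    $str = (string)$str;
    $hash = 0;
    $len = strlen($str);
    if ($len == 0 )
        return $hash;

    for ($i = 0; $i < $len; $i++) {
        $h = $hash << 5;       // 32 * $hash
        $h -= $hash;           // 31 * $hash
        $h += ord($str[$i]);
        $hash = $h;
        // Keep the low 32 bits. The result is non-negative on 64-bit PHP,
        // whereas Java reports the same bit pattern as a signed int.
        $hash &= 0xFFFFFFFF;
    }
    return $hash;
}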


0

This does not give the same value as Java's hashCode, but it is very fast:

function hashCode32Signed($str)
{
    return unpack('l', hash('crc32', $str, true))[1];
} 

-1

A UTF-8 version with emoji support:

function str_hashcode($s){
    $hash = 0;
    $len = mb_strlen($s, 'UTF-8');
    if($len == 0 )
        return $hash;
    for ($i = 0; $i < $len; $i++) {
        $c = mb_substr($s, $i, 1, 'UTF-8');
        // Code point of the character, read from its little-endian UCS-4 encoding
        $cc = unpack('V', iconv('UTF-8', 'UCS-4LE', $c))[1];
        $hash = (($hash << 5) - $hash) + $cc;
        $hash &= $hash; // x & x is x for an integer, so this does not truncate to 32 bits
    }
    return $hash;
}

1 Comment

This function returns the same hash for 'aaaa-bbbb-cccc-dddd-eeee-ffff-gggg-hhhh-iiii-jjjj-kkkk-11111' as for 'aaaa-bbbb-cccc-dddd-eeee-ffff-gggg-hhhh-iiii-jjjj-kkkk-22222'. jshouchin's accepted answer also supports emoji.
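
For reference, Java's String.hashCode works on UTF-16 code units (char values), so a character outside the Basic Multilingual Plane, such as most emoji, contributes two surrogate units rather than one code point. A minimal sketch of a hash that follows that rule, assuming a 64-bit PHP build with the mbstring extension loaded; `javaHashCode` is a name chosen here for illustration:

```php
<?php
// Sketch: Java-compatible hashCode for a UTF-8 encoded PHP string.
// Assumes a 64-bit PHP build and the mbstring extension.
function javaHashCode(string $utf8): int
{
    if ($utf8 === '') {
        return 0;                        // Java: "".hashCode() == 0
    }
    // Java strings are sequences of UTF-16 code units, so convert first.
    $utf16 = mb_convert_encoding($utf8, 'UTF-16BE', 'UTF-8');
    $h = 0;
    // 'n*' unpacks every big-endian unsigned 16-bit unit in the string.
    foreach (unpack('n*', $utf16) as $unit) {
        $h = (31 * $h + $unit) & 0xFFFFFFFF;   // wrap to 32 bits each step
    }
    // Reinterpret the unsigned 32-bit value as a signed Java int.
    return $h >= 0x80000000 ? $h - 0x100000000 : $h;
}

var_dump(javaHashCode('hello'));      // int(99162322)
var_dump(javaHashCode('zzzzzz'));     // int(-685785664)
var_dump(javaHashCode("\u{1F600}"));  // int(1772899), from surrogates D83D DE00
```

Masking after every step keeps the intermediate value well below PHP_INT_MAX, so the running hash never turns into a float.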
