10

I just want to find some fastest set bits count function in the php.

For example, 0010101 => 3, 00011110 => 4

I saw there is good Algorithm that can be implemented in c++. How to count the number of set bits in a 32-bit integer?

Is there any php built-in function or fastest user-defined function?

5
  • What function di you try .? Commented May 31, 2013 at 2:40
  • I think the '00110101' is not a string but the binary representation of the integer. Commented May 31, 2013 at 2:45
  • Just convert it to a string with decbin(). It's almost certainly going to be faster than running a loop. Commented May 31, 2013 at 2:57
  • @cleong Not sure because string search do loop Commented May 31, 2013 at 2:59
  • @MatRt build-in loop. Benchmark needed. I would be more concerned about double function call. function countSetBits($int){ return substr_count(decbin($int), '1'); } Commented May 31, 2013 at 3:00

6 Answers 6

11

You can try to apply a mask with a binary AND, and use shift to test bit one by one, using a loop that will iterate 32 times.

function getBitCount($value) {

    $count = 0;
    while($value)
    {
        $count += ($value & 1);
        $value = $value >> 1;
    }

    return $count;
}

You can also easily put your function into PHP style

function NumberOfSetBits($v)
{
    $c = $v - (($v >> 1) & 0x55555555);
    $c = (($c >> 2) & 0x33333333) + ($c & 0x33333333);
    $c = (($c >> 4) + $c) & 0x0F0F0F0F;
    $c = (($c >> 8) + $c) & 0x00FF00FF;
    $c = (($c >> 16) + $c) & 0x0000FFFF;
    return $c;
}
Sign up to request clarification or add additional context in comments.

7 Comments

Indeed, little mistake, it is a BINARY AND
NumberOfSetBits() function is faster ten times that getBitCount(), but if $i = 1025, then it returns 258, can you fix this problem? My Input limit is 2000.
Indeed, it seems to be an error in this algo, check my update.
Be careful with the first approach. Any number higher than 0x7FFFFFFF (negative values) might cause an infinite loop (at least on 32-bit PHP).
getBitCount doesn't work with negative numbers it is infinite loop (using 32 bit integers). As you don't mention this in the answer I should vote you down. But I will not do it coz you bring the next approach.
|
4

There are a number of other ways; but for a decimal 32 bit integer, NumberOfSetBits is definitely the fastest.

I recently stumbled over Brian Kernighan´s algorithm, which has O(log(n)) instead of most of the others having O(n). I don´t know why it´s not appearing that fast here; but it still has a measurable advantage over all other non-specialized functions.

Of course, nothing can beat NumberOfSetBits with O(1).

my benchmarks:

function getBitCount($value) { $count = 0; while($value) { $count += ($value & 1); $value = $value >> 1; } return $count; }

function getBitCount2($value) { $count = 0; while($value) { if ($value & 1)$count++; $value >>= 1; } return $count; }
// if() instead of +=; >>=1 instead of assignment: sometimes slower, sometimes faster
function getBitCount2a($value) { for($count = 0;$value;$value >>= 1) if($value & 1)$count ++; return $count; }
// for instead of while: sometimes slower, sometimes faster

function getBitCount3($value) { for($i=1,$count=0;$i;$i<<=1) if($value&$i)$count++; return $count; }
// shifting the mask: incredibly slow (always shifts all bits)
function getBitCount3a($value) { for($i=1,$count=0;$i;$i<<=1) !($value&$i) ?: $count++; return $count; }
// with ternary instead of if: even slower

function NumberOfSetBits($v) {
// longest (in source code bytes), but fastest
    $c = $v - (($v >> 1) & 0x55555555); $c = (($c >> 2) & 0x33333333) + ($c & 0x33333333);
    $c = (($c >> 4) + $c) & 0x0F0F0F0F; $c = (($c >> 8) + $c) & 0x00FF00FF;
    $c = (($c >> 16) + $c) & 0x0000FFFF;    return $c;
}

function bitsByPregReplace($n) { return strlen(preg_replace('_0_','',decbin($n))); }
function bitsByNegPregReplace($n) { return strlen(preg_replace('/[^1]/','',decbin($n))); }
function bitsByPregMatchAll($n) { return preg_match_all('/1/',decbin($n)); }

function bitsBySubstr($i) { return substr_count(decbin($i), '1'); }
function bitsBySubstrInt($i) { return substr_count(decbin($i), 1); }
// shortest (in source code bytes)

function bitsByCountChars($n){ return count_chars(decbin($n))[49]; }
// slowest by far
function bitsByCountChars1($n) { return count_chars(decbin($n),1)[49]; }
// throws a notice for $n=0

function Kernighan($n) { for(;$n;$c++)$n&=$n-1;return$c; }
// Brian Kernighan’s Algorithm

function benchmark($function)
{
    gc_collect_cycles();
    $t0=microtime();
    for($i=1e6;$i--;) $function($i);
    $t1=microtime();
    $t0=explode(' ', $t0); $t1=explode(' ', $t1);
    echo ($t1[0]-$t0[0])+($t1[1]-$t0[1]), " s\t$function\n";
}

benchmark('getBitCount');
benchmark('getBitCount2');
benchmark('getBitCount2a');
benchmark('getBitCount3');
benchmark('getBitCount3a');
benchmark('NumberOfSetBits');
benchmark('bitsBySubstr');
benchmark('bitsBySubstrInt');
benchmark('bitsByPregReplace');
benchmark('bitsByPregMatchAll');
benchmark('bitsByCountChars');
benchmark('bitsByCountChars1');
benchmark('decbin');

banchmark results (sorted)

> php count-bits.php
 2.286831 s     decbin

 1.364934 s     NumberOfSetBits
 3.241821 s     Kernighan

 3.498779 s     bitsBySubstr*
 3.582412 s     getBitCount2a
 3.614841 s     getBitCount2
 3.751102 s     getBitCount
 3.769621 s     bitsBySubstrInt*

 5.806785 s     bitsByPregMatchAll*
 5.748319 s     bitsByCountChars1*
 6.350801 s     bitsByNegPregReplace*
 6.615289 s     bitsByPregReplace*

13.863838 s     getBitCount3
16.39626 s      getBitCount3a
19.304038 s     bitsByCountChars*

Those are the numbers from one of my runs (with PHP 7.0.22); others showed different order within the 3.5 seconds group. I can say that - on my machine - four of those five are pretty equal, and bitsBySubstrInt is always a little slower due to the typecasts.

Most other ways require a decbin (which mostly takes longer than the actual counting; I marked them with a * in the benchmark results); only BitsBySubstr would get close to the winner without that gammy leg.

I find it noticeable that you can make count_chars 3 times faster by limiting it to only existing chars. Seems like array indexing needs quite some time.


edit:

  • added another preg_replace version
  • fixed preg_match_all version
  • added Kernighan´s algorithm (fastest algorithm for arbitrary size integers)
  • added garbage collection to benchmarking function
  • reran benchmarks

2 Comments

Is there a typo in the timing for NumberOfSetBits? If not, it is in the wrong clump.
I would expect Kernighan to be faster when there are fewer bits, not "arbitrary size ints".
3

I could figure out a few ways to but not sure which one would be the fastest :

  • use substr_count()
  • replace all none '1' characters by '' and then use strlen()
  • use preg_match_all()

PS : if you start with a integer these examples would involve using decbin() first.

Comments

2

My benchmarking code

start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
    getBitCount($i);
}
end_benchmark();

start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
    NumberOfSetBits($i);
}
end_benchmark();
start_benchmark();
for ($i = 0; $i < 1000000; $i++) {
    substr_count(decbin($i), '1');
}
end_benchmark();

Benchmarking result:

benchmark (NumberOfSetBits()) : 1.429042 milleseconds

benchmark (substr_count()) : 1.672635 milleseconds

benchmark (getBitCount()): 10.464981 milleseconds

I think NumberOfSetBits() and substr_count() are best. Thanks.

1 Comment

PHP has always been 'weirdly' fast with strings, if you limit your benchmark to smaller numbers you will see that substr_count(decbin($i), '1'); will become faster then NumberOfSetBits().
2

This option is a little faster than NumberOfSetBits($v)

function bitsCount(int $integer)
{
    $count = $integer - (($integer >> 1) & 0x55555555);
    $count = (($count >> 2) & 0x33333333) + ($count & 0x33333333);
    $count = ((((($count >> 4) + $count) & 0x0F0F0F0F) * 0x01010101) >> 24) & 0xFF;

    return $count;
}

Benckmark (PHP8)

1.78 s  bitsBySubstr
1.42 s  NumberOfSetBits
1.11 s  bitsCount

Comments

0

Here is another solution. Maybe not the fastet but therefor the shortest solution. It also works for negative numbers:

function countBits($num)
{
   return substr_count(decbin($num), "1");
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.