Count the number of non zero values in a numpy array in Numba

Question

Very simple: I am trying to count the number of non-zero values in a NumPy array inside a function that is JIT-compiled with Numba (njit()). None of the following attempts is accepted by Numba:

a[a != 0].size
np.count_nonzero(a)
len(a[a != 0])
len(a) - len(a[a == 0])

I don't want to use for loops if there is a faster, more Pythonic and elegant way.

For the commenter who wanted to see a full code example:

import numpy as np
from numba import njit

@njit()
def n_nonzero(a):
    return a[a != 0].size

Please show at least one actual, complete piece of code you tried, including import statements, decorators and sample test harness.
– Mark Setchell, Commented Feb 22, 2019 at 15:48

javidcf · Accepted Answer · 2019-02-22 17:25:58Z

6

You may also consider, well, counting the nonzero values:

import numba as nb

@nb.njit()
def count_loop(a):
    s = 0
    for i in a:
        if i != 0:
            s += 1
    return s

I know it seems wrong, but bear with me:

import numpy as np
import numba as nb

@nb.njit()
def count_loop(a):
    s = 0
    for i in a:
        if i != 0:
            s += 1
    return s

@nb.njit()
def count_len_nonzero(a):
    return len(np.nonzero(a)[0])

@nb.njit()
def count_sum_neq_zero(a):
    return (a != 0).sum()

np.random.seed(100)
a = np.random.randint(0, 3, 1000000000, dtype=np.uint8)
c = np.count_nonzero(a)
assert count_len_nonzero(a) == c
assert count_sum_neq_zero(a) == c
assert count_loop(a) == c

%timeit count_len_nonzero(a)
# 5.94 s ± 141 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit count_sum_neq_zero(a)
# 848 ms ± 80.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit count_loop(a)
# 189 ms ± 4.41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

It is in fact faster than np.count_nonzero, which can get quite slow for some reason:

%timeit np.count_nonzero(a)
# 4.36 s ± 69.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
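One caveat worth sketching: count_loop iterates with "for i in a", which walks the first axis, so it is only expected to work as intended for 1-D input. The variant below is a minimal sketch, assuming that iterating over a.flat is acceptable for your use case; the name count_loop_nd is hypothetical and not part of the answer above. The try/except shim is also an addition, there only so the sketch still runs (without compilation) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Hypothetical fallback: a no-op decorator so the sketch runs without Numba.
    def njit(*args, **kwargs):
        # Handle both @njit and @njit()
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func

@njit()
def count_loop_nd(a):
    # a.flat visits every element, whatever the number of dimensions
    s = 0
    for x in a.flat:
        if x != 0:
            s += 1
    return s

a2d = np.array([[0, 1, 2], [0, 0, 3]], dtype=np.uint8)
print(count_loop_nd(a2d), np.count_nonzero(a2d))
```

Both printed numbers should agree, since the loop and np.count_nonzero count the same elements.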

edited Feb 22, 2019 at 17:25

answered Feb 22, 2019 at 17:20

javidcf


1 Comment

MSeifert Over a year ago

Yeah, numba really excels when it sees a loop it can optimize. +1

MSeifert · Accepted Answer · 2019-02-22 17:40:56Z

3

If you need it to be really fast for large arrays, you can even use Numba's prange to compute the count in parallel (for small arrays it will be slower because of the parallel-processing overhead).

import numpy as np
from numba import njit, prange

@njit(parallel=True)
def parallel_nonzero_count(arr):
    flattened = arr.ravel()
    sum_ = 0
    for i in prange(flattened.size):
        sum_ += flattened[i] != 0
    return sum_

Note that with Numba you normally want to write out your loops explicitly, because optimizing loops is what Numba is really good at.

I actually timed it against the other solutions mentioned here (using my Python module simple_benchmark):

Code to reproduce:

import numpy as np
from numba import njit, prange

@njit
def n_nonzero(a):
    return a[a != 0].size

@njit
def count_non_zero(np_arr):
    return len(np.nonzero(np_arr)[0])

@njit()
def methodB(a):
    return (a != 0).sum()

@njit(parallel=True)
def parallel_nonzero_count(arr):
    flattened = arr.ravel()
    sum_ = 0
    for i in prange(flattened.size):
        sum_ += flattened[i] != 0
    return sum_

@njit()
def count_loop(a):
    s = 0
    for i in a:
        if i != 0:
            s += 1
    return s

from simple_benchmark import benchmark

args = {}
for exp in range(2, 20):
    size = 2**exp
    arr = np.random.random(size)
    arr[arr < 0.3] = 0.0
    args[size] = arr

b = benchmark(
    funcs=(n_nonzero, count_non_zero, methodB, np.count_nonzero, parallel_nonzero_count, count_loop),
    arguments=args,
    argument_name='array size',
    warmups=(n_nonzero, count_non_zero, methodB, np.count_nonzero, parallel_nonzero_count, count_loop)
)

edited Feb 22, 2019 at 17:40

answered Feb 22, 2019 at 17:29

MSeifert


2 Comments

javidcf Over a year ago

Is it safe sharing sum_ between parallel loop iterations? (I don't know much about the guarantees of parallelized Numba)

MSeifert Over a year ago

Yes, Numba has a few reductions that it can safely parallelize; summation and multiplication are among them. Numba recognizes that it can run the loop in parallel by starting each worker with its own sum_ = 0 and then adding the partial results together once every worker has finished. I also checked the result for consistency against np.count_nonzero.
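The reduction idea described in this comment can be illustrated sequentially. The following is only a sketch of the concept in plain NumPy, not how Numba implements it internally; chunked_nonzero_count and n_chunks are hypothetical names introduced for illustration.

```python
import numpy as np

def chunked_nonzero_count(arr, n_chunks=4):
    flat = arr.ravel()
    partial = []
    # Each chunk stands in for the share of iterations given to one worker.
    for chunk in np.array_split(flat, n_chunks):
        local = 0  # every "worker" starts from its own zero
        for x in chunk:
            local += int(x != 0)
        partial.append(local)
    # The partial counts are combined only after all chunks are done.
    return sum(partial)

arr = np.array([0, 3, 0, 5, 7, 0, 0, 1, 2])
print(chunked_nonzero_count(arr))  # 5
```

Because addition is associative and no chunk reads another chunk's running total, the order in which the chunks finish does not affect the result.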

Chris · Accepted Answer · 2019-02-22 15:57:23Z

1

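You can use np.nonzero and take the length of its first element:

import numpy as np
from numba import njit

@njit
def count_non_zero(np_arr):
    # np.nonzero returns a tuple of index arrays; [0] selects the first one
    return len(np.nonzero(np_arr)[0])

count_non_zero(np.array([0,1,0,1]))
# 2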
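As a brief illustration of why the [0] matters: np.nonzero returns a tuple with one index array per dimension, so the length of the tuple itself is the number of dimensions, not the number of non-zero values. The sketch below uses plain NumPy, without Numba, purely to show the shapes involved.

```python
import numpy as np

a1d = np.array([0, 1, 0, 1])
res1d = np.nonzero(a1d)
print(len(res1d))      # 1 (one index array, because a1d is 1-D)
print(len(res1d[0]))   # 2 (the number of non-zero values)

a2d = np.array([[0, 1], [2, 0]])
res2d = np.nonzero(a2d)
print(len(res2d))      # 2 (one index array per dimension)
print(len(res2d[0]))   # 2 (row indices of the two non-zero values)
```

Every index array in the tuple has the same length, so taking the first one works for any number of dimensions.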

answered Feb 22, 2019 at 15:57

Chris


2 Comments

SARose Over a year ago

that [0] seems to be the thing that did it. Thank you very much!

hpaulj Over a year ago

Funny thing is that np.nonzero uses np.count_nonzero (at the C-API level) to determine the size of the arrays that it will fill on a second iteration. I thought the whole point of using Numba was to be able to iterate with impunity. :)

Mark Setchell · Accepted Answer · 2019-02-22 17:05:52Z

Not sure if I have made a mistake here, but this seems 6x faster:

# Make something worth checking
a=np.random.randint(0,3,1000000000,dtype=np.uint8)

In [41]: @njit()
    ...: def methodA(a):
    ...:     return len(np.nonzero(a)[0])

# Call and check result
In [42]: methodA(a)
Out[42]: 666644445

In [43]: %timeit methodA(a)
4.65 s ± 28.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [44]: @njit()
    ...: def methodB(a):
    ...:     return (a!=0).sum()

# Call and check result
In [45]: methodB(a)
Out[45]: 666644445

In [46]: %timeit methodB(a)
724 ms ± 14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
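One plausible contributor to the gap (an assumption on my part, not something measured in this answer) is the size of the temporary each approach builds: np.nonzero materializes an array of integer indices, while a != 0 builds a one-byte-per-element boolean mask. The sketch below compares the two in plain NumPy on a smaller array; what Numba allocates internally may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 3, 1_000_000, dtype=np.uint8)

idx = np.nonzero(a)[0]   # index array: one integer per non-zero element
mask = a != 0            # boolean mask: one byte per element of a

print(idx.dtype, idx.nbytes)
print(mask.dtype, mask.nbytes)

# Both routes give the same count
assert len(idx) == int(mask.sum()) == np.count_nonzero(a)
```

On a typical 64-bit platform each index takes 8 bytes, so with about two thirds of the values non-zero the index array is several times larger than the mask.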
