Numpy find number of occurrences in a 2D array

Question

Is there a numpy function to count the number of occurrences of a certain value in a 2D numpy array? E.g.

np.random.random((3,3))

array([[ 0.68878371,  0.2511641 ,  0.05677177],
       [ 0.97784099,  0.96051717,  0.83723156],
       [ 0.49460617,  0.24623311,  0.86396798]])

How do I find the number of times 0.83723156 occurs in this array?

Ritwik Bose · Accepted Answer · 2016-07-06 15:37:36Z

8

import numpy as np

arr = np.random.random((3,3))
# build a boolean mask of the elements exactly equal to the target value
condition = arr == 0.83723156
# count the elements
np.count_nonzero(condition)

The value of condition is a boolean array with the same shape as arr, where each entry records whether the corresponding element satisfied the condition. np.count_nonzero counts how many nonzero elements are in an array. In the case of booleans, it counts the number of elements with a True value.
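As a reproducible sketch of the above, here is the same idea on a fixed copy of the array from the question instead of random values. The names `target`, `mask`, `total` and `per_row` are illustrative and not part of the original answer.

```python
import numpy as np

# Fixed copy of the array from the question, so the result is reproducible.
arr = np.array([[0.68878371, 0.2511641, 0.05677177],
                [0.97784099, 0.96051717, 0.83723156],
                [0.49460617, 0.24623311, 0.86396798]])
target = 0.83723156  # illustrative name

mask = arr == target                      # boolean array, same shape as arr
total = np.count_nonzero(mask)            # True entries in the whole array
per_row = np.count_nonzero(mask, axis=1)  # True entries in each row
```

The exact comparison only matches here because the array was built from the same literal. For values that come out of a computation, a tolerance is usually safer.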

To be able to deal with floating point accuracy, you could do something like this instead:

condition = np.fabs(arr - 0.83723156) < 0.001
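A minimal sketch of the tolerance-based count, wrapped in a small helper. The function `count_near` and its `tol` parameter are hypothetical names chosen for illustration, and the tolerance has to be picked to suit your data.

```python
import numpy as np

def count_near(arr, value, tol=1e-3):
    """Count elements of arr within tol of value (hypothetical helper)."""
    return int(np.count_nonzero(np.abs(arr - value) < tol))

arr = np.array([[0.68878371, 0.2511641, 0.05677177],
                [0.97784099, 0.96051717, 0.83723156],
                [0.49460617, 0.24623311, 0.86396798]])

n_tight = count_near(arr, 0.8372)          # only 0.83723156 is within 0.001
n_loose = count_near(arr, 0.85, tol=0.02)  # 0.83723156 and 0.86396798 match
```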

edited Jul 6, 2016 at 15:37

answered Jul 6, 2016 at 15:31

Ritwik Bose


Imanol Luengo · Accepted Answer · 2016-07-06 16:15:46Z

3

For floating point arrays, np.isclose is a much better option than either comparing for exact equality or defining a custom range.

>>> a = np.array([[ 0.68878371,  0.2511641 ,  0.05677177],
                  [ 0.97784099,  0.96051717,  0.83723156],
                  [ 0.49460617,  0.24623311,  0.86396798]])

>>> np.isclose(a, 0.83723156).sum()
1

Note that real numbers are not represented exactly in a computer, which is why np.isclose works where == does not:

>>> (0.1 + 0.2) == 0.3
False

Instead:

>>> np.isclose(0.1 + 0.2, 0.3)
True

edited Jul 6, 2016 at 16:15

answered Jul 6, 2016 at 15:59

Imanol Luengo


1 Comment

senderle Over a year ago

I don't think it's a "much better option" than defining a range. The default tolerance values (rtol and atol) provided by isclose are arbitrary, and the results it generates are not always obvious or easy to predict -- to deal with complex floating point arithmetic, it does even more floating point arithmetic. A simple range is much easier to reason about precisely. Still, I agree that isclose is a useful alternative sometimes, so I linked to your answer from mine.
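To make the point about default tolerances concrete: np.isclose(a, b) tests |a - b| <= atol + rtol * |b|, with defaults rtol=1e-05 and atol=1e-08. The sketch below uses made-up values near zero, where the absolute term dominates and the defaults can match more than you might expect.

```python
import numpy as np

small = np.array([1e-9, 2e-9, 5e-9])  # illustrative values near zero

# Default atol=1e-8 is larger than every difference here,
# so all three values count as "close" to 1e-9.
default_count = int(np.isclose(small, 1e-9).sum())

# With atol=0 only the relative tolerance applies.
relative_count = int(np.isclose(small, 1e-9, rtol=1e-5, atol=0.0).sum())

# An explicit range states the intent directly.
range_count = int(((small > 0.5e-9) & (small < 1.5e-9)).sum())
```

Whichever approach you use, it helps to choose the tolerance or range deliberately rather than relying on the defaults.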

miki · Accepted Answer · 2022-09-28 09:35:22Z

1

If it may be of use to anyone: for very large 2D arrays, if you want to count how many times each element appears in the entire array, you can flatten the array into a list and then count the occurrences of every element:

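from collections import Counter
from itertools import chain

import numpy as np

# small stand-in for the large 2D array, which is called arr
arr = np.array([[1, 2, 2], [3, 2, 1]])
x = 2  # the value to look up

# flatten the 2D array into a list of its elements
flatten_arr = list(chain.from_iterable(arr))
dico_nodeid_appearence = Counter(flatten_arr)
# how many times x appeared in arr
dico_nodeid_appearence[x]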
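As a possible alternative that stays inside numpy, np.unique with return_counts=True produces the same kind of tally without building a Python list first. This is a sketch on a small made-up integer array; for floating point data it only groups values that are exactly equal.

```python
import numpy as np

# Small illustrative array standing in for the large one.
arr = np.array([[3, 7, 3],
                [1, 7, 3],
                [1, 1, 9]])

# np.unique flattens the input and returns the sorted distinct values;
# return_counts=True also returns how often each one occurs.
values, counts = np.unique(arr, return_counts=True)
occurrences = dict(zip(values.tolist(), counts.tolist()))
```

Looking up `occurrences[x]` then plays the same role as the Counter lookup above.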

answered Sep 28, 2022 at 9:35

miki


Community · Accepted Answer · 2017-05-23 12:33:00Z

To count the number of times x appears in any array, you can simply sum the boolean array that results from a == x:

>>> import numpy
>>> col = numpy.arange(3)
>>> cols = numpy.tile(col, 3)
>>> (cols == 1).sum()
3

It should go without saying, but I'll say it anyway: this is not very useful with floating point numbers unless you specify a range, like so:

>>> a = numpy.random.random((3, 3))
>>> ((a > 0.5) & (a < 0.75)).sum()
2

This general principle works for all sorts of tests. For example, if you want to count the number of floating point values that are integral:

>>> a = numpy.random.random((3, 3)) * 10
>>> a
array([[ 7.33955747,  0.89195947,  4.70725211],
       [ 6.63686955,  5.98693505,  4.47567936],
       [ 1.36965745,  5.01869306,  5.89245242]])
>>> a.astype(int)
array([[7, 0, 4],
       [6, 5, 4],
       [1, 5, 5]])
>>> (a == a.astype(int)).sum()
0
>>> a[1, 1] = 8
>>> (a == a.astype(int)).sum()
1
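A variant of the integral-value test that avoids the integer cast, sketched on a fixed array and assuming all values are finite. Comparing against np.floor stays in floating point, which matters for magnitudes such as 1e20 that do not fit in a 64-bit integer.

```python
import numpy as np

# Illustrative values, including one too large for a 64-bit integer.
a = np.array([[1.0, 2.5, -3.0],
              [4.75, 1e20, 0.0]])

# A finite float is integral exactly when flooring leaves it unchanged.
integral_mask = a == np.floor(a)
integral_count = int(integral_mask.sum())
```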

You can also use np.isclose() as described by Imanol Luengo, depending on what your goal is. But often, it's more useful to know whether values are in a range than to know whether they are arbitrarily close to some arbitrary value.

The problem with isclose is that its default tolerance values (rtol and atol) are arbitrary, and the results it generates are not always obvious or easy to predict. To deal with complex floating point arithmetic, it does even more floating point arithmetic! A simple range is much easier to reason about precisely. (This is an expression of a more general principle: first, do the simplest thing that could possibly work.)

Still, isclose and its cousin allclose have their uses. I usually use them to see if a whole array is very similar to another whole array, which doesn't seem to be your question.
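For the whole-array comparison mentioned above, a short sketch with made-up data: np.array_equal requires every element to match exactly, while np.allclose applies the same rtol and atol test as isclose to every pair.

```python
import numpy as np

a = np.array([[0.5, 0.25], [0.125, 1.0]])
b = a + 1e-10  # tiny illustrative perturbation

exactly_equal = bool(np.array_equal(a, b))  # every element must match exactly
nearly_equal = bool(np.allclose(a, b))      # default rtol and atol
```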

Thanks @senderle. How about comparing against a floating point number like -100.0, i.e. one with only a 0 after the decimal point?
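On the -100.0 case, a sketch under the assumption that the values were stored as the literal -100.0 rather than produced by inexact arithmetic. Integers of this size are exactly representable in float64, so == is reliable for them; if the values come out of a computation, a small tolerance is the safer test. The array below is made up for illustration.

```python
import numpy as np

arr = np.array([[-100.0, 3.5, -100.0],
                [-99.999999, 0.0, 100.0]])

# Exact comparison: matches only the two stored -100.0 values.
exact_count = int((arr == -100.0).sum())

# Tolerance comparison: also picks up -99.999999.
near_count = int((np.abs(arr - (-100.0)) < 1e-3).sum())
```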
