4

Is there a NumPy function to count the number of occurrences of a certain value in a 2D NumPy array? For example:

np.random.random((3,3))

array([[ 0.68878371,  0.2511641 ,  0.05677177],
       [ 0.97784099,  0.96051717,  0.83723156],
       [ 0.49460617,  0.24623311,  0.86396798]])

How do I find the number of times 0.83723156 occurs in this array?

4 Answers

8
import numpy as np

arr = np.random.random((3,3))
# boolean mask: True where an element equals the target value exactly
condition = arr == 0.83723156
# count the True entries
np.count_nonzero(condition)

The value of condition is a boolean array with the same shape as arr; each entry records whether the corresponding element satisfied the condition. np.count_nonzero counts how many nonzero elements are in the array. For a boolean array, that is the number of True entries.

To allow for floating-point rounding error, you could compare within a tolerance instead:

condition = np.fabs(arr - 0.83723156) < 0.001
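As a self-contained sketch of the tolerance idea, something like the following could work. The helper name count_close and the default tolerance of 1e-6 are assumptions made for illustration, not part of NumPy; the array is the one from the question.

```python
import numpy as np

def count_close(arr, value, tol=1e-6):
    """Count elements of arr whose absolute difference from value is below tol.

    Hypothetical helper; tol is an illustrative default, pick one that
    matches the precision of your data.
    """
    return int(np.count_nonzero(np.abs(arr - value) < tol))

arr = np.array([[0.68878371, 0.2511641, 0.05677177],
                [0.97784099, 0.96051717, 0.83723156],
                [0.49460617, 0.24623311, 0.86396798]])

n_exact = count_close(arr, 0.83723156)       # tight tolerance around the target
n_loose = count_close(arr, 0.84, tol=0.03)   # wider window: 0.81 < x < 0.87
```

With a wider tolerance more elements can match, so the tolerance should reflect how close "equal" needs to be for your data.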

3

For floating-point arrays, np.isclose is a much better option than either testing for exact equality or defining a custom range.

>>> a = np.array([[ 0.68878371,  0.2511641 ,  0.05677177],
                  [ 0.97784099,  0.96051717,  0.83723156],
                  [ 0.49460617,  0.24623311,  0.86396798]])

>>> np.isclose(a, 0.83723156).sum()
1

Note that most real numbers cannot be represented exactly in floating point, which is why np.isclose can succeed where == fails:

>>> (0.1 + 0.2) == 0.3
False

Instead:

>>> np.isclose(0.1 + 0.2, 0.3)
True
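If the default tolerances are not what you want, they can be set explicitly. np.isclose(a, b) tests abs(a - b) <= atol + rtol * abs(b), with defaults rtol=1e-05 and atol=1e-08. The sketch below sets rtol to zero so that only the absolute tolerance applies; the particular atol values are assumptions chosen for illustration.

```python
import numpy as np

a = np.array([[0.68878371, 0.2511641, 0.05677177],
              [0.97784099, 0.96051717, 0.83723156],
              [0.49460617, 0.24623311, 0.86396798]])
target = 0.83723156

# rtol=0.0 disables the relative term, leaving a purely absolute tolerance.
count_tight = int(np.isclose(a, target, rtol=0.0, atol=1e-9).sum())

# A much wider absolute tolerance also matches nearby values.
count_wide = int(np.isclose(a, target, rtol=0.0, atol=0.05).sum())
```

Passing the tolerances explicitly makes the comparison window easy to see at the call site.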

1 Comment

I don't think it's a "much better option" than defining a range. The default tolerance values (rtol and atol) provided by isclose are arbitrary, and the results it generates are not always obvious or easy to predict -- to deal with complex floating point arithmetic, it does even more floating point arithmetic. A simple range is much easier to reason about precisely. Still, I agree that isclose is a useful alternative sometimes, so I linked to your answer from mine.
1

If it may be of use to anyone: for very large 2D arrays, if you want to count how many times each element appears in the entire array, you could flatten the array into a list and then count the occurrences of each element:

from itertools import chain
from collections import Counter

# the large 2D array is called arr
flatten_arr = list(chain.from_iterable(arr))
dico_nodeid_appearence = Counter(flatten_arr)
# how many times x appeared in arr
dico_nodeid_appearence[x]
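The same per-value counts can also be obtained without leaving NumPy, using np.unique with return_counts=True. This is an alternative to the Counter approach above, not the method this answer uses; the small integer array is made up for illustration.

```python
import numpy as np

# Illustrative data; any array of hashable, comparable values works similarly.
arr = np.array([[1, 2, 2],
                [3, 2, 1],
                [1, 1, 4]])

# np.unique flattens the array and returns the sorted distinct values,
# along with how many times each one occurs.
values, counts = np.unique(arr, return_counts=True)

# Build a plain dict mapping value -> number of occurrences.
occurrences = dict(zip(values.tolist(), counts.tolist()))
```

As with Counter, this relies on exact equality, so for floating-point data it only groups values that are bit-for-bit identical.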


0

To count the number of times x appears in any array, you can simply sum the boolean array that results from a == x:

>>> import numpy
>>> col = numpy.arange(3)
>>> cols = numpy.tile(col, 3)
>>> (cols == 1).sum()
3

It should go without saying, but I'll say it anyway: this is not very useful with floating point numbers unless you specify a range, like so:

>>> a = numpy.random.random((3, 3))
>>> ((a > 0.5) & (a < 0.75)).sum()
2

This general principle works for all sorts of tests. For example, if you want to count the number of floating point values that are integral:

>>> a = numpy.random.random((3, 3)) * 10
>>> a
array([[ 7.33955747,  0.89195947,  4.70725211],
       [ 6.63686955,  5.98693505,  4.47567936],
       [ 1.36965745,  5.01869306,  5.89245242]])
>>> a.astype(int)
array([[7, 0, 4],
       [6, 5, 4],
       [1, 5, 5]])
>>> (a == a.astype(int)).sum()
0
>>> a[1, 1] = 8
>>> (a == a.astype(int)).sum()
1

You can also use np.isclose() as described by Imanol Luengo, depending on what your goal is. But often, it's more useful to know whether values are in a range than to know whether they are arbitrarily close to some arbitrary value.

The problem with isclose is that its default tolerance values (rtol and atol) are arbitrary, and the results it generates are not always obvious or easy to predict. To deal with complex floating point arithmetic, it does even more floating point arithmetic! A simple range is much easier to reason about precisely. (This is an expression of a more general principle: first, do the simplest thing that could possibly work.)

Still, isclose and its cousin allclose have their uses. I usually use them to see if a whole array is very similar to another whole array, which doesn't seem to be your question.

1 Comment

Thanks @senderle. How about comparing against a floating-point number like -100.0, i.e. one with only a 0 after the decimal point?
