I have a numpy array like below. I need a count of rows where the first element is 2. So in the array below, four rows start with 2 - the answer would be 4. How is this best accomplished in numpy? (I cannot use pandas, but can use scipy).

array([[1, 4, 5],
       [1, 4, 5],
       [2, 4, 5],
       [2, 4, 5],
       [2, 4, 5],
       [2, 4, 5],
       [3, 4, 5],
       [3, 4, 5],
       [3, 4, 5],
       [3, 4, 5],
       [3, 4, 5],
       [3, 4, 5]])

3 Answers

5

First, take the first column, all rows:

a[:,0]

Then, find the 2s:

a[:,0] == 2

That gives you a boolean array, which you can then sum (each True counts as 1):

(a[:,0] == 2).sum()
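
For reference, here are the three steps put together as a minimal runnable sketch. The array is rebuilt from the question, and the intermediate names `first_col`, `mask` and `count` are only illustrative:

```python
import numpy as np

# The array from the question: 2 rows starting with 1, 4 with 2, 6 with 3.
a = np.array([[1, 4, 5]] * 2 + [[2, 4, 5]] * 4 + [[3, 4, 5]] * 6)

first_col = a[:, 0]      # first column, all rows
mask = first_col == 2    # boolean array, True where the row starts with 2
count = int(mask.sum())  # True is summed as 1, False as 0

print(count)  # 4
```

The `int(...)` call only converts the NumPy integer to a plain Python int; the count itself comes from the sum.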

3

There is also np.count_nonzero, which is commonly applied to the boolean array produced by evaluating a condition:

np.count_nonzero(data[:, 0] == 2)

By the way, this is probably just for the sake of the example, but if the first column is sorted, as it is in yours, you can also use np.searchsorted:

np.diff(np.searchsorted(data[:, 0], (2, 3)))[0]
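
The one-liner above brackets the run of 2s by searching for 2 and 3, which works because the column holds integers. A possible variant, sketched here under the assumption that the column is sorted in ascending order, searches for the same value from the left and from the right, so it does not depend on knowing the next larger value. The names `col`, `lo` and `hi` are illustrative:

```python
import numpy as np

# The array from the question; `data` follows the naming in this answer.
data = np.array([[1, 4, 5]] * 2 + [[2, 4, 5]] * 4 + [[3, 4, 5]] * 6)
col = data[:, 0]  # assumed sorted ascending; searchsorted is not valid otherwise

lo = np.searchsorted(col, 2, side="left")   # index of the first 2
hi = np.searchsorted(col, 2, side="right")  # index just past the last 2
count_sorted = int(hi - lo)

# Cross-check against the general-purpose approach.
assert count_sorted == np.count_nonzero(col == 2)
print(count_sorted)  # 4
```

On a sorted column this is a binary search rather than a full scan, which may matter for large arrays; on unsorted data it silently gives wrong answers, so np.count_nonzero is the safer default.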

1

One more approach, in addition to the ones above:

>>> x[:,0]==2
array([False, False,  True,  True,  True,  True, False, False, False,
       False, False, False], dtype=bool)

This gives True for the rows whose first column is 2.

>>> x[x[:,0]==2]
array([[2, 4, 5],
       [2, 4, 5],
       [2, 4, 5],
       [2, 4, 5]])

This gives you the rows that satisfy the condition. Now you can use the shape attribute to get the number of rows:

x[x[:,0]==2].shape[0]
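
As a quick sanity check, the sketch below compares this with the mask-based counts on the question's array. The helper `count_rows_starting_with` is hypothetical and not part of NumPy. Note that boolean indexing builds a copy of the selected rows, so counting the mask directly avoids that extra allocation.

```python
import numpy as np

def count_rows_starting_with(arr, value):
    """Hypothetical helper: count rows whose first element equals `value`."""
    return int(np.count_nonzero(arr[:, 0] == value))

# The array from the question; `x` follows the naming in this answer.
x = np.array([[1, 4, 5]] * 2 + [[2, 4, 5]] * 4 + [[3, 4, 5]] * 6)

by_shape = x[x[:, 0] == 2].shape[0]   # selects (copies) the matching rows first
by_sum = int((x[:, 0] == 2).sum())    # sums the boolean mask
by_helper = count_rows_starting_with(x, 2)

print(by_shape, by_sum, by_helper)  # 4 4 4
```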
