I have a boolean mask array a of length n:

a = np.array([True, True, True, False, False])

I have a 2d array with n columns:

b = np.array([[1,2,3,4,5], [1,2,3,4,5]])

I want a new array that contains only the columns where a is True, for example:

c = ([[1,2,3], [1,2,3]])

c = a * b does not work because it keeps the False columns and fills them with 0, which I don't want.

c = np.delete(b, a, 1) does not work either.

Any suggestions?

3 Answers

You probably want something like this:

>>> a = np.array([True, True, True, False, False])
>>> b = np.array([[1,2,3,4,5], [1,2,3,4,5]])
>>> b[:,a]
array([[1, 2, 3],
       [1, 2, 3]])

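Note that the mask should be a boolean ndarray, like the one you were using, not a plain Python list. Older NumPy versions interpret the False and True in a list as the integers 0 and 1 and give you those columns instead (recent versions treat a boolean list as a mask, but an explicit boolean ndarray behaves the same everywhere):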

>>> b[:,[True, True, True, False, False]]   
array([[2, 2, 2, 1, 1],
       [2, 2, 2, 1, 1]])
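
If the mask may arrive as a list or as 0/1 integers, one way to make the column selection independent of the NumPy version is to coerce it to a boolean ndarray first. This is only a minimal sketch: select_columns is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def select_columns(arr, mask):
    """Return the columns of a 2-D array where mask is truthy (hypothetical helper)."""
    # Coerce lists or 0/1 integers to a boolean ndarray so NumPy
    # always treats the values as a mask, never as column indices.
    mask = np.asarray(mask, dtype=bool)
    if mask.shape != (arr.shape[1],):
        raise ValueError("mask length must match the number of columns")
    return arr[:, mask]

b = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])
c = select_columns(b, [True, True, True, False, False])
```

Here c holds the first three columns of b, and passing [1, 1, 1, 0, 0] selects the same columns.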

4 Comments

I've used this solution and it works well! But when scaling up to an ndarray of shape (2800000,600), using a mask with 200 True values is slow. Are there any optimisations?
2.8M? Normally I would suggest just in time compiling - numba.pydata.org - not actually sure it will help here.
Try numpy.compress (for bools) or numpy.take (for indices), see stackoverflow.com/q/46041811/882436
I computed the mask and had to cast it to a boolean dtype, so I used b[:,a.astype(bool)]
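
The alternatives mentioned in these comments can be sketched as follows, using the arrays from the question. The sketch only shows that the calls give the same result as b[:,a]. Whether any of them is faster on a large array is an assumption that would need to be timed on your own data.

```python
import numpy as np

a = np.array([True, True, True, False, False])
b = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])

# Boolean mask applied along the column axis.
c_compress = np.compress(a, b, axis=1)

# Convert the mask to integer column indices once, then gather them.
keep = np.flatnonzero(a)
c_take = np.take(b, keep, axis=1)

# np.delete expects the columns to drop, so pass the indices where a is False.
c_delete = np.delete(b, np.flatnonzero(~a), axis=1)

# A mask stored as 0/1 integers must be cast to bool before indexing.
a_int = np.array([1, 1, 1, 0, 0])
c_cast = b[:, a_int.astype(bool)]
```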

You can use the numpy.ma module (imported here with import numpy.ma as ma) and its masked_array function. Note that this hides the masked entries rather than removing them:

>>> x = np.array([1, 2, 3, -1, 5])
>>> mx = ma.masked_array(x, mask=[0, 0, 0, 1, 0])
>>> mx
masked_array(data=[1, 2, 3, --, 5],
             mask=[False, False, False,  True, False],
       fill_value=999999)
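
A masked array keeps the hidden entries in place. To actually drop them, as the question asks, one option is compressed() for 1-D data or np.ma.compress_cols for 2-D data. A sketch, assuming the arrays a and b from the question. In a masked array, True marks what to hide, so a is inverted:

```python
import numpy as np

x = np.array([1, 2, 3, -1, 5])
mx = np.ma.masked_array(x, mask=[0, 0, 0, 1, 0])
kept = mx.compressed()  # 1-D ndarray holding only the unmasked values

a = np.array([True, True, True, False, False])
b = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])

# Repeat the inverted column mask for every row of b.
mask2d = np.tile(~a, (b.shape[0], 1))
mb = np.ma.masked_array(b, mask=mask2d)

# Drop every column that contains a masked value.
c = np.ma.compress_cols(mb)
```

For the plain "select these columns" case, b[:,a] is shorter. The masked-array route mainly pays off when the mask is needed for later computations as well.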

Hope I'm not too late! Here's your array:

X = np.array([[1, 2, 3, 4, 5], 
              [1, 2, 3, 4, 5]])

Let's create an array of zeros of the same shape as X:

mask = np.zeros_like(X)
# array([[0, 0, 0, 0, 0],
#        [0, 0, 0, 0, 0]])

Then, specify the columns that you want to mask out or hide with a 1. In this case, we want the last 2 columns to be masked out.

mask[:, -2:] = 1
# array([[0, 0, 0, 1, 1],
#        [0, 0, 0, 1, 1]])

Create a masked array:

X_masked = np.ma.masked_array(X, mask)
# masked_array(data=[[1, 2, 3, --, --],
#                    [1, 2, 3, --, --]],
#              mask=[[False, False, False,  True,  True],
#                    [False, False, False,  True,  True]],
#              fill_value=999999)

We can then do whatever we want with X_masked, like taking the sum of each column (along axis=0):

np.sum(X_masked, axis=0)
# masked_array(data=[2, 4, 6, --, --],
#              mask=[False, False, False,  True,  True],
#              fill_value=999999)

Great thing about this is that X_masked is just a view of X, not a copy.

X_masked.base is X
# True
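
Because the data is shared, changes to X show up in the masked array. The sketch below illustrates this and also reduces the column sums to the unmasked columns only. The value 10 written into X is purely for illustration.

```python
import numpy as np

X = np.array([[1, 2, 3, 4, 5],
              [1, 2, 3, 4, 5]])
mask = np.zeros_like(X)
mask[:, -2:] = 1
X_masked = np.ma.masked_array(X, mask)

# The masked array wraps X's buffer instead of copying it.
shares = np.shares_memory(X_masked.data, X)

# Writing to X is therefore visible through X_masked.
X[0, 0] = 10
col_sums = np.sum(X_masked, axis=0)

# Keep only the sums of the unmasked columns as a plain ndarray.
visible_sums = col_sums.compressed()
```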
