9

Above all, sorry for my bad English.

I have this array t:

array([[ 0,  1,  2,  0,  4,  5,  6,  7,  8,  9],
       [ 0, 11,  0, 13,  0, 15,  0, 17, 18,  0]])

I would like to delete the columns where the value in the second row is zero. Here, that means deleting columns 0, 2, 4, 6 and 9 to obtain this array:

array([[ 1,  0,  5,  7,  8],
       [11, 13, 15, 17, 18]])

I tried with np.sum() but didn't succeed.

5 Answers

20

Similar to Juh_'s answer, but more expressive, and it avoids some minor, unnecessary performance overhead. A grand total of 12 highly pythonic, explicit and unambiguous characters. This is really numpy 101; if you are still trying to wrap your head around this, you would do yourself a favor by reading a numpy primer.

import numpy as np
a = np.array([[ 0,  1,  2,  0,  4,  5,  6,  7,  8,  9],
              [ 0, 11,  0, 13,  0, 15,  0, 17, 18,  0]])
print(a[:, a[1] != 0])
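
The one-liner above may be easier to follow when split into two steps. The following is a minimal sketch of the same boolean-mask idea; the names `mask` and `result` are introduced here for illustration only and do not appear in the answer.

```python
import numpy as np

a = np.array([[0, 1, 2, 0, 4, 5, 6, 7, 8, 9],
              [0, 11, 0, 13, 0, 15, 0, 17, 18, 0]])

# Boolean mask over the columns: True where the second row is non-zero.
mask = a[1] != 0

# Keep every row (:) but only the columns where the mask is True.
result = a[:, mask]

print(mask)
print(result)
```

The mask has one entry per column, so it can be inspected or combined with other conditions (for example with `&`) before it is used for indexing.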

6

With numpy.delete:

import numpy as np

a = np.array([[0, 1, 2, 0, 4, 5, 6, 7, 8, 9], [0, 11, 0, 13, 0, 15, 0, 17, 18, 0]])

indices = [i for (i,v) in enumerate(a[1]) if v==0]
# [0, 2, 4, 6, 9]

a = np.delete(a, indices, 1)
# array([[ 1,  0,  5,  7,  8], [11, 13, 15, 17, 18]])
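
If the Python-level loop over `a[1]` is a concern, the index list can probably be built inside numpy instead. This is a sketch of one possible variant, not the answerer's code: it swaps the list comprehension for `np.flatnonzero`, and the name `trimmed` is made up for illustration.

```python
import numpy as np

a = np.array([[0, 1, 2, 0, 4, 5, 6, 7, 8, 9],
              [0, 11, 0, 13, 0, 15, 0, 17, 18, 0]])

# Positions of the zeros in the second row, computed without a Python loop.
indices = np.flatnonzero(a[1] == 0)

# np.delete returns a new array with those columns (axis=1) removed.
trimmed = np.delete(a, indices, axis=1)

print(indices)
print(trimmed)
```

This keeps the explicit "delete these columns" wording of `np.delete` while leaving the search for zeros to numpy.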

4 Comments

This is a highly non-numpythonic solution. It nullifies one of the main points of numpy: moving loops from Python to the C level.
That is if you care about performance. The main reason I use numpy is convenience. But I agree that there are better solutions.
Upvoting to balance Eelco's unnecessary downvote. This is useful even though it is not the best.
Upvoting because this is the only answer which actually does what the OP wanted: to remove data from their array. Yes, it's nice to play with the strides/indexes to obtain views with just the data you want, but sometimes you just need to delete a row.
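
On the question of views versus removed data raised in the last comment, a quick check may help. The sketch below assumes the array from the question and compares the boolean-mask result with the `np.delete` result; `masked` and `deleted` are illustrative names. As far as numpy's indexing rules go, boolean (advanced) indexing produces a copy rather than a view, so both approaches should hand back a new array.

```python
import numpy as np

a = np.array([[0, 1, 2, 0, 4, 5, 6, 7, 8, 9],
              [0, 11, 0, 13, 0, 15, 0, 17, 18, 0]])

# Result of boolean-mask indexing.
masked = a[:, a[1] != 0]

# Result of deleting the columns whose second-row value is zero.
deleted = np.delete(a, np.flatnonzero(a[1] == 0), axis=1)

# Neither result shares memory with the original array.
print(np.shares_memory(a, masked))
print(np.shares_memory(a, deleted))

# Both approaches give the same values.
print(np.array_equal(masked, deleted))
```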
2

Simple (fully numpy) solution:

import numpy as np

t = np.array([[ 0, 1, 2, 0, 4, 5, 6, 7, 8, 9], [ 0, 11, 0, 13, 0, 15, 0, 17, 18, 0]])
indices_to_keep = t[1].nonzero()[0]

print(t[:, indices_to_keep])
# [[ 1  0  5  7  8]
#  [11 13 15 17 18]]


2

Using np.where:

>>> t.T[np.where(t[1])].T
array([[ 1,  0,  5,  7,  8],
       [11, 13, 15, 17, 18]])

1 Comment

Boolean indexing, as in Eelco's answer, is typically faster than where + fancy indexing.
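
The speed claim in the comment above can be checked locally. This is a rough timing sketch under stated assumptions: the array size, the random test data and the function names `with_mask` and `with_where` are all invented here, and the timings will vary by machine and numpy version.

```python
import timeit

import numpy as np

# Hypothetical test data: two rows of small integers, with many zeros.
rng = np.random.default_rng(0)
t = rng.integers(0, 3, size=(2, 100_000))

def with_mask():
    # Boolean indexing on the columns.
    return t[:, t[1] != 0]

def with_where():
    # np.where + fancy indexing on the transposed array.
    return t.T[np.where(t[1])].T

# Both approaches should select the same columns.
assert np.array_equal(with_mask(), with_where())

print("boolean mask:", timeit.timeit(with_mask, number=100))
print("np.where    :", timeit.timeit(with_where, number=100))
```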
0

I got this working like this:

from numpy import array

data = array([[ 0, 1, 2, 0, 4, 5, 6, 7, 8, 9], [ 0, 11, 0, 13, 0, 15, 0, 17, 18, 0]])
res = array([(a, b,) for a, b in zip(data[0], data[1]) if b]).transpose()

got the result

In [23]: res
Out[23]: 
array([[ 1,  0,  5,  7,  8],
       [11, 13, 15, 17, 18]])

2 Comments

See eskaev. This is not numpythonic, and loses orders of magnitude of performance relative to the simple numpy solution that exists for this.
Upvoting to balance Eelco's unnecessary downvote. This is useful even though it is not the best.
