0

I have two lists of indices (idx[0] and idx[1]), and I need to delete the corresponding rows from the NumPy array y_test.

y_test

12  11  10
1   2   2
3   2   3
4   1   2
13  1   10

idx[0] = [0,2]
idx[1] = [1,3]

I tried to delete the rows as follows (using ~). But it didn't work:

result = y_test[(~idx[0]+~idx[1]+~idx[2])]

Expected result:

result =

13  1   10
3
  • not +, but & for boolean AND operation Commented Feb 15, 2019 at 20:30
  • what was ~[0,2] supposed to do? [0,2]+[1,3] works to join 2 lists producing [0,2,1,3]. Commented Feb 15, 2019 at 20:45
  • If idx is a numpy array, using the bitwise negation operator on it is not doing what you think it is. Commented Feb 15, 2019 at 20:54

3 Answers

1

Instead of removing elements, just build a new array from the rows you want to keep. The original array stays intact, so any indices you already hold into it remain valid.

import numpy as np
y_test = np.asarray([[12, 11, 10], [1, 2, 2], [3, 2, 3], [4, 1, 2], [13, 1, 10]])
idx = [[0, 2], [1, 3]]

# flatten list of lists
idx_flat = [i for j in idx for i in j]

# assign values that are NOT in your idx list to a new array
result = [row for num, row in enumerate(y_test) if num not in idx_flat]

# cast this however you want it, right now 'result' is a list of np.arrays
print(result)

[array([13,  1, 10])]


The flatten step is a nested list comprehension: it loops over each sublist j in idx, then over each index i in j.
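If you would rather end up with a single 2-D array than a list of rows, the same selection can be sketched with a boolean mask. This is only a sketch under the setup from the question; the names keep and result_arr are made up here for illustration.

```python
import numpy as np

y_test = np.asarray([[12, 11, 10], [1, 2, 2], [3, 2, 3], [4, 1, 2], [13, 1, 10]])
idx = [[0, 2], [1, 3]]

# flatten list of lists
idx_flat = [i for j in idx for i in j]

# keep[k] is True when row k is NOT listed in idx_flat
# (keep and result_arr are illustrative names)
keep = np.isin(np.arange(len(y_test)), idx_flat, invert=True)

# boolean indexing builds a new array; y_test itself is untouched
result_arr = y_test[keep]
print(result_arr)
# [[13  1 10]]
```

Because the mask has one entry per row, repeated indices in idx_flat are harmless here.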

1

You can use numpy.delete, which returns a new array with the given sub-arrays along the axis removed.

np.delete(y_test, idx, axis=0)

Make sure that idx has an integer dtype; if it is a NumPy array of another type, convert it with ndarray.astype(int).
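For the data in the question, a complete call might look like the sketch below. It assumes idx is the nested list from the question; rows_to_drop is a made-up name, and flattening the indices first is my own precaution rather than something numpy.delete is documented to require.

```python
import numpy as np

y_test = np.array([[12, 11, 10], [1, 2, 2], [3, 2, 3], [4, 1, 2], [13, 1, 10]])
idx = [[0, 2], [1, 3]]

# join the sublists into one 1-D integer array (rows_to_drop is an illustrative name)
rows_to_drop = np.concatenate(idx).astype(int)

# np.delete returns a new array; y_test keeps all five rows
result = np.delete(y_test, rows_to_drop, axis=0)
print(result)
# [[13  1 10]]
```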

Your approach did not work because idx holds integer indices, not a boolean mask. On a NumPy integer array, ~ is bitwise negation, so ~[0, 2] gives [-1, -3]; on a plain Python list it raises a TypeError.

I would definitely recommend reading up on the difference between index arrays and boolean index arrays. For boolean index arrays I would suggest using numpy.logical_not and numpy.logical_or.

+ concatenates Python lists, but on NumPy arrays it is elementwise addition.
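To make the difference concrete, here is a small sketch that contrasts the operators on index arrays with the boolean-mask route. The variable names (idx0, idx1, mask0, mask1, drop, keep) and the mask length of 5 are assumptions taken from the question's 5-row example.

```python
import numpy as np

idx0 = np.array([0, 2])
idx1 = np.array([1, 3])

# on integer arrays ~ flips the bits, so ~x equals -x - 1
print(~idx0)             # [-1 -3]

# + adds arrays elementwise but concatenates lists
print(idx0 + idx1)       # [1 5]
print([0, 2] + [1, 3])   # [0, 2, 1, 3]

# boolean masks: one True/False entry per row of a 5-row array
mask0 = np.zeros(5, dtype=bool)
mask0[idx0] = True
mask1 = np.zeros(5, dtype=bool)
mask1[idx1] = True

# rows listed in either index array are dropped; the rest are kept
drop = np.logical_or(mask0, mask1)
keep = np.logical_not(drop)
print(keep)
```

With masks like these, y_test[keep] selects only the rows that neither index list mentions.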

2 Comments

This solution gives me weird result. In my real data set the shape of y_test is (60000,1). Then the length of idx is 6000. Thus I expect the result of np.delete(y_test, idx, axis=0) to have the shape (54000,1), but it is 56846
Without the actual data I can only guess, but maybe the entries in idx are not unique. You can try numpy.unique and compare the sizes.
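A quick way to run that check, sketched on made-up data (y and the repeated indices below are invented for illustration, not taken from the real data set):

```python
import numpy as np

y = np.arange(10).reshape(10, 1)
idx = np.array([1, 3, 3, 7, 1])   # made-up indices containing repeats

result = np.delete(y, idx, axis=0)
n_unique = np.unique(idx).size

# 5 indices were passed, but only 3 distinct rows can be removed
print(len(idx), n_unique)   # 5 3
print(result.shape)         # (7, 1)
```

If len(idx) and np.unique(idx).size differ on the real data, the number of rows removed will match the smaller, unique count.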
0

Since you are using NumPy I'd suggest masking in this way.


Setup:

import numpy as np

y_test = np.array([[12,11,10],
                   [1,2,2],
                   [3,2,3],
                   [4,1,2],
                   [13,1,10]])

idx = np.array([[0,2], [1,3]])


Generate the mask:

Create a column of ones with one entry per row, then set the entries at the indices in idx to zero:

mask = np.ones(len(y_test), dtype=int).reshape(-1, 1)
mask[idx.flatten()] = 0


Finally apply the mask:

y_test[~np.all(y_test * mask == 0, axis=1)]
#=> [[13  1 10]]

y_test has not been modified.
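One caveat with multiplying by the mask: a row that is all zeros to begin with would be dropped too, even if it is not listed in idx. The sketch below uses a made-up variant of y_test with such a row and compares the product approach with indexing by a boolean mask directly; keep, result and via_product are illustrative names.

```python
import numpy as np

y_test = np.array([[12, 11, 10],
                   [0, 0, 0],      # made-up all-zero row that should survive
                   [3, 2, 3],
                   [4, 1, 2],
                   [13, 1, 10]])
idx = np.array([0, 2, 3])

# boolean mask: True means "keep this row"
keep = np.ones(len(y_test), dtype=bool)
keep[idx.flatten()] = False
result = y_test[keep]
print(result)        # rows 1 and 4 remain

# product approach from the answer, for comparison
mask = np.ones(len(y_test), dtype=int).reshape(-1, 1)
mask[idx.flatten()] = 0
via_product = y_test[~np.all(y_test * mask == 0, axis=1)]
print(via_product)   # the all-zero row is lost as well
```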

1 Comment

What do you mean?
