How to delete rows of numpy array by multiple row indices?

Question

I have two lists of indices (idx[0] and idx[1]), and I should delete the corresponding rows from numpy array y_test.

y_test

12  11  10
1   2   2
3   2   3
4   1   2
13  1   10

idx[0] = [0,2]
idx[1] = [1,3]

I tried to delete the rows as follows (using ~). But it didn't work:

result = y_test[(~idx[0]+~idx[1]+~idx[2])]

Expected result:

result =

13  1   10

What was ~[0,2] supposed to do? [0,2]+[1,3] joins the two lists, producing [0,2,1,3].
– hpaulj, Commented Feb 15, 2019 at 20:45
If idx is a numpy array, the bitwise negation operator does not do what you think it does.
– user3483203, Commented Feb 15, 2019 at 20:54

Jello · Accepted Answer · 2019-02-15 21:06:03Z

1

Instead of removing elements, just make a new array with the desired ones. This will keep any future indexing from getting jumbled up and maintain the old array.

import numpy as np
y_test = np.asarray([[12, 11, 10], [1, 2, 2], [3, 2, 3], [4, 1, 2], [13, 1, 10]])
idx = [[0, 2], [1, 3]]

# flatten list of lists
idx_flat = [i for j in idx for i in j]

# assign values that are NOT in your idx list to a new array
result = [row for num, row in enumerate(y_test) if num not in idx_flat]

# cast this however you want it, right now 'result' is a list of np.arrays
print(result)

[array([13,  1, 10])]

For an understanding of the flatten step using list comprehensions check this out
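The same "keep what is not listed" idea can also be sketched without a Python loop. This is only one possible variant, assuming idx is a list of integer lists as in the setup above; the name keep is introduced here for illustration.

```python
import numpy as np

y_test = np.asarray([[12, 11, 10], [1, 2, 2], [3, 2, 3], [4, 1, 2], [13, 1, 10]])
idx = [[0, 2], [1, 3]]

# Flatten the nested index lists into one 1-D integer array.
idx_flat = np.concatenate(idx)

# Row numbers that are NOT listed in idx_flat (returned sorted and unique).
keep = np.setdiff1d(np.arange(len(y_test)), idx_flat)

# Integer-array indexing builds a new 2-D array; y_test itself is unchanged.
result = y_test[keep]
print(result)  # a single remaining row: 13, 1, 10
```

Unlike the list comprehension, result here is already a 2-D array, so no cast is needed afterwards.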

answered Feb 15, 2019 at 21:06

Jello


Lukas Koestler · Accepted Answer · 2019-02-15 20:55:37Z

1

You can use numpy.delete which deletes the subarrays along the axis.

np.delete(y_test, idx, axis=0)

Make sure that idx.dtype is an integer type; if it is not, convert it with idx.astype(int).
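A minimal sketch of that check, assuming (purely for illustration) that the indices arrived as floats. Flattening with ravel() is not strictly required by the answer, but it makes explicit that a flat list of row numbers is passed to np.delete.

```python
import numpy as np

y_test = np.array([[12, 11, 10], [1, 2, 2], [3, 2, 3], [4, 1, 2], [13, 1, 10]])

# Assumption for this example: the indices were stored as floats.
idx = np.array([[0.0, 2.0], [1.0, 3.0]])

# np.delete expects integer indices, so convert if necessary.
if not np.issubdtype(idx.dtype, np.integer):
    idx = idx.astype(int)

# np.delete returns a new array; y_test is left untouched.
result = np.delete(y_test, idx.ravel(), axis=0)
print(result)  # a single remaining row: 13, 1, 10
```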

Your approach did not work because idx is not a boolean index array; it holds integer indices. The ~ operator is bitwise negation, so ~[0, 2] gives [-1, -3] (assuming both are numpy arrays).

I would definitely recommend reading up on the difference between index arrays and boolean index arrays. For boolean index arrays I would suggest using numpy.logical_not and numpy.logical_or.
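To make the distinction concrete, here is a small sketch. The names rows, mask0, mask1 and drop are introduced for illustration, and np.isin is just one way to turn an index array into a boolean one.

```python
import numpy as np

y_test = np.array([[12, 11, 10], [1, 2, 2], [3, 2, 3], [4, 1, 2], [13, 1, 10]])
idx = [np.array([0, 2]), np.array([1, 3])]

# On an integer index array, ~ flips bits: 0 -> -1 and 2 -> -3.
flipped = ~idx[0]

# Boolean index arrays hold one True/False flag per row instead.
rows = np.arange(len(y_test))
mask0 = np.isin(rows, idx[0])  # True at rows 0 and 2
mask1 = np.isin(rows, idx[1])  # True at rows 1 and 3

# Combine the masks, then negate to select the rows to keep.
drop = np.logical_or(mask0, mask1)
result = y_test[np.logical_not(drop)]
print(result)  # a single remaining row: 13, 1, 10
```

On boolean arrays, ~ and | give the same result as np.logical_not and np.logical_or; the trouble in the question comes only from applying ~ to integer indices.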

The + operator concatenates Python lists, but for numpy arrays it is element-wise addition.

edited Feb 15, 2019 at 20:55

answered Feb 15, 2019 at 20:45

Lukas Koestler


2 Comments

ScalaBoy Over a year ago

This solution gives me a weird result. In my real data set the shape of y_test is (60000,1) and the length of idx is 6000. Thus I expect the result of np.delete(y_test, idx, axis=0) to have the shape (54000,1), but it is 56846

Lukas Koestler Over a year ago

Without the actual data I can only guess, but maybe the entries in idx are not unique. You can try numpy.unique and compare the sizes.
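The effect of repeated indices can be reproduced on a small scale. The data below is hypothetical and only illustrates the guess above: np.delete removes each listed row once, however often its index appears.

```python
import numpy as np

# Hypothetical data: 10 rows, and 5 index entries of which only 3 are distinct.
y = np.arange(10).reshape(10, 1)
idx = np.array([1, 3, 3, 7, 1])

result = np.delete(y, idx, axis=0)

print(len(idx))             # 5 entries in idx
print(np.unique(idx).size)  # 3 distinct row numbers
print(result.shape)         # 10 - 3 = 7 rows remain, not 10 - 5 = 5
```

If len(idx) and np.unique(idx).size differ on the real data, duplicates would explain why fewer rows are removed than expected.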

iGian · Accepted Answer · 2019-02-15 21:36:25Z

0

Since you are using NumPy I'd suggest masking in this way.

Setup:

import numpy as np

y_test = np.array([[12,11,10],
                   [1,2,2],
                   [3,2,3],
                   [4,1,2],
                   [13,1,10]])

idx = np.array([[0,2], [1,3]])

Generate the mask: start with a mask of ones, then set it to zero at the indices in idx:

mask = np.ones(len(y_test), dtype=int).reshape(-1, 1)
mask[idx.flatten()] = 0

Finally apply the mask:

y_test[~np.all(y_test * mask == 0, axis=1)]
#=> [[13  1 10]]

y_test has not been modified.
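One caveat with the multiplication-based test: a row that should be kept but happens to contain only zeros looks the same as a masked row and is dropped too. A boolean mask indexed directly avoids this. The sketch below uses a hypothetical array whose last row is all zeros to show the difference; the name keep is introduced for illustration.

```python
import numpy as np

# Hypothetical data: the row we want to keep (row 4) is all zeros.
y = np.array([[12, 11, 10],
              [1, 2, 2],
              [3, 2, 3],
              [4, 1, 2],
              [0, 0, 0]])

idx = np.array([[0, 2], [1, 3]])

# One boolean flag per row: True means "keep this row".
keep = np.ones(len(y), dtype=bool)
keep[idx.ravel()] = False

# Boolean indexing returns a new array; y is not modified.
result = y[keep]
print(result)  # the all-zero row survives
```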

answered Feb 15, 2019 at 21:36

iGian


1 Comment

iGian Over a year ago

What do you mean?
