I have a 2d numpy array that contains some numbers like:

data = 
[[1.1, 1.2, 1.3, 1.4],
[2.1, 2.2, 2.3, -1.0],
[-1.0, 3.2, 3.3, -1.0],
[-1.0, -1.0, -1.0, -1.0]]

I want to remove every row that contains the value -1.0 two or more times, so that I'm left with:

data = 
[[1.1, 1.2, 1.3, 1.4],
[2.1, 2.2, 2.3, -1.0]]

I found this question, which looks very close to what I'm trying to do, but I can't quite figure out how to adapt it to my use case.

  • You can break this down into a series of steps. First, determine whether a row contains two or more -1 values. Then create an array of True and False values indicating whether each row satisfies the condition. Finally, index the original array with the inverse of that boolean array to remove those rows. To help you get started, here's how you can do the first two steps: (data == -1).sum(axis=1) >= 2
    Commented Aug 23, 2022 at 16:08

2 Answers

You can do this in one line with a boolean mask that keeps the rows containing fewer than two -1 values:

new_data = data[(data == -1).sum(axis=1) < 2]

Result:

>>> new_data
array([[ 1.1,  1.2,  1.3,  1.4],
       [ 2.1,  2.2,  2.3, -1. ]])
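
The one-liner above can be unpacked into its intermediate steps. The following is a minimal sketch that assumes data is built with np.array from the values in the question; the names counts and keep are only illustrative and do not appear in the original answer.

```python
import numpy as np

# The example array from the question (assumed to be a float ndarray).
data = np.array([
    [1.1, 1.2, 1.3, 1.4],
    [2.1, 2.2, 2.3, -1.0],
    [-1.0, 3.2, 3.3, -1.0],
    [-1.0, -1.0, -1.0, -1.0],
])

# Step 1: count how many entries in each row equal -1.
# (data == -1) is a boolean array; summing along axis=1 counts the True values per row.
counts = (data == -1).sum(axis=1)   # array([0, 1, 2, 4])

# Step 2: build a boolean mask marking the rows to keep (fewer than two -1 values).
keep = counts < 2                   # array([ True,  True, False, False])

# Step 3: boolean indexing returns a new array with only the rows where keep is True.
new_data = data[keep]
print(new_data)
```

The exact comparison with == is safe here because -1.0 is represented exactly as a float; if the sentinel values came out of a computation, np.isclose(data, -1) might be a more forgiving choice.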

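Alternatively, wrap the check in a function:

import numpy as np

def remove_rows(data, threshold):
    # Keep only the rows with fewer than `threshold` entries equal to -1.
    mask = np.array([np.sum(row == -1) < threshold for row in data])
    return data[mask]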

This function returns a new array containing only the rows in which -1 appears fewer than threshold times.

You need to pass in a NumPy array for it to work, because a plain nested list does not support boolean-mask indexing.
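
For larger arrays, the per-row Python loop can be replaced by a single vectorized count. The sketch below is one possible variant, not part of the original answer: the function name remove_rows_vectorized and the sentinel parameter are hypothetical, and np.asarray is added so that nested lists are accepted as well.

```python
import numpy as np

def remove_rows_vectorized(data, threshold, sentinel=-1.0):
    """Keep the rows that contain fewer than `threshold` copies of `sentinel`.

    Hypothetical helper: the name and the `sentinel` parameter are
    illustrative and not taken from the original answer.
    """
    data = np.asarray(data)  # also accepts nested lists
    # Count the sentinel values in every row in one vectorized call.
    counts = np.count_nonzero(data == sentinel, axis=1)
    return data[counts < threshold]

data = [
    [1.1, 1.2, 1.3, 1.4],
    [2.1, 2.2, 2.3, -1.0],
    [-1.0, 3.2, 3.3, -1.0],
    [-1.0, -1.0, -1.0, -1.0],
]

print(remove_rows_vectorized(data, 2).shape)  # (2, 4): rows with zero or one -1 remain
print(remove_rows_vectorized(data, 1).shape)  # (1, 4): only the row without any -1 remains
```

Both versions return a new array and leave the input unchanged; the vectorized form simply avoids building the mask one row at a time.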
