Remove rows of a numpy array based on a specific condition

Question

I have an array with four rows, A = array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]]). Each row contains 4 numbers. How do I remove row #3 and row #4? In row #3 the value 1 appears more than once, and in row #4 the value 2 appears more than once.

Is there a faster way to do this for an arbitrary number of rows and columns? The main aim is to remove those rows in which a non-negative number appears more than once.

Sam Maule · Accepted Answer · 2019-12-20 15:40:29Z

2

You can use something like this: first build a dictionary of the occurrences of each value in each row using np.unique, then keep only the rows in which no positive number appears more than once.

import numpy as np

A = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

new_array = []

# loop through each row of A
for array in A:
    # Get a dictionary of the counts of each value in the row
    unique, counts = np.unique(array, return_counts=True)
    counts = dict(zip(unique, counts))
    # Find the number of occurrences of each positive number
    positive_occurences = [value for key, value in counts.items() if key > 0]
    # Skip the row if any positive number appears more than once, otherwise keep it
    if any(y > 1 for y in positive_occurences):
        continue
    else:
        new_array.append(array)

new_array = np.array(new_array)

This returns:

array([[-1, -1, -1, -1],
       [-1, -1,  1,  2]])
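To make the counting step concrete, here is a small sketch of what np.unique with return_counts=True produces for the last row of A. The names row, counts_by_value and repeated_positive are only illustrative and are not part of the answer.

```python
import numpy as np

# Last row of A from the question
row = np.array([2, 1, -1, 2])

# np.unique returns the sorted distinct values and, with
# return_counts=True, how often each of them occurs
unique, counts = np.unique(row, return_counts=True)
counts_by_value = {int(k): int(v) for k, v in zip(unique, counts)}

# The row is dropped when any positive value has a count above 1
repeated_positive = any(v > 1 for k, v in counts_by_value.items() if k > 0)
```

Here the value 2 has a count of 2, so the row is flagged. Note that the test key > 0 ignores zeros; if zeros should also count as duplicates, as the "non-negative" wording in the question suggests, key >= 0 would be the condition to use.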



2 Comments

Lalu Over a year ago

Is there a way to get the indices of the rows that are dropped?

Lalu Over a year ago

I modified it a bit. Please see my answer and let me know whether what I did is OK.

9mat · Accepted Answer · 2019-12-20 15:53:53Z

2

My fully-vectorized approach:

sort each row
detect duplicates by comparing the sorted array with itself shifted left by one column
mark the rows that contain a positive duplicate
drop the marked rows

import numpy as np
a = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

# sort each row
b = np.sort(a)

# mark positive duplicates
drop = np.any((b[:,1:]>0) & (b[:,1:] == b[:,:-1]), axis=1)

# drop
aa = a[~drop, :]

Output:
array([[-1, -1, -1, -1],
       [-1, -1,  1,  2]])
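One of the comments above asks for the indices of the rows that are left out. As a rough sketch, the boolean mask from this approach can be turned into indices with np.flatnonzero; the names dropped_idx and kept_idx are made up for this example.

```python
import numpy as np

a = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

# Same mask as above: sort each row, then look for equal
# neighbours that are positive
b = np.sort(a, axis=1)
drop = np.any((b[:, 1:] > 0) & (b[:, 1:] == b[:, :-1]), axis=1)

# Row indices where the mask is True (dropped) or False (kept)
dropped_idx = np.flatnonzero(drop)
kept_idx = np.flatnonzero(~drop)

aa = a[kept_idx]
```

For the example array this marks rows 2 and 3 (zero-based) as dropped and keeps rows 0 and 1. As in the code above, the > 0 test does not treat repeated zeros as duplicates; >= 0 would.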



Lalu · Accepted Answer · 2020-01-07 20:31:35Z

I also modified it to store the indices of the dropped rows:

import numpy as np

A = np.array([[-1, -1, -1, -1], [-1, -1, 1, 2], [-1, -1, 1, 1], [2, 1, -1, 2]])

new_array = []
indiceStore = []  # indices of the rows that are dropped

# loop through each row, keeping track of its index
for i, array in enumerate(A):
    # Get a dictionary of the counts of each value in the row
    unique, counts = np.unique(array, return_counts=True)
    counts = dict(zip(unique, counts))
    # Find the number of occurrences of each positive number
    positive_occurences = [value for key, value in counts.items() if key > 0]
    # Record the index and skip the row if any positive number appears more than once
    if any(y > 1 for y in positive_occurences):
        # int(array) raises a TypeError for a whole row, so store the row index instead
        indiceStore.append(i)
        continue
    else:
        new_array.append(array)

new_array = np.array(new_array)
indiceStore = np.array(indiceStore)

Let me know if this is right.
