Remove duplicate index in 2D array

Question

I have this 2D numpy array here:

arr = np.array([[1,2],
                [2,2],
                [3,2],
                [4,2],
                [5,3]])

I would like to delete all duplicates corresponding to the previous index at index 1 and get an output like so:

np.array([[1,2],
          [5,3]])

However, when I try my code it errors. Here is my code:

for x in range(0, len(arr)):
    if arr[x][1] == arr[x-1][1]:
        arr = np.delete(arr, x, 0)

>>> IndexError: index 3 is out of bounds for axis 0 with size 2

Mark · Accepted Answer · 2022-07-03 00:02:34Z


Rather than trying to delete from the array, you can use np.unique to find the index of the first occurrence of each unique value in the second column, then use those indices to pull out the matching rows:

import numpy as np   

arr = np.array([[1,2],
                [2,2],
                [3,2],
                [4,2],
                [5,3]])

# u holds the unique values of column 1; i holds the index of each value's first occurrence
u, i = np.unique(arr[:,1], return_index=True)

arr[i]
# array([[1, 2],
#        [5, 3]])
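One caveat: np.unique returns its unique values sorted, so the indices come back in the order of the sorted values, not in row order. In the example above the two happen to coincide. The sketch below uses a made-up array (not the one from the question) in which they differ, and sorts the indices to restore the original row order. The names first_idx, by_value and by_position are just for illustration.

```python
import numpy as np

# Hypothetical data: the larger key (3) appears before the smaller key (2).
arr = np.array([[1, 3],
                [2, 3],
                [3, 2],
                [4, 2],
                [5, 3]])

# Indices of first occurrences, ordered by the sorted unique values (2, then 3).
_, first_idx = np.unique(arr[:, 1], return_index=True)

by_value = arr[first_idx]              # rows ordered by key value
by_position = arr[np.sort(first_idx)]  # rows in their original order

print(by_value)
print(by_position)
```

If the original row order matters, indexing with np.sort(first_idx) is probably what you want.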


2 Comments

mozway Over a year ago

Note that this doesn't follow the "delete all duplicates corresponding to the previous index" rule: if there were another group of 1s after the 2s, it would be deleted completely (which may be wanted, or not...)

Mark Over a year ago

That's a fair point @mozway, I certainly didn't read it that way, but on re-reading, it's a reasonable interpretation.
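For the other reading discussed in these comments (drop a row only when its second-column value equals that of the row directly before it), a minimal sketch is to compare the column with a shifted copy of itself and index with the resulting boolean mask. The array below is made up so that a value reappears later, and the names col, keep and result are illustrative only.

```python
import numpy as np

# Hypothetical data: the value 2 comes back after a run of 1s.
arr = np.array([[1, 2],
                [2, 2],
                [3, 1],
                [4, 1],
                [5, 2]])

col = arr[:, 1]

# Always keep the first row; keep any later row whose value
# differs from the value in the row just before it.
keep = np.ones(len(arr), dtype=bool)
keep[1:] = col[1:] != col[:-1]

result = arr[keep]
print(result)
# [[1 2]
#  [3 1]
#  [5 2]]
```

On the array from the question this gives the same [[1, 2], [5, 3]] as the np.unique approach. The two only differ when a value shows up again after a different one, as in the last row here.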
