10

I would like to remove duplicates which follow each other, but not duplicates along the whole array. Also, I want to keep the ordering unchanged.

So if the input is [0 0 1 3 2 2 3 3], the output should be [0 1 3 2 3].

I found a way using itertools.groupby() but I am looking for a faster NumPy solution.
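
The groupby code itself is not shown in the question. A groupby-based version might look roughly like the sketch below, where the helper name `dedupe_groupby` is made up for illustration and a plain list stands in for the array:

```python
from itertools import groupby


def dedupe_groupby(values):
    # groupby() yields one (key, group) pair per run of equal consecutive
    # items, so keeping only the keys collapses each run to a single value.
    return [key for key, _group in groupby(values)]


print(dedupe_groupby([0, 0, 1, 3, 2, 2, 3, 3]))  # [0, 1, 3, 2, 3]
```

This runs a Python-level loop over every element, which is presumably why a vectorized NumPy version is wanted.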

1
  • Does anyone have a way to do this in a 2d array? Commented Sep 30, 2022 at 20:58

3 Answers

18
a[np.insert(np.diff(a).astype(bool), 0, True)]
Out[99]: array([0, 1, 3, 2, 3])

The general idea is to use np.diff to compute the difference between each pair of consecutive elements, and then keep only the elements where that difference is non-zero. Because the result of diff is one element shorter than the input, insert True at the beginning of the mask before indexing, so that the first element is always kept.

Explanation:

In [100]: a
Out[100]: array([0, 0, 1, 3, 2, 2, 3, 3])

In [101]: diff = np.diff(a).astype(bool)

In [102]: diff
Out[102]: array([False,  True,  True,  True, False,  True, False], dtype=bool)

In [103]: idx = np.insert(diff, 0, True)

In [104]: idx
Out[104]: array([ True, False,  True,  True,  True, False,  True, False], dtype=bool)

In [105]: a[idx]
Out[105]: array([0, 1, 3, 2, 3])
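
The same idea can be wrapped in a small helper. The following is only a sketch under some assumptions, not the code above: the name `drop_consecutive_duplicates` is hypothetical, the input is assumed to be 1-D, and neighbours are compared with `!=` instead of np.diff so that non-numeric dtypes such as strings should work as well.

```python
import numpy as np


def drop_consecutive_duplicates(a):
    """Collapse each run of equal adjacent values in a 1-D array to one value."""
    a = np.asarray(a)
    if a.size == 0:
        # Nothing to compare; return the empty array unchanged.
        return a
    # keep[i] is True when a[i] differs from a[i - 1].
    # The first element has no predecessor, so it is always kept.
    keep = np.empty(a.shape, dtype=bool)
    keep[0] = True
    keep[1:] = a[1:] != a[:-1]
    return a[keep]


print(drop_consecutive_duplicates([0, 0, 1, 3, 2, 2, 3, 3]))  # [0 1 3 2 3]
```

Building the mask directly avoids the extra copy that np.insert makes, although for small arrays the difference is unlikely to matter.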

3

With NumPy >= 1.16.0 you can use the prepend argument of np.diff:

a[np.diff(a, prepend=np.nan).astype(bool)]
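
Spelled out step by step, the one-liner behaves as in this sketch; the variable names `mask` and `result` are only for illustration, and the approach assumes a numeric array:

```python
import numpy as np

a = np.array([0, 0, 1, 3, 2, 2, 3, 3])

# Prepending NaN makes the first difference NaN, and NaN converts to True,
# so the first element is always kept. Note that the intermediate
# difference array is float64 even though `a` holds integers.
mask = np.diff(a, prepend=np.nan).astype(bool)
result = a[mask]

print(result)  # [0 1 3 2 3]
```

Because the mask is applied to the original `a`, the result keeps the integer dtype of the input.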


1

For a pure-Python approach that also works with NumPy arrays, use this:

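def modify(l):
    # Generator: yields each element that differs from the one before it.
    last = None
    for e in l:
        if e != last:
            yield e

        last = e

# modify() returns a generator, so wrap it in list() to get the values.
pure = list(modify([0, 0, 1, 3, 2, 2, 3, 3]))

import numpy
# numpy.array() on a bare generator gives a 0-d object array, so build a list first.
num = numpy.array(list(modify(numpy.array([0, 0, 1, 3, 2, 2, 3, 3]))))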

I don't know whether there are any NumPy functions which would speed this up.
