
I have an array such as a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9]); the real array is longer but follows the same pattern. I would like to shift this array, but I'm not sure whether I can use np.roll() here.

The array I would like to produce is [0, 0, 0, 2, 2, 3, 15, 15, 7].

As you can see, the first run of equal numbers in array a (in this case the three '2's) should be replaced with '0's. Everything else should then be shifted so that the '3's are replaced with '2's, the '15' is replaced with '3', and so on. Ideally I would like to do this without any for loop, as I need it to run quickly.
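For reference, the intended behaviour can be pinned down with a plain loop. This is only a slow sketch of the specification, not one of the posted answers; the function name `shift_runs_reference` and the `fill` parameter are hypothetical names chosen here.

```python
import numpy as np

def shift_runs_reference(a, fill=0):
    """Slow reference: every run of equal values takes the value of the previous run."""
    out = np.empty_like(a)
    prev = fill  # value of the run that precedes the current one
    for k, x in enumerate(a):
        if k > 0 and x != a[k - 1]:
            prev = a[k - 1]  # a new run starts here, so remember the old value
        out[k] = prev
    return out

a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])
print(shift_runs_reference(a))  # [ 0  0  0  2  2  3 15 15  7]
```

A loop like this is useful mainly as a check for the vectorised versions.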

I realise this operation may be a bit confusing so please ask questions.

  • Kudos for the nicely asked, interesting NumPy challenge. If only np.unique() had an unsorted option, this would be a one-liner. Commented Aug 13, 2021 at 12:56
  • Do you have recurring values in a? Commented Aug 13, 2021 at 13:47
  • Yep, I do @Ivan. Commented Aug 13, 2021 at 13:56

4 Answers

If you want to stick with NumPy, you can do this with np.unique, using its return_counts option to get the number of occurrences of each unique element and return_index to get the position where each one first appears.

Then, simply roll the values and construct a new array with np.repeat:

>>> s, i, c = np.unique(a, return_index=True, return_counts=True)
>>> s, i, c
(array([ 2,  3,  7,  9, 15]), array([0, 3, 6, 8, 5]), array([3, 2, 2, 1, 1]))

The three outputs are, respectively: the sorted unique elements, the index of each element's first occurrence, and the count of each element.

np.unique sorts the values, so we first need to restore the original order of both the values and the counts. We can then shift the values with np.roll:

>>> idx = np.argsort(i)
>>> v = np.roll(s[idx], 1)
>>> v[0] = 0
>>> v
array([ 0,  2,  3, 15,  7])

Alternatively, with np.append (though this requires a whole copy):

>>> v = np.append([0], s[idx][:-1])
>>> v
array([ 0,  2,  3, 15,  7])

Finally reassemble:

>>> np.repeat(v, c[idx])
array([ 0,  0,  0,  2,  2,  3, 15, 15,  7])
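The steps above can be collected into one function. This is a minimal sketch, assuming every value occupies a single contiguous run; `shift_unique` and its `fill` parameter are names made up here for illustration.

```python
import numpy as np

def shift_unique(a, fill=0):
    # Assumption: no value recurs after a different value has appeared.
    s, i, c = np.unique(a, return_index=True, return_counts=True)
    order = np.argsort(i)        # restore order of first appearance
    v = np.roll(s[order], 1)     # shift the values by one run
    v[0] = fill                  # the first run has no predecessor
    return np.repeat(v, c[order])

a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])
print(shift_unique(a))                                # [ 0  0  0  2  2  3 15 15  7]

# With a recurring value the counts of separate runs are merged,
# so the result is not the desired [0 0 0 2 2 3 3]:
print(shift_unique(np.array([2, 2, 2, 3, 3, 2, 2])))  # [0 0 0 0 0 2 2]
```

The second call shows why this variant is limited to arrays without recurring values.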

Another, more general, solution works even when there are recurring values in a. It relies on np.diff.

You can get the indices at which the value changes (one past the end of each run) with:

>>> i = np.diff(np.append(a, [0])).nonzero()[0] + 1
>>> i
array([3, 5, 6, 8, 9])

>>> idx = np.append([0], i)
>>> idx
array([0, 3, 5, 6, 8, 9])

The shifted values are then obtained by indexing a, with a 0 prepended, at idx:

>>> v = np.append([0], a)[idx]
>>> v
array([ 0,  2,  3, 15,  7,  9])

And the counts per element with:

>>> c = np.append(np.diff(i, prepend=0), [0])
>>> c
array([3, 2, 1, 2, 1, 0])

Finally, reassemble:

>>> np.repeat(v, c)
array([ 0,  0,  0,  2,  2,  3, 15, 15,  7])
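The run-length idea can also be written as a single function. This is a sketch rather than the exact steps shown: it finds run boundaries by comparing neighbours (`a[1:] != a[:-1]`) instead of appending a 0 sentinel, so an array that ends in 0 is handled as well. `shift_runs` and `fill` are hypothetical names.

```python
import numpy as np

def shift_runs(a, fill=0):
    a = np.asarray(a)
    if a.size == 0:
        return a.copy()
    # First index of every run of equal values.
    starts = np.concatenate(([0], np.flatnonzero(a[1:] != a[:-1]) + 1))
    # Length of every run.
    lengths = np.diff(np.append(starts, a.size))
    # Each run takes the value of the run before it; the first run takes `fill`.
    values = np.concatenate(([fill], a[starts[:-1]]))
    return np.repeat(values, lengths)

a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])
print(shift_runs(a))                                # [ 0  0  0  2  2  3 15 15  7]
print(shift_runs(np.array([2, 2, 2, 3, 3, 2, 2])))  # [0 0 0 2 2 3 3]
```

Because runs are identified by position rather than by value, recurring values need no special handling.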

10 Comments

Clever — I got frustrated with trying to unsort the uniques.
The resulting array seems to be incorrect though. It should be [0, 0, 0, 2, 2, 3, 15, 15, 7]. Also, what if some value of the array changes and then shows up again e.g. [2, 2, 2, 3, 3, 2, 2]?
Indeed this won't work with recurring values in a. I have fixed the error, the array of counts c also needs to be unsorted...
Thanks for the quick reply and well-explained solution!
@bb1, and OP - I have an alternative solution which will work with recurring values in a.

This does not use NumPy, but one approach that comes to mind is to use itertools.groupby to collect contiguous runs of equal elements. Then shift the run values by one (by prepending a 0) and use the run lengths to repeat them.

from itertools import chain, groupby

def shift(data):
    values = [(k, len(list(g))) for k,g in groupby(data)]
    keys = [0] + [i[0] for i in values]
    reps = [i[1] for i in values]
    return list(chain.from_iterable([[k]*rep for k, rep in zip(keys, reps)]))

For example

>>> a = np.array([2,2,2,3,3,15,7,7,9])
>>> shift(a)
[0, 0, 0, 2, 2, 3, 15, 15, 7]


You can try this code:

import numpy as np
a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])
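diff_a = np.diff(a)                      # nonzero wherever the value changes
idx = np.flatnonzero(diff_a)             # positions of those changes
val = diff_a[idx]                        # size of each jump
val = np.insert(val[:-1], 0, a[0])       # delay every jump by one run; the first jump becomes a[0]
diff_a[idx] = val
res = np.append([0], np.cumsum(diff_a))  # rebuild the shifted array from the jumps
print(res)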
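To check behaviour on other inputs, the snippet can be wrapped in a function. This is a sketch assuming a non-empty integer array; `shift_by_diffs` is a name chosen here, and np.concatenate is used in place of np.insert purely as a stylistic choice.

```python
import numpy as np

def shift_by_diffs(a):
    a = np.asarray(a)
    d = np.diff(a)             # nonzero wherever the value changes
    idx = np.flatnonzero(d)    # positions of the changes
    jumps = d[idx]             # size of each jump
    # Delay every jump by one run; the first jump becomes a[0] (from 0 up to a[0]).
    d[idx] = np.concatenate(([a[0]], jumps[:-1]))
    return np.concatenate(([0], np.cumsum(d)))

a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])
print(shift_by_diffs(a))  # [ 0  0  0  2  2  3 15 15  7]

b = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9, 7, 7, 8, 7])
print(shift_by_diffs(b))  # [ 0  0  0  2  2  3 15 15  7  9  9  7  8]
```

The second call uses an array with recurring values.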

3 Comments

Although this was the fastest solution, it does not work when elements are repeated. For example, it does not work for array a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9, 7, 7, 8, 7])
a is [ 2 2 2 3 3 15 7 7 9 7 7 8 7], and the result is [ 0 0 0 2 2 3 15 15 7 9 9 7 8]. Where is my mistake?
Ah yes, sorry. It seems that when I adapted your code to work in cupy, it stopped working for more complex examples like the one I gave above. Apparently cupy does not have cp.insert(), so I had to find a workaround for that.

You can try this:

import numpy as np
a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])

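# Difference between each element and its predecessor (the first element is compared with 0).
z = a - np.pad(a, (1,0))[:-1]
# The right-hand side is evaluated first, so m is already defined when z[m] is assigned (needs Python 3.8+).
z[m] = np.pad(z[(m := z!=0)], (1,0))[:-1]
print(z.cumsum())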

It gives:

[ 0  0  0  2  2  3 15 15  7]
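The one-liner with the walrus operator can be unrolled to show the intermediate arrays. This is the same computation written step by step, assuming a[0] is nonzero so that the first element counts as the start of a run; the names `jumps` and `result` are introduced here.

```python
import numpy as np

a = np.array([2, 2, 2, 3, 3, 15, 7, 7, 9])

z = a - np.pad(a, (1, 0))[:-1]      # a[k] - a[k-1], with a[-1] treated as 0
m = z != 0                          # True at the start of each run (given a[0] != 0)
jumps = z[m]                        # [  2   1  12  -8   2]
z[m] = np.pad(jumps, (1, 0))[:-1]   # delay each jump by one run: [ 0  2  1 12 -8]
result = z.cumsum()                 # rebuild values from the delayed jumps
print(result)                       # [ 0  0  0  2  2  3 15 15  7]
```

Written this way, the code does not depend on assignment-expression evaluation order.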

2 Comments

print(z) did not give array([ 0, 0, 0, 2, 2, 3, 15, 15, 7])
The result is z.cumsum(), so try print(z.cumsum()) instead.
