Python or NumPy function that returns the lengths of runs of consecutive repeated items in an array

Question

I'm looking for a function that, given "a", returns "b" as follows:

a = numpy.array([1, 1, 1, 1, 5, 5, 5, 5, 5, 6, 5, 2, 2, 2, 2])

Here 1 appears 4 times in a row, then 5 appears 5 times, 6 appears once, 5 appears once, and 2 appears 4 times.

The function should return an array like this:

b = numpy.array([4, 5, 1, 1, 4])

Note how 5 is treated: although 5 occurs 6 times in "a" in total, each consecutive run is counted separately.

This is quite specific. I have already written a function that does this, but I would like to know whether NumPy has a built-in function for it, for better performance.

Thanks in advance.

No, there is no built-in function. However, doing the consecutive count is easy enough. If you want to see a more general solution, research "run length encoding".
– Prune, Commented Oct 2, 2020 at 18:24
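
To make the "run length encoding" hint concrete, here is a minimal sketch of how it might look in NumPy. The function name run_length_encode is made up for illustration and is not a NumPy API; it assumes a 1-D input and returns the value of each run along with its length.

```python
import numpy as np

def run_length_encode(a):
    """Return (values, counts) for each run of equal consecutive items in a 1-D array."""
    a = np.asarray(a)
    if a.size == 0:
        return a[:0], np.array([], dtype=np.intp)
    # True wherever an element differs from its predecessor, i.e. a new run starts
    change = a[1:] != a[:-1]
    # Start index of every run: position 0 plus each position following a change
    starts = np.concatenate(([0], np.flatnonzero(change) + 1))
    # Run lengths are the gaps between consecutive starts, closed off by the array length
    counts = np.diff(np.append(starts, a.size))
    return a[starts], counts

a = np.array([1, 1, 1, 1, 5, 5, 5, 5, 5, 6, 5, 2, 2, 2, 2])
values, counts = run_length_encode(a)
print(values)  # [1 5 6 5 2]
print(counts)  # [4 5 1 1 4]
```

The counts array is the "b" asked for in the question; the values array is the extra information that makes this a full run-length encoding.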

Quang Hoang · Accepted Answer · 2020-10-02 18:28:45Z

1

This can be done by applying np.bincount to the cumulative sum of the nonzero positions of np.diff(a):

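# label every element after the first with its run index, then count each label
out = np.bincount((np.diff(a) != 0).cumsum())
out[0] += 1  # np.diff drops the first element, which belongs to the first run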

Output:

array([4, 5, 1, 1, 4])
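
For readers who want to see why the one-liner works, the following sketch splits it into its intermediate steps. The variable names changed and labels are only for illustration; the logic is the same as in the answer above.

```python
import numpy as np

a = np.array([1, 1, 1, 1, 5, 5, 5, 5, 5, 6, 5, 2, 2, 2, 2])

# Step 1: True at every position where the value changes from the previous element
changed = np.diff(a) != 0

# Step 2: the running total of changes labels elements 1..n-1 with their run index
labels = changed.cumsum()
print(labels)  # [0 0 0 1 1 1 1 1 2 3 4 4 4 4]

# Step 3: count how often each label occurs
out = np.bincount(labels)
print(out)  # [3 5 1 1 4]

# Step 4: np.diff dropped the first element, which always belongs to run 0
out[0] += 1
print(out)  # [4 5 1 1 4]
```

One caveat: with a one-element input, labels is empty and np.bincount returns an empty array, so out[0] += 1 would fail. Passing minlength=1 to np.bincount is one way to cover that case.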

answered Oct 2, 2020 at 18:28


mathfux · Accepted Answer · 2020-10-02 20:25:58Z

0

You can also use the prepend and append parameters of np.diff to add an artificial sentinel value at each end, so that the differences at both boundaries of the array are nonzero:

>>> np.diff(a,prepend=a[0]-1,append=a[-1]+1)
array([ 1,  0,  0,  0,  4,  0,  0,  0,  0,  1, -1, -3,  0,  0,  0,  1])

The nonzero positions of this array mark where each run starts (plus the end of the array), so applying np.nonzero and then np.diff gives the run lengths:

x = np.diff(a, prepend=a[0]-1, append=a[-1]+1)
np.diff(np.nonzero(x))

Output:

array([[4, 5, 1, 1, 4]], dtype=int32)

However, this is slower: about 3x slower for a small array a, and about 25% slower for a large array such as a = np.random.randint(3, size=10000000).
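
The result above is 2-D because np.nonzero returns a tuple of index arrays. As a variation (not the answer's exact method), the sketch below marks run boundaries by comparing neighbours and uses np.flatnonzero, which gives a 1-D result and needs no sentinel arithmetic, so it should also work for inputs such as string arrays where a[0]-1 is not defined. The name run_lengths is hypothetical, and no timing claims are made for it.

```python
import numpy as np

def run_lengths(a):
    """Lengths of runs of equal consecutive items, as a 1-D integer array."""
    a = np.asarray(a)
    if a.size == 0:
        return np.array([], dtype=np.intp)
    # Mark run boundaries by comparing neighbours instead of subtracting them,
    # so no sentinel values (a[0]-1, a[-1]+1) are needed
    boundary = np.concatenate(([True], a[1:] != a[:-1], [True]))
    # np.flatnonzero returns a 1-D index array, so np.diff yields a 1-D result
    return np.diff(np.flatnonzero(boundary))

a = np.array([1, 1, 1, 1, 5, 5, 5, 5, 5, 6, 5, 2, 2, 2, 2])
print(run_lengths(a))                         # [4 5 1 1 4]
print(run_lengths(np.array(list("aabccc"))))  # [2 1 3]
```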

answered Oct 2, 2020 at 20:18
