26

I have a NumPy array and a list of indices whose values I would like to increment by one. The list may contain repeated indices, and each index should be incremented once per occurrence. Without repeats, the command is simple:

a=np.zeros(6).astype('int')
b=[3,2,5]
a[b]+=1

With repeats, I've come up with the following method.

b=[3,2,5,2]                     # indices to increment by one each replicate
bbins=np.bincount(b)
b.sort()                        # sort b because bincount is sorted
incr=bbins[np.nonzero(bbins)]   # create increment array
bu=np.unique(b)                 # sorted, unique indices (len(bu)=len(incr))
a[bu]+=incr

Is this the best way? Is there a risk in assuming that np.bincount and np.unique produce results in the same sorted order? Am I missing some simple NumPy operation that solves this?

    Note that numpy.zeros(6).astype('int') is better written as numpy.zeros(6, int). Commented Jan 5, 2010 at 8:39

3 Answers

43

In numpy >= 1.8, you can also use the at method of the addition 'universal function' ('ufunc'). As the docs note:

For addition ufunc, this method is equivalent to a[indices] += b, except that results are accumulated for elements that are indexed more than once.

So taking your example:

a = np.zeros(6).astype('int')
b = [3, 2, 5, 2]

…then calling…

np.add.at(a, b, 1)

…will leave a as…

array([0, 0, 2, 1, 0, 1])
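For contrast, here is a minimal sketch of how plain fancy-index assignment and np.add.at differ when an index repeats. The names `buffered` and `accumulated` are illustrative only and do not come from the answer.

```python
import numpy as np

b = [3, 2, 5, 2]  # index 2 appears twice

# Plain fancy-index increment: the repeated index 2 is only counted once.
buffered = np.zeros(6, dtype=int)
buffered[b] += 1

# Unbuffered ufunc.at: every occurrence of an index is accumulated.
accumulated = np.zeros(6, dtype=int)
np.add.at(accumulated, b, 1)

print(buffered.tolist())     # [0, 0, 1, 1, 0, 1]
print(accumulated.tolist())  # [0, 0, 2, 1, 0, 1]
```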

3 Comments

This solution is the most elegant one AFAIK!
I'm trying to do this with a matrix, but I'm getting an error: arr = np.array([[1,3,5],[7,9,11]]); lIndexes = [[0,1],[1,0],[1,2]]; np.add.at(arr, lIndexes, 1). Any ideas?
I'd suggest posting this new question about multi-dimensional indexing as a separate question.
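For the two-dimensional case raised in the comment above, one possible approach is to pass np.add.at a tuple with one index array per axis. This sketch assumes the nested list is meant as (row, column) pairs, which the comment does not state; `pairs`, `rows` and `cols` are names introduced here for illustration.

```python
import numpy as np

arr = np.array([[1, 3, 5], [7, 9, 11]])
pairs = [[0, 1], [1, 0], [1, 2]]  # assumed to be (row, col) pairs

# Transpose the pairs into one index array per axis:
# rows = [0, 1, 1], cols = [1, 0, 2].
rows, cols = np.array(pairs).T

# A tuple of index arrays addresses individual elements; repeats accumulate.
np.add.at(arr, (rows, cols), 1)

print(arr.tolist())  # [[1, 4, 5], [8, 9, 12]]
```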
6

After you do

bbins=np.bincount(b)

why not do:

a[:len(bbins)] += bbins

(Edited for further simplification.)
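As a self-contained sketch of the same idea, np.bincount also accepts a minlength argument, which pads the counts with zeros so that they can be added to the whole array without slicing. This assumes every index in b is non-negative and smaller than len(a).

```python
import numpy as np

a = np.zeros(6, dtype=int)
b = [3, 2, 5, 2]

# minlength pads the counts with zeros up to len(a), so shapes match.
# Assumes all indices in b are non-negative and smaller than len(a).
a += np.bincount(b, minlength=len(a))

print(a.tolist())  # [0, 0, 2, 1, 0, 1]
```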

4 Comments

Would this not be slower, when b contains just a few large bin numbers?
Yes, it will be slower than a simple Python loop in that case, but still faster than OP's code. I did a quick timing test with b = [99999, 99997, 99999], and a = np.zeros(1000, 'int'). Timings are: OP: 2.5 ms, mine: 495 us, simple loop: 84 us.
This works well. A simple loop has generally been slower in my program. Thanks.
Is there a similar way to accomplish this in a multi-dimensional case?
1

If the indices in b span only a small subrange of a, one can refine Alok's answer like this:

import numpy as np
a = np.zeros( 100000, int )
b = np.array( [99999, 99997, 99999] )

blo, bhi = b.min(), b.max()
bbins = np.bincount( b - blo )
a[blo:bhi+1] += bbins

print(a[blo:bhi+1])  # [1 0 2]
