4

I want to modify an empty bitmap by given indicators (x and y axis). For every coordinate given by the indicators the value should be raised by one.

So far so good everything seems to work. But if I have some similar indicators in my array of indicators it will only raise the value once.

>>> img
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

>>> inds
array([[0, 0],
       [3, 4],
       [3, 4]])

Operation:

>>> img[inds[:,1], inds[:,0]] += 1

Result:

>>> img
    array([[1, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 1, 0]])

Expected result:

>>> img
    array([[1, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 2, 0]])

Does someone have an idea how to solve this? Preferably a fast approach without the use of loops.

1

3 Answers 3

6

This is one way. Counting algorithm courtesy of @AlexRiley.

For performance implications of relative sizes of img and inds, see @PaulPanzer's answer.

# count occurrences of each row and return array
counts = (inds[:, None] == inds).all(axis=2).sum(axis=1)

# apply indices and counts
img[inds[:,1], inds[:,0]] += counts

print(img)

array([[1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 2, 0]])
Sign up to request clarification or add additional context in comments.

Comments

5

You could use numpy.add.at with a bit of manipulation to get the indices ready.

np.add.at(img, tuple(inds[:, [1, 0]].T), 1)

If you have larger inds arrays, this approach should remain fast... (though Paul Panzer's solution is faster)

Comments

4

Two remarks on the other two answers:

1) @jpp's can be improved by using np.unique with the axis and return_counts keywords.

2) If we translate to flat indexing we can use np.bincount which often (but not always, see last test case in benchmarks) is faster than np.add.at.

Thanks @miradulo for initial version of benchmarks.

import numpy as np

def jpp(img, inds):
    counts = (inds[:, None] == inds).all(axis=2).sum(axis=1)
    img[inds[:,1], inds[:,0]] += counts

def jpp_pp(img, inds):
    unq, cnts = np.unique(inds, axis=0, return_counts=True)
    img[unq[:,1], unq[:,0]] += cnts

def miradulo(img, inds):
    np.add.at(img, tuple(inds[:, [1, 0]].T), 1)

def pp(img, inds):
    imgf = img.ravel()
    indsf = np.ravel_multi_index(inds.T[::-1], img.shape[::-1])
    imgf += np.bincount(indsf, None, img.size)

inds = np.random.randint(0, 5, (3, 2))
big_inds = np.random.randint(0, 5, (10000, 2))
sml_inds = np.random.randint(0, 1000, (5, 2))
from timeit import timeit


for f in jpp, jpp_pp, miradulo, pp:
    print(f.__name__)
    for i, n, a in [(inds, 1000, 5), (big_inds, 10, 5), (sml_inds, 10, 1000)]:
        img = np.zeros((a, a), int)
        print(timeit("f(img, i)", globals=dict(img=img, i=i, f=f), number=n) * 1000 / n, 'ms')

Output:

jpp
0.011815106990979984 ms
2623.5026352020213 ms
0.04642329877242446 ms
jpp_pp
0.041291153989732265 ms
5.418520100647584 ms
0.05826510023325682 ms
miradulo
0.007099648006260395 ms
0.7788308983435854 ms
0.009103797492571175 ms
pp
0.0035401539935264736 ms
0.06540440081153065 ms
3.486583800986409 ms

2 Comments

Ahh this is nice, I'll get rid of my benchmark in favor of yours.
@jpp There is one scenario where bincount should not be the best which is large img, small inds. I'll try and add that to the benchmarks.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.