This question is similar to this one.
I have a 2D boolean array "belong" and a 2D float array "angles". I want to sum, along each row, the angles whose corresponding entry in belong is True, and do that with numpy (i.e. without Python loops). I don't need to store the selected rows, which would have different lengths and, as explained in the linked question, would require a list.
What I attempted is np.sum(angles[belong], axis=1), but angles[belong] returns a 1D result, so I can't reduce it along the rows as I want. I have also tried np.sum(angles*belong, axis=1), and that works. But I wonder whether I could improve the timing by accessing only the indices where belong is True. belong is True about 30% of the time, and angles here stands in for a longer formula that involves angles.
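To make the two attempts concrete, here is a minimal sketch on a tiny made-up example (the 2x3 values below are illustrative, not my real data). It shows that boolean-mask indexing flattens the selection, so there is no axis 1 left to reduce over, while multiplying by the mask keeps the 2D shape.

```python
import numpy as np

# Tiny illustrative arrays (hypothetical values, not the real data)
angles = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
belong = np.array([[True, False, True],
                   [False, False, True]])

# Boolean-mask indexing returns the selected elements as a flat 1D array,
# so the information about which row each element came from is lost.
flat = angles[belong]
print(flat.shape)  # (3,)

# Reducing the flat result along axis=1 fails because that axis does not exist.
try:
    np.sum(flat, axis=1)
    axis_error = None
except (ValueError, IndexError) as exc:  # AxisError derives from both
    axis_error = exc
print(type(axis_error).__name__)

# Multiplying by the mask zeroes out the unwanted entries and keeps the shape,
# so the row-wise reduction works (at the cost of touching every element).
row_sums = np.sum(angles * belong, axis=1)
print(row_sums)  # [4. 6.]
```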
UPDATE
I like the solution with einsum; however, in my actual computation the speed-up is tiny. I used angles in the question to simplify; in practice it is a formula that uses angles. I suspect that this formula is evaluated for all the angles (regardless of belong) and only then passed to einsum, which performs the reduction.
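One way to check that suspicion is to time the reduction on a precomputed array separately from the formula itself. The sketch below uses random stand-in data of the same shape as my arrays; the seed, the value ranges and the values chosen for THRES_THETA and max_line_length are assumptions made only for illustration. Since Python evaluates the arguments of a call before the call happens, the whole formula array is built before einsum ever sees it.

```python
import timeit
import numpy as np

# Random stand-in data with the same shape as the real arrays (assumed values)
rng = np.random.default_rng(0)
shape = (1653, 58)
angle = rng.uniform(0.0, 1.0, size=shape)
lines_lengths_vstacked = rng.uniform(0.0, 100.0, size=shape)
belong = rng.random(shape) < 0.38   # roughly the same fraction of True entries
THRES_THETA = 1.0                   # assumed value
max_line_length = 100.0             # assumed value

def formula():
    # Evaluated for every element, regardless of belong
    return (0.3 * (1 - angle / THRES_THETA)
            + 0.7 * (lines_lengths_vstacked / max_line_length))

weights = formula()  # precomputed once, to isolate the cost of the reduction

def reduce_only():
    return np.einsum('ij,ij->i', belong, weights)

def formula_then_reduce():
    # formula() runs in full before einsum is called
    return np.einsum('ij,ij->i', belong, formula())

for name, fn in [('formula only', formula),
                 ('einsum only', reduce_only),
                 ('formula + einsum', formula_then_reduce)]:
    best = min(timeit.repeat(fn, repeat=3, number=100))
    print(f'{name}: {best:.4f} s per 100 calls')
```

If "formula only" accounts for most of "formula + einsum", then the choice between np.sum and einsum can only affect the small remaining part.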
This is what I've done:
THRES_THETA and max_line_length are floats. belong, angle and lines_lengths_vstacked have shape (1653, 58) and np.count_nonzero(belong)/belong.size -> 0.376473287856979
import timeit
import numpy as np

# Full formula, base method: multiply by the mask, then np.sum along the rows
l2 = (lambda angle=angle, belong=belong, THRES_THETA=THRES_THETA, lines_lengths_vstacked=lines_lengths_vstacked, max_line_length=max_line_length:
      np.sum(belong*(0.3 * (1-(angle/THRES_THETA)) + 0.7 * (lines_lengths_vstacked/max_line_length)), axis=1))
t2 = timeit.Timer(l2)
print(t2.repeat(3, 100))

# Full formula, einsum
l1 = (lambda angle=angle, belong=belong, THRES_THETA=THRES_THETA, lines_lengths_vstacked=lines_lengths_vstacked, max_line_length=max_line_length:
      np.einsum('ij,ij->i', belong, 0.3 * (1-(angle/THRES_THETA)) + 0.7 * (lines_lengths_vstacked/max_line_length)))
t1 = timeit.Timer(l1)
print(t1.repeat(3, 100))

# Plain angle array, base method
l3 = (lambda angle=angle, belong=belong:
      np.sum(angle*belong, axis=1))
t3 = timeit.Timer(l3)
print(t3.repeat(3, 100))

# Plain angle array, einsum
l4 = (lambda angle=angle, belong=belong:
      np.einsum('ij,ij->i', belong, angle))
t4 = timeit.Timer(l4)
print(t4.repeat(3, 100))
and the results were:
[0.2505458095931187, 0.22666162878242901, 0.23591678551324263]
[0.23295411847036418, 0.21908727226505043, 0.22407296178704272]
[0.03711204915708555, 0.03149960399994978, 0.033403337575027114]
[0.025264803208228992, 0.022590580646423053, 0.024585736455331464]
If we look at the last two rows, einsum is about 30% faster than the base method (best of 3: roughly 0.023 s versus 0.031 s). But if we look at the first two rows, where the full formula is evaluated, the speed-up from einsum is much smaller: only about 3% (best of 3: roughly 0.219 s versus 0.227 s).
I'm not sure if this timing can be improved.
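For reference, this is a sketch of what "evaluating the formula only where belong is True" could look like, using np.nonzero to recover the row of each selected element and np.bincount with weights to sum per row. The data is random stand-in data, and the values of THRES_THETA and max_line_length are assumptions for illustration. I have not measured whether this is actually faster: the boolean indexing has its own cost, so it would need to be timed on the real arrays.

```python
import numpy as np

# Random stand-in data with the same shape as the real arrays (assumed values)
rng = np.random.default_rng(0)
shape = (1653, 58)
angle = rng.uniform(0.0, 1.0, size=shape)
lines_lengths_vstacked = rng.uniform(0.0, 100.0, size=shape)
belong = rng.random(shape) < 0.38
THRES_THETA = 1.0        # assumed value
max_line_length = 100.0  # assumed value

# Row index of every True entry, in the same row-major order that
# boolean-mask indexing uses for the selected elements.
rows = np.nonzero(belong)[0]

# Evaluate the formula only on the selected elements (1D arrays).
values = (0.3 * (1 - angle[belong] / THRES_THETA)
          + 0.7 * (lines_lengths_vstacked[belong] / max_line_length))

# Sum the selected values per row; minlength keeps rows with no True entry.
masked_sums = np.bincount(rows, weights=values, minlength=belong.shape[0])

# Reference result from the base method, for comparison.
base_sums = np.sum(
    belong * (0.3 * (1 - angle / THRES_THETA)
              + 0.7 * (lines_lengths_vstacked / max_line_length)),
    axis=1)
print(np.allclose(masked_sums, base_sums))
```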