The answer from a previous question:
In [173]: Numbers = np.array([3, 4, 5, 7, 8, 10,20])
...: Formating = np.array([0, 2 , 5, 12, 15, 22])
...: x = np.sort(Numbers);
...: l = np.searchsorted(x, Formating, side='left')
...:
In [174]: l
Out[174]: array([0, 0, 2, 6, 6, 7])
In [175]: for i in range(len(l)-1):
...:     if l[i] >= l[i+1]:
...:         print('Numbers between %d,%d = _0_' % (Formating[i], Formating[i+1]))
...:     else:
...:         print('Numbers between %d,%d = %s' % (Formating[i], Formating[i+1], ','.join(map(str, list(x[l[i]:l[i+1]])))))
...:
Numbers between 0,2 = _0_
Numbers between 2,5 = 3,4
Numbers between 5,12 = 5,7,8,10
Numbers between 12,15 = _0_
Numbers between 15,22 = 20
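The slicing in that loop can also be done in one call with np.split, which cuts the sorted array at the searchsorted indices. This is a minimal sketch rather than part of the original session; the name groups and the '_0_' fallback in the print are only for illustration.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')

# np.split cuts x at every index in l, giving len(l)+1 pieces.
# The first piece (values below Formating[0]) and the last piece
# (values at or above Formating[-1]) lie outside every range, so drop them.
groups = [g.tolist() for g in np.split(x, l)[1:-1]]

for lo, hi, g in zip(Formating[:-1], Formating[1:], groups):
    # An empty group joins to '', which falls back to the '_0_' marker.
    print('Numbers between %d,%d = %s' % (lo, hi, ','.join(map(str, g)) or '_0_'))
```

Repeated indices in l simply produce empty pieces, which is how the empty ranges show up.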
Something that works fine with lists - in fact faster with lists than arrays:
In [182]: for i in range(len(Formating)-1):
...:     print([x for x in Numbers if (Formating[i]<=x<Formating[i+1])])
...:
[]
[3, 4]
[5, 7, 8, 10]
[]
[20]
A version with iteration on Formating, but not Numbers. Rather similar to the version using searchsorted. I'm not sure which will be faster:
In [177]: for i in range(len(Formating)-1):
...:     idx = (Formating[i]<=Numbers)&(Numbers<Formating[i+1])
...:     print(Numbers[idx])
...:
[]
[3 4]
[ 5 7 8 10]
[]
[20]
We could get the idx mask for all values of Formating at once:
In [183]: mask=(Formating[:-1,None]<=Numbers)&(Numbers<Formating[1:,None])
In [184]: mask
Out[184]:
array([[False, False, False, False, False, False, False],
[ True, True, False, False, False, False, False],
[False, False, True, True, True, True, False],
[False, False, False, False, False, False, False],
[False, False, False, False, False, False, True]])
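The 2d mask comes from broadcasting a column of interval edges against the 1d Numbers array. A small sketch of the shapes involved follows; the names lower, upper and counts are mine, not from the session above.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

lower = Formating[:-1, None]   # shape (5, 1): lower edge of each interval
upper = Formating[1:, None]    # shape (5, 1): upper edge of each interval

# (5, 1) compared with (7,) broadcasts to (5, 7):
# mask[i, j] is True when lower[i] <= Numbers[j] < upper[i].
mask = (lower <= Numbers) & (Numbers < upper)
print(mask.shape)

# Each True marks one number inside the interval, so row sums are counts.
counts = mask.sum(axis=1)
print(counts)
```

The mask holds (len(Formating)-1) * len(Numbers) booleans, so memory use grows with the product of the two lengths.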
In [185]: N=Numbers[:,None].repeat(5,1).T # 5 = len(Formating)-1
In [186]: N
Out[186]:
array([[ 3, 4, 5, 7, 8, 10, 20],
[ 3, 4, 5, 7, 8, 10, 20],
[ 3, 4, 5, 7, 8, 10, 20],
[ 3, 4, 5, 7, 8, 10, 20],
[ 3, 4, 5, 7, 8, 10, 20]])
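The repeat-and-transpose step is one of several ways to stack copies of Numbers. As a sketch, np.tile and np.broadcast_to should give the same values; n_rows is a name I introduced for len(Formating)-1.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
n_rows = 5  # len(Formating) - 1 in the session above

N_repeat = Numbers[:, None].repeat(n_rows, 1).T             # the expression used above
N_tile = np.tile(Numbers, (n_rows, 1))                      # same values, fresh copy
N_view = np.broadcast_to(Numbers, (n_rows, len(Numbers)))   # read-only view, no copy

print(np.array_equal(N_repeat, N_tile), np.array_equal(N_repeat, N_view))
```

The broadcast_to result cannot be written to, which is fine here because N is only read.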
In [187]: np.ma.masked_array(N,~mask)
Out[187]:
masked_array(
data=[[--, --, --, --, --, --, --],
[3, 4, --, --, --, --, --],
[--, --, 5, 7, 8, 10, --],
[--, --, --, --, --, --, --],
[--, --, --, --, --, --, 20]],
mask=[[ True, True, True, True, True, True, True],
[False, False, True, True, True, True, True],
[ True, True, False, False, False, False, True],
[ True, True, True, True, True, True, True],
[ True, True, True, True, True, True, False]],
fill_value=999999)
Your lists are apparent there, but displaying them as lists still requires iteration:
In [188]: for row in mask:
...:     print(Numbers[row])
[]
[3 4]
[ 5 7 8 10]
[]
[20]
I'll let you time test these alternatives with this or more realistic data. I suspect a pure list version is fastest for small problems, but I'm not sure how the others will scale.
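One possible starting point for such a timing test, using the standard timeit module. The function names and the number=1000 repeat count are arbitrary choices of mine; swap in larger arrays to see how each version scales.

```python
import timeit
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])
numbers_list = Numbers.tolist()   # plain Python lists for the list version
edges_list = Formating.tolist()

def with_lists():
    # Pure-list version: one comprehension per interval.
    return [[x for x in numbers_list if lo <= x < hi]
            for lo, hi in zip(edges_list[:-1], edges_list[1:])]

def with_searchsorted():
    # Sort once, then slice between consecutive searchsorted positions.
    x = np.sort(Numbers)
    l = np.searchsorted(x, Formating, side='left')
    return [x[l[i]:l[i+1]] for i in range(len(l) - 1)]

def with_mask():
    # Build the full 2d mask, then index Numbers row by row.
    mask = (Formating[:-1, None] <= Numbers) & (Numbers < Formating[1:, None])
    return [Numbers[row] for row in mask]

for f in (with_lists, with_searchsorted, with_mask):
    t = timeit.timeit(f, number=1000)
    print('%-18s %.4f s per 1000 calls' % (f.__name__, t))
```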
edit
Follow-up questions ask about sums. np.ma.sum, or the masked array's own sum method, sums only the unmasked values, effectively filling the masked values with 0 (a row that is entirely masked stays masked in the result):
In [253]: np.ma.masked_array(N,~mask).sum(axis=1)
Out[253]:
masked_array(data=[--, 7, 30, --, 20],
mask=[ True, False, False, True, False],
fill_value=999999)
In [256]: np.ma.masked_array(N,~mask).filled(0)
Out[256]:
array([[ 0, 0, 0, 0, 0, 0, 0],
[ 3, 4, 0, 0, 0, 0, 0],
[ 0, 0, 5, 7, 8, 10, 0],
[ 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 20]])
Actually we don't need to use the masked array mechanism to get here (though it can be nice visually):
In [258]: N*mask
Out[258]:
array([[ 0, 0, 0, 0, 0, 0, 0],
[ 3, 4, 0, 0, 0, 0, 0],
[ 0, 0, 5, 7, 8, 10, 0],
[ 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 20]])
In [259]: (N*mask).sum(axis=1)
Out[259]: array([ 0, 7, 30, 0, 20])
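Two further ways to get the same per-interval sums, as a sketch. The first lets broadcasting stand in for N; the second skips the 2d mask and uses prefix sums with searchsorted. The names sums_where, prefix and sums_cumsum are mine.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

mask = (Formating[:-1, None] <= Numbers) & (Numbers < Formating[1:, None])

# Numbers broadcasts against the (5, 7) mask, so N is not needed.
sums_where = np.where(mask, Numbers, 0).sum(axis=1)

# Without the 2d mask: prefix sums of the sorted values, differenced
# at the searchsorted positions of consecutive edges.
x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')
prefix = np.concatenate(([0], np.cumsum(x)))   # prefix[k] = sum of x[:k]
sums_cumsum = prefix[l[1:]] - prefix[l[:-1]]

print(sums_where)
print(sums_cumsum)
```

The prefix-sum version never builds an array of size len(Formating) * len(Numbers), which may matter for larger inputs.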
(Numbers[:,None]>=Formats[:-1]) & (Numbers[:,None]<=Formats[1:]) might be a useful first step. It should be a 2d boolean array, with True where a number falls in the desired format range.
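A quick check of that expression, assuming Formats holds the same edges as Formating in the session above. With <= on both sides, a number equal to an interior edge (5 here) matches two ranges; using < on the upper bound gives half-open ranges. The names closed and half_open are mine.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formats = np.array([0, 2, 5, 12, 15, 22])  # assumption: same edges as Formating

# One row per number, one column per range (the transpose of the earlier mask).
closed = (Numbers[:, None] >= Formats[:-1]) & (Numbers[:, None] <= Formats[1:])
half_open = (Numbers[:, None] >= Formats[:-1]) & (Numbers[:, None] < Formats[1:])

print(closed.shape)
print(closed.sum(axis=1))      # how many ranges each number falls in
print(half_open.sum(axis=1))
```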