Checking in between values with NumPy in Python

Question

I am trying to convert the code below to a NumPy version. The vanilla Python code takes each pair of consecutive values in Formating (previous and current) and checks whether any of the Numbers values fall between them. My NumPy version of this code is faulty. How can I fix it? The code came from the issue: issue link

Values:

import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])
x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')

Vanilla Python:

for i in range(len(l)-1):
    if l[i] >= l[i+1]:
        print('Numbers between %d,%d = _0_' % (Formating[i], Formating[i+1]))
    else:
        print('Numbers between %d,%d = %s' % (Formating[i], Formating[i+1], ','.join(map(str, list(x[l[i]:l[i+1]])))))

Numpy Version:

L_index = np.arange(0, len(l)-1, 1)
result= np.where(l[L_index] >= l[L_index+1], 0 , l )

Expected output:

[0]
[3 4]
[5 7 8 10]
[0]
[20]

If you expect lists (or arrays) that differ in length, that's a good indication that a 'pure' numpy option isn't possible. Arrays are 'rectangular', not 'ragged'. There are tricks that can create padded arrays or masked ones.
– hpaulj, Commented Mar 11, 2021 at 3:45
(Numbers[:,None]>=Formating[:-1]) & (Numbers[:,None]<=Formating[1:]) might be a useful first step. It should be a 2d boolean array, with True where numbers fall in the desired format range.
– hpaulj, Commented Mar 11, 2021 at 4:47

hpaulj · Accepted Answer · 2021-03-12 06:44:40Z

The answer from a previous question:

In [173]: Numbers = np.array([3, 4, 5, 7, 8, 10,20])
     ...: Formating = np.array([0, 2 , 5, 12, 15, 22])
     ...: x = np.sort(Numbers);
     ...: l = np.searchsorted(x, Formating, side='left')
     ...: 
In [174]: l
Out[174]: array([0, 0, 2, 6, 6, 7])
In [175]: for i in range(len(l)-1):
     ...:     if l[i] >= l[i+1]:
     ...:         print('Numbers between %d,%d = _0_' % (Formating[i], Formating[i+1]))
     ...:     else:
     ...:         print('Numbers between %d,%d = %s' % (Formating[i], Formating[i+1], ','.join(map(str, list(x[l[i]:l[i+1]])))))
     ...: 
Numbers between 0,2 = _0_
Numbers between 2,5 = 3,4
Numbers between 5,12 = 5,7,8,10
Numbers between 12,15 = _0_
Numbers between 15,22 = 20
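
The loop above slices x at consecutive searchsorted indices. As a minimal sketch (not part of the original session), the same slices can be produced in one call with np.split, which cuts an array at each index it is given. The name groups is introduced here only for illustration.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')

# np.split cuts x at every index in l, giving len(l) + 1 pieces.
# The first piece (values below Formating[0]) and the last piece
# (values at or above Formating[-1]) lie outside every interval, so drop them.
groups = np.split(x, l)[1:-1]

for lo, hi, g in zip(Formating[:-1], Formating[1:], groups):
    # An empty group joins to '', which falls back to the '_0_' placeholder.
    print('Numbers between %d,%d = %s' % (lo, hi, ','.join(map(str, g)) or '_0_'))
```

The result is still a Python list of arrays of different lengths, so this only moves the iteration inside np.split. It does not produce a rectangular array.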

A list comprehension works fine here, and is in fact faster with lists than with arrays:

In [182]: for i in range(len(Formating)-1):
     ...:     print([x for x in Numbers if (Formating[i]<=x<Formating[i+1])])
     ...: 
[]
[3, 4]
[5, 7, 8, 10]
[]
[20]

A version that iterates over Formating, but not over Numbers. It is rather similar to the version using searchsorted; I'm not sure which will be faster:

In [177]: for i in range(len(Formating)-1):
     ...:     idx = (Formating[i]<=Numbers)&(Numbers<Formating[i+1])
     ...:     print(Numbers[idx])
     ...: 
[]
[3 4]
[ 5  7  8 10]
[]
[20]

We could get the idx mask for all values of Formating at once:

In [183]: mask=(Formating[:-1,None]<=Numbers)&(Numbers<Formating[1:,None])
In [184]: mask
Out[184]: 
array([[False, False, False, False, False, False, False],
       [ True,  True, False, False, False, False, False],
       [False, False,  True,  True,  True,  True, False],
       [False, False, False, False, False, False, False],
       [False, False, False, False, False, False,  True]])
In [185]: N=Numbers[:,None].repeat(5,1).T   # 5 = len(Formating)-1
In [186]: N
Out[186]: 
array([[ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20]])
In [187]: np.ma.masked_array(N,~mask)
Out[187]: 
masked_array(
  data=[[--, --, --, --, --, --, --],
        [3, 4, --, --, --, --, --],
        [--, --, 5, 7, 8, 10, --],
        [--, --, --, --, --, --, --],
        [--, --, --, --, --, --, 20]],
  mask=[[ True,  True,  True,  True,  True,  True,  True],
        [False, False,  True,  True,  True,  True,  True],
        [ True,  True, False, False, False, False,  True],
        [ True,  True,  True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True,  True, False]],
  fill_value=999999)

Your lists are apparent there. But the list display still requires iteration:

In [188]: for row in mask:
     ...:     print(Numbers[row])
[]
[3 4]
[ 5  7  8 10]
[]
[20]

I'll let you time test these alternatives with this or more realistic data. I suspect a pure list version is fastest for small problems, but I'm not sure how the others will scale.
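
One way to run such a comparison is sketched below with the standard timeit module. The function names (with_lists, with_row_masks, with_searchsorted) and the repeat count are assumptions made for this sketch, and no timings are claimed here. Results will depend on the machine and on the size of the data.

```python
import timeit
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

def with_lists(numbers, edges):
    # Pure-Python version; expects plain lists, not arrays.
    return [[n for n in numbers if lo <= n < hi]
            for lo, hi in zip(edges[:-1], edges[1:])]

def with_row_masks(numbers, edges):
    # One boolean mask per interval, iterating over edges only.
    return [numbers[(lo <= numbers) & (numbers < hi)]
            for lo, hi in zip(edges[:-1], edges[1:])]

def with_searchsorted(numbers, edges):
    # Sort once, then slice between consecutive insertion points.
    x = np.sort(numbers)
    l = np.searchsorted(x, edges, side='left')
    return [x[a:b] for a, b in zip(l[:-1], l[1:])]

# Convert to lists outside the timed call so the conversion is not measured.
num_list, fmt_list = Numbers.tolist(), Formating.tolist()

candidates = {
    'lists': lambda: with_lists(num_list, fmt_list),
    'row masks': lambda: with_row_masks(Numbers, Formating),
    'searchsorted': lambda: with_searchsorted(Numbers, Formating),
}

repeats = 10_000  # arbitrary choice for this sketch
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=repeats)
    print(f'{name:>12}: {seconds / repeats * 1e6:.2f} us per call')
```

Note that the searchsorted variant returns values in sorted order, while the other two keep the original order of Numbers. For the sample data the two orders coincide.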

edit

Follow-up questions ask about sums. np.ma.sum, or the masked array's own sum method, sums the unmasked values, effectively filling the masked values with 0. Note that a row with every value masked comes out masked in the result rather than as 0.

In [253]: np.ma.masked_array(N,~mask).sum(axis=1)
Out[253]: 
masked_array(data=[--, 7, 30, --, 20],
             mask=[ True, False, False,  True, False],
       fill_value=999999)

In [256]: np.ma.masked_array(N,~mask).filled(0)
Out[256]: 
array([[ 0,  0,  0,  0,  0,  0,  0],
       [ 3,  4,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  7,  8, 10,  0],
       [ 0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 20]])

Actually we don't need to use the masked array mechanism to get here (though it can be nice visually):

In [258]: N*mask
Out[258]: 
array([[ 0,  0,  0,  0,  0,  0,  0],
       [ 3,  4,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  7,  8, 10,  0],
       [ 0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 20]])
In [259]: (N*mask).sum(axis=1)
Out[259]: array([ 0,  7, 30,  0, 20])
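
As a further sketch under the same sample data, the repeated array N is not strictly needed: np.where broadcasts Numbers against each row of the mask. The per-interval sums can also be obtained without any 2d array by differencing prefix sums at the searchsorted boundaries. The names filled, prefix, sums_from_mask and sums_from_prefix are introduced here for illustration only.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

mask = (Formating[:-1, None] <= Numbers) & (Numbers < Formating[1:, None])

# np.where broadcasts the 1d Numbers across the rows of the 2d mask,
# keeping a value where mask is True and writing 0 elsewhere.
filled = np.where(mask, Numbers, 0)
sums_from_mask = filled.sum(axis=1)

# Alternative without the 2d mask: prefix sums over the sorted values,
# differenced at the boundaries found by searchsorted.
x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')
prefix = np.concatenate(([0], np.cumsum(x)))
sums_from_prefix = prefix[l[1:]] - prefix[l[:-1]]

print(sums_from_mask)
print(sums_from_prefix)
```

The prefix-sum variant uses memory proportional to len(Numbers) plus len(Formating), whereas the mask grows with their product, which may matter for larger inputs.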

Is there a way to replace the -- with zeroes in the masked array? I tried np.where(result != "--", result, 0), however it does not work. It is the last example of your answer.
When iterating over the rows (print lines) just do a test for [] and substitute what you want. But any such tweak moves you further away from a 'pure' numpy solution. Matching the length of the list with the number of matches is the most logical choice.
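
A minimal sketch of that test, assuming the mask from the answer and that a single 0 is the wanted stand-in for an empty interval (the rows list is only there to collect the results):

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

mask = (Formating[:-1, None] <= Numbers) & (Numbers < Formating[1:, None])

rows = []
for row in mask:
    values = Numbers[row]
    # An empty selection means no number fell in this interval;
    # substitute a one-element array holding 0 in its place.
    if values.size == 0:
        values = np.array([0])
    rows.append(values)
    print(values)
```

With this substitution an interval that really contains the value 0 can no longer be told apart from an empty one.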
