I am trying to convert the code below to a NumPy version. The vanilla Python code takes each pair of consecutive values in Formating and checks whether any of the Numbers values fall between them. My NumPy version is faulty; how can I fix it? The code comes from an earlier question: issue link

Values:

import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])
x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')

Vanilla Python:

for i in range(len(l)-1):
    if l[i] >= l[i+1]:
        print('Numbers between %d,%d = _0_' % (Formating[i], Formating[i+1]))
    else:
        print('Numbers between %d,%d = %s' % (Formating[i], Formating[i+1], ','.join(map(str, list(x[l[i]:l[i+1]])))))

Numpy Version:

L_index = np.arange(0, len(l)-1, 1)
result= np.where(l[L_index] >= l[L_index+1], 0 , l )

Expected output:

[0]
[3 4]
[5 7 8 10]
[0]
[20]

2 Comments

  • If you expect lists (or arrays) that differ in length, that's a good indication that a 'pure' numpy option isn't possible. Arrays are 'rectangular', not 'ragged'. There are tricks that can create padded arrays or masked ones. Commented Mar 11, 2021 at 3:45
  • (Numbers[:,None]>=Formating[:-1]) & (Numbers[:,None]<=Formating[1:]) might be a useful first step. It should be a 2d boolean array, with True where numbers fall in the desired format range. Commented Mar 11, 2021 at 4:47

1 Answer

The answer from a previous question:

In [173]: Numbers = np.array([3, 4, 5, 7, 8, 10,20])
     ...: Formating = np.array([0, 2 , 5, 12, 15, 22])
     ...: x = np.sort(Numbers);
     ...: l = np.searchsorted(x, Formating, side='left')
     ...: 
In [174]: l
Out[174]: array([0, 0, 2, 6, 6, 7])
In [175]: for i in range(len(l)-1):
     ...:     if l[i] >= l[i+1]:
     ...:         print('Numbers between %d,%d = _0_' % (Formating[i], Formating[i+1]))
     ...:     else:
     ...:         print('Numbers between %d,%d = %s' % (Formating[i], Formating[i+1], ','.join(map(str, list(x[l[i]:l[i+1]])))))
     ...: 
Numbers between 0,2 = _0_
Numbers between 2,5 = 3,4
Numbers between 5,12 = 5,7,8,10
Numbers between 12,15 = _0_
Numbers between 15,22 = 20
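
If the goal is the groups themselves rather than printed lines, the same searchsorted indices can be handed to np.split, which slices the sorted array at each index. This is only a sketch under the assumption that the data look like the example above; `groups` is an illustrative name, not something from the question. Repeated indices produce empty arrays, and the first and last pieces (values below Formating[0], or at or above Formating[-1]) are dropped:

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')

# np.split cuts x at every index in l; the pieces [1:-1] are the ones
# lying between consecutive Formating values.
groups = np.split(x, l)[1:-1]

for lo, hi, g in zip(Formating[:-1], Formating[1:], groups):
    print(lo, hi, g.tolist())
```

The result is still a Python list of arrays of different lengths, so this removes the explicit slicing but not the raggedness.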

A list comprehension works fine here, and is in fact faster on lists than on arrays:

In [182]: for i in range(len(Formating)-1):
     ...:     print([x for x in Numbers if (Formating[i]<=x<Formating[i+1])])
     ...: 
[]
[3, 4]
[5, 7, 8, 10]
[]
[20]

A version with iteration on Formating, but not Numbers. Rather similar to the version using searchsorted. I'm not sure which will be faster:

In [177]: for i in range(len(Formating)-1):
     ...:     idx = (Formating[i]<=Numbers)&(Numbers<Formating[i+1])
     ...:     print(Numbers[idx])
     ...: 
[]
[3 4]
[ 5  7  8 10]
[]
[20]
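
Another way to avoid comparing against both bounds by hand is to label each number with the index of the interval it falls into. A possible sketch uses np.digitize, which with its default right=False applies the same half-open test Formating[i] <= x < Formating[i+1]; `labels` is an illustrative name:

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

# digitize returns i with Formating[i-1] <= x < Formating[i]; subtracting 1
# makes label k mean the interval from Formating[k] to Formating[k+1].
labels = np.digitize(Numbers, Formating) - 1
print(labels)

# Selecting one interval at a time still needs a loop over the intervals.
for k in range(len(Formating) - 1):
    print(Numbers[labels == k])
```

Numbers outside the overall range get a label of -1 or len(Formating)-1, so they match none of the intervals in the loop.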

We could get the idx mask for all values of Formating at once:

In [183]: mask=(Formating[:-1,None]<=Numbers)&(Numbers<Formating[1:,None])
In [184]: mask
Out[184]: 
array([[False, False, False, False, False, False, False],
       [ True,  True, False, False, False, False, False],
       [False, False,  True,  True,  True,  True, False],
       [False, False, False, False, False, False, False],
       [False, False, False, False, False, False,  True]])
In [185]: N=Numbers[:,None].repeat(5,1).T   # 5 = len(Formating)-1
In [186]: N
Out[186]: 
array([[ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20]])
In [187]: np.ma.masked_array(N,~mask)
Out[187]: 
masked_array(
  data=[[--, --, --, --, --, --, --],
        [3, 4, --, --, --, --, --],
        [--, --, 5, 7, 8, 10, --],
        [--, --, --, --, --, --, --],
        [--, --, --, --, --, --, 20]],
  mask=[[ True,  True,  True,  True,  True,  True,  True],
        [False, False,  True,  True,  True,  True,  True],
        [ True,  True, False, False, False, False,  True],
        [ True,  True,  True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True,  True, False]],
  fill_value=999999)

Your lists are apparent there. But displaying them as lists still requires iteration:

In [188]: for row in mask:
     ...:     print(Numbers[row])
[]
[3 4]
[ 5  7  8 10]
[]
[20]
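
The question's expected output shows [0] where an interval is empty. One way to get that, sketched here under the assumption that 0 is an acceptable placeholder (it cannot be told apart from a real 0 in Numbers), is to test each row's selection while iterating; `rows` and `picked` are illustrative names:

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

# One row per interval, True where a number falls inside it.
mask = (Formating[:-1, None] <= Numbers) & (Numbers < Formating[1:, None])

rows = []
for row in mask:
    picked = Numbers[row]
    # Substitute a single 0 when no number falls in this interval.
    rows.append(picked if picked.size else np.array([0]))

for r in rows:
    print(r)
```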

I'll let you time test these alternatives with this or more realistic data. I suspect a pure list version is fastest for small problems, but I'm not sure how the others will scale.
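
A rough way to run that comparison is with the standard library's timeit. This is only a sketch: the function names are made up for illustration, the data are the tiny arrays from the question, and the numbers it prints will depend on the machine and on the data size:

```python
import timeit
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

def with_lists(nums, edges):
    # Pure Python comparison on plain lists.
    return [[x for x in nums if edges[i] <= x < edges[i + 1]]
            for i in range(len(edges) - 1)]

def with_mask(nums, edges):
    # One 2d boolean mask, then one row selection per interval.
    mask = (edges[:-1, None] <= nums) & (nums < edges[1:, None])
    return [nums[row] for row in mask]

def with_searchsorted(nums, edges):
    # Sort once, then slice between consecutive insertion points.
    x = np.sort(nums)
    l = np.searchsorted(x, edges, side='left')
    return [x[a:b] for a, b in zip(l[:-1], l[1:])]

num_list, fmt_list = Numbers.tolist(), Formating.tolist()
timings = {
    'lists': timeit.timeit(lambda: with_lists(num_list, fmt_list), number=1000),
    'mask': timeit.timeit(lambda: with_mask(Numbers, Formating), number=1000),
    'searchsorted': timeit.timeit(lambda: with_searchsorted(Numbers, Formating), number=1000),
}
for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f} s per 1000 calls')
```

To see how the approaches scale, swap in larger arrays, for example ones generated with np.random.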

edit

A follow-up question asks about sums. np.ma.sum, or the masked array's own sum method, sums only the unmasked values, as if the masked values had been filled with 0. Note that a row with every value masked comes back masked rather than as 0:

In [253]: np.ma.masked_array(N,~mask).sum(axis=1)
Out[253]: 
masked_array(data=[--, 7, 30, --, 20],
             mask=[ True, False, False,  True, False],
       fill_value=999999)

In [256]: np.ma.masked_array(N,~mask).filled(0)
Out[256]: 
array([[ 0,  0,  0,  0,  0,  0,  0],
       [ 3,  4,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  7,  8, 10,  0],
       [ 0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 20]])

Actually we don't need to use the masked array mechanism to get here (though it can be nice visually):

In [258]: N*mask
Out[258]: 
array([[ 0,  0,  0,  0,  0,  0,  0],
       [ 3,  4,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  7,  8, 10,  0],
       [ 0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 20]])
In [259]: (N*mask).sum(axis=1)
Out[259]: array([ 0,  7, 30,  0, 20])
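
If only the sums are needed, the 2d array can be skipped altogether: with the numbers sorted, each interval's sum is a difference of cumulative sums taken at the searchsorted indices. A minimal sketch, assuming the same x and l as in the first example; `csum` and `sums` are illustrative names:

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')

# csum[k] is the sum of the first k sorted numbers, so csum[0] is 0.
csum = np.concatenate(([0], np.cumsum(x)))

# Sum of x[l[i]:l[i+1]] for every interval; empty intervals give 0.
sums = csum[l[1:]] - csum[l[:-1]]
print(sums)
```

This should give the same values as (N*mask).sum(axis=1) while using memory proportional to len(Numbers) rather than to the full 2d mask.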

2 Comments

Is there a way to replace the -- with zeros in the masked array? I tried np.where(result != "--", result, 0), however it doesn't work. This is about the last example of your answer.
When iterating over the rows (the print lines), just test for [] and substitute what you want. But any such tweak moves you further away from a 'pure' numpy solution. Matching the length of the list with the number of matches is the most logical choice.
