0

I have a multidimensional array similar to the following, and I'm trying to delete the strings that end with * (stars) so that I can convert it into an array of floats.

    array1 = np.column_stack((a, b, c, d)) 
    array1 = np.array([
       ['*0.70*', '21.59', '4.37', '21.70'],
       ['2.15', '21.42', '5.63', '22.33'],
       ['*8.00*', '21.17', '5.11', '22.40'],
       ['2.36', '22.88', '*2.54*', '*20.95*'],
       ['2.07', '22.64', '6.68', '22.26']
       ])

Is there a way to use np.where to give the coordinates within the array of the values highlighted with stars, and not just an index, so I can delete the entire row?

So the ideal output would be something along the lines of:

fil1 = np.where(np.char.endswith(array1, "*") == True)

print(fil1) 
(0,0), (0,2), (2, 3), (3, 3)

3 Answers

1

np.where returns one array per dimension. If you want the indices of the rows containing stars, just do:

starred_rows = np.unique(np.where(np.char.endswith(array1, "*"))[0])

To get the index pairs, you can use zip:

np.array(list(zip(*np.where(np.char.endswith(array1, "*")))))
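Since np.where gives one index array per dimension, the two arrays line up element by element and can be fed straight back into the array as a fancy index. A minimal sketch, assuming the sample array from the question (the names `rows`, `cols`, and `starred` are only illustrative):

```python
import numpy as np

array1 = np.array([
    ['*0.70*', '21.59', '4.37', '21.70'],
    ['2.15', '21.42', '5.63', '22.33'],
    ['*8.00*', '21.17', '5.11', '22.40'],
    ['2.36', '22.88', '*2.54*', '*20.95*'],
    ['2.07', '22.64', '6.68', '22.26'],
])

# One index array per dimension: row indices first, then column indices.
rows, cols = np.where(np.char.endswith(array1, "*"))

# Indexing with both arrays picks out exactly the starred entries.
starred = array1[rows, cols]
print(starred)  # ['*0.70*' '*8.00*' '*2.54*' '*20.95*']
```

The i-th entry of `rows` and the i-th entry of `cols` together form one coordinate, which is why zipping them yields the index pairs.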

0

To get the ideal output mentioned in the question, you need to use zip:

fil1 = list(zip(*np.where(np.char.endswith(array1, "*"))))
print(fil1)
[(0, 0), (2, 0), (3, 2), (3, 3)]  # result

If you only want the row indices, take the unique values of the first element of the np.where result:

fil1 = list(set(np.where(np.char.endswith(array1, "*"))[0]))
print(fil1)
[0, 2, 3]  # result
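A set makes no promise about ordering, and its elements stay NumPy integer scalars, which newer NumPy versions may print differently from plain ints. If a sorted list of plain Python ints is wanted, one possible variation (a sketch, not the only option) is np.unique followed by tolist():

```python
import numpy as np

array1 = np.array([
    ['*0.70*', '21.59', '4.37', '21.70'],
    ['2.15', '21.42', '5.63', '22.33'],
    ['*8.00*', '21.17', '5.11', '22.40'],
    ['2.36', '22.88', '*2.54*', '*20.95*'],
    ['2.07', '22.64', '6.68', '22.26'],
])

rows, cols = np.where(np.char.endswith(array1, "*"))

# np.unique sorts and de-duplicates; tolist() converts to Python ints.
starred_rows = np.unique(rows).tolist()
print(starred_rows)  # [0, 2, 3]
```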

1 Comment

np.transpose(np.where(...)) or np.argwhere(...)

0

In [81]: array1 = np.array([ 
    ...:        ['*0.70*', '21.59', '4.37', '21.70'], 
    ...:        ['2.15', '21.42', '5.63', '22.33'], 
    ...:        ['*8.00*', '21.17', '5.11', '22.40'], 
    ...:        ['2.36', '22.88', '*2.54*', '*20.95*'], 
    ...:        ['2.07', '22.64', '6.68', '22.26'] 
    ...:        ])                                                                       

The char test returns a boolean array:

In [84]: mask = np.char.endswith(array1,"*")                                             
In [85]: mask                                                                            
Out[85]: 
array([[ True, False, False, False],
       [False, False, False, False],
       [ True, False, False, False],
       [False, False,  True,  True],
       [False, False, False, False]])

np.nonzero (which is what np.where does when given a single argument) finds the coordinates of the True values, one array per dimension:

In [86]: np.nonzero(mask)                                                                
Out[86]: (array([0, 2, 3, 3]), array([0, 0, 2, 3]))

If you want to delete the rows, the first array can be used (the duplicate 3 apparently doesn't bother np.delete):

In [88]: np.delete(array1, np.nonzero(mask)[0], 0)                                       
Out[88]: 
array([['2.15', '21.42', '5.63', '22.33'],
       ['2.07', '22.64', '6.68', '22.26']], dtype='<U7')

But we can also find rows with any True with:

In [89]: mask.any(axis=1)                                                                
Out[89]: array([ True, False,  True,  True, False])

and use that to select those rows (boolean array indexing):

In [91]: array1[mask.any(axis=1)]                                                        
Out[91]: 
array([['*0.70*', '21.59', '4.37', '21.70'],
       ['*8.00*', '21.17', '5.11', '22.40'],
       ['2.36', '22.88', '*2.54*', '*20.95*']], dtype='<U7')

or negate it to select the rows without stars:

In [92]: array1[~mask.any(axis=1)]                                                       
Out[92]: 
array([['2.15', '21.42', '5.63', '22.33'],
       ['2.07', '22.64', '6.68', '22.26']], dtype='<U7')

np.nonzero(Out[89]) is (array([0, 2, 3]),), the rows to delete.

Other answers have used the Python list version of transpose; numpy's own transpose works as well:

In [93]: np.argwhere(mask)                                                               
Out[93]: 
array([[0, 0],
       [2, 0],
       [3, 2],
       [3, 3]])
In [94]: np.transpose(np.nonzero(mask))                                                  
Out[94]: 
array([[0, 0],
       [2, 0],
       [3, 2],
       [3, 3]])

For the purpose of deleting rows this transpose isn't any more useful than the where.
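To tie the boolean-mask approach back to the goal stated in the question (an array of floats), a minimal end-to-end sketch might look like the following. It assumes the sample array from the question; the names `mask`, `keep`, and `floats` are only illustrative.

```python
import numpy as np

array1 = np.array([
    ['*0.70*', '21.59', '4.37', '21.70'],
    ['2.15', '21.42', '5.63', '22.33'],
    ['*8.00*', '21.17', '5.11', '22.40'],
    ['2.36', '22.88', '*2.54*', '*20.95*'],
    ['2.07', '22.64', '6.68', '22.26'],
])

# True wherever a string ends with a star.
mask = np.char.endswith(array1, "*")

# Keep only the rows that contain no starred entry at all.
keep = ~mask.any(axis=1)

# The remaining strings are plain numbers, so they convert cleanly.
floats = array1[keep].astype(float)
print(floats.shape, floats.dtype)  # (2, 4) float64
```

Filtering before the conversion matters here: calling astype(float) on the original array would raise a ValueError because strings such as '*0.70*' cannot be parsed as numbers.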
