I have an array represented as a list of lists, and I need to remove all rows, and then all columns, whose values are greater than some variable. I can't figure out how to do it myself. All I could come up with is something like:

rowNum =0
for row in self.Table:
    rowIsValid = False
    for value in row:
        if not value is None and (value > 0.35 and not value == 1):
            rowIsValid = True
    if not rowIsValid:
        self.Table =  numpy.delete(self.Table, (rowNum), axis=0)
        #self.Table.pop(row)
    rowNum+=1

And that's just for rows, and it didn't work. As for how to remove columns, I can't even imagine.

Data example. Input:

 1.0 None 0.333 0.166 None
 0.4 1.0  0.541 0.4   0.3
 0.1 0.41 1.0   0.23  0.11

Output (for example, I need to remove rows and columns where all values are smaller than 0.3; values equal to 1 are not included in the calculation):

0.4 1.0  0.541 0.4  
0.1 0.41 1.0   0.23 
  • Please provide a sample input and expected output. Commented Jan 25, 2015 at 15:35
  • How about: stackoverflow.com/a/25391473/963881 ? Commented Jan 25, 2015 at 15:35
  • I agree with REACHUS - this is very nearly a duplicate of stackoverflow.com/a/25391473/963881 ... Commented Jan 25, 2015 at 15:52
  • There, the data is sorted by just one column, and I need data that is sorted by all values in all rows, then by all values in all columns. Commented Jan 25, 2015 at 17:43
  • Always post sample input and desired output apart from your attempted code. ;-) +1 Commented Jan 25, 2015 at 20:53

1 Answer

If your array looks something like this:

>>> import numpy as np
>>> arr = np.array([[ 1.   , np.nan,  0.333,  0.166, np.nan],
...                 [ 0.4  ,  1.   ,  0.541,  0.4  ,  0.3  ],
...                 [ 0.1  ,  0.41 ,  1.   ,  0.23 ,  0.11 ]])
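If the data starts out as a list of lists with None for missing values, as in the question, one way to get such an array is sketched below. The name `table` is only a stand-in for `self.Table` and is an assumption, not something taken from the original code.

```python
import numpy as np

# Sample data from the question: a list of lists, with None for missing values.
table = [
    [1.0, None, 0.333, 0.166, None],
    [0.4, 1.0, 0.541, 0.4, 0.3],
    [0.1, 0.41, 1.0, 0.23, 0.11],
]

# Replace None with nan explicitly so the result is a plain float array.
arr = np.array(
    [[np.nan if value is None else value for value in row] for row in table],
    dtype=float,
)
```

Replacing None explicitly keeps the conversion obvious to the reader and guarantees a float dtype, which the boolean comparisons that follow rely on.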

First, set all nan values to True (stored as 1.0 in a float array), so that they are treated like the ignored value 1.0 and never count as matching items:

>>> arr[np.isnan(arr)] = True

Now build an array of booleans in which True marks the items that should be ignored (equal to 1.0 or below 0.35). If all items in a row or column are True, that row or column should be dropped:

>>> temp = (arr == 1.0) | (arr < 0.35)
>>> temp
array([[ True,  True,  True,  True,  True],
       [False,  True, False, False,  True],
       [ True, False,  True,  True,  True]], dtype=bool)

Keep only those rows that contain at least one False:

>>> rows = ~np.all(temp, axis=1)
>>> rows
array([False,  True,  True], dtype=bool)

Do the same for columns, along the other axis:

>>> cols = ~np.all(temp, axis=0)
>>> cols
array([ True,  True,  True,  True, False], dtype=bool)

Now use simple indexing and slicing to get the required items:

>>> arr[rows][:, cols]
array([[ 0.4  ,  1.   ,  0.541,  0.4  ],
       [ 0.1  ,  0.41 ,  1.   ,  0.23 ]])
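
The steps above can be collected into one function. This is only a sketch: the name `drop_uninteresting` and its parameters `threshold` and `ignored` are hypothetical, chosen for illustration. Unlike the session above, it works on a filled copy, so the input array keeps its nan values.

```python
import numpy as np


def drop_uninteresting(arr, threshold=0.35, ignored=1.0):
    """Return arr without rows/columns that hold no value of interest.

    A value is skipped when it is nan, equal to `ignored`, or below
    `threshold`. Hypothetical helper; the names are illustrative only.
    """
    arr = np.asarray(arr, dtype=float)
    # Fill nan with the ignored value in a copy, leaving the input untouched.
    filled = np.where(np.isnan(arr), ignored, arr)
    # True marks items that do not count towards keeping a row or column.
    skip = (filled == ignored) | (filled < threshold)
    rows = ~np.all(skip, axis=1)  # rows with at least one item of interest
    cols = ~np.all(skip, axis=0)  # columns with at least one item of interest
    return arr[rows][:, cols]


arr = np.array([[1.0, np.nan, 0.333, 0.166, np.nan],
                [0.4, 1.0, 0.541, 0.4, 0.3],
                [0.1, 0.41, 1.0, 0.23, 0.11]])
result = drop_uninteresting(arr)
print(result)
```

Both masks are computed from the full array before anything is removed, so dropping a row does not change which columns are kept.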

1 Comment

In fact, I did add a clarification: None can be treated as 0. But your idea is good, and I'll try to use it. Thanks a lot. It's a creative approach.
