I've read a column from an Excel file and stored it in a NumPy array, col. For every index i in col, I want to check whether the value is NaN; if it is, I want to delete index i from both col and another array, x. This is what I tried:

import numpy as np
import xlrd

workbook = xlrd.open_workbook('well data.xlsx')
sheet = workbook.sheet_by_index(0)
col = sheet.col_values(1, 1)
col = np.array(col)
col = col.astype(np.float)
for i in range(col.shape[0]):
    if (np.isnan(col[i])):
        col = np.delete(col, i)
        x = np.delete(x, i)

I'm getting two kinds of errors. First, when the float conversion col = col.astype(np.float) is present, I get

    if (np.isnan(col[i])):
IndexError: index out of bounds

Second, if I remove the float conversion, I get this error:

    if (np.isnan(col[i])):
TypeError: Not implemented for this type

I know that to remove the NaNs from a single NumPy array I can do this:

x = x[numpy.logical_not(numpy.isnan(x))]

But my case is different: I want to delete the NaN elements from col and the corresponding elements from x. For example, if index 3 in col is NaN, index 3 should be deleted from both col and x. Also, the float conversion is necessary in my case.

This is a more detailed example,

These are the initial arrays (both have the same length):

col= [16.5, 14.3, 17.42, nan, 13.22, nan]

x= [1, 2, 3, 4, 5, 6]

After removing nans the arrays should be,

col= [16.5, 14.3, 17.42, 13.22]

x= [1, 2, 3, 5]

One more thing: the code above works fine when I read the columns from a .dat file. Does it matter that I'm reading them from Excel?

Can anyone please help me solve this problem?

Thanks.

  • Please update the question with a sample input and expected output. Commented Jun 30, 2015 at 8:54

1 Answer

Your first idea,

col= col.astype(np.float)
for i in range (col.shape [0]):
    if (np.isnan(col[i])):
        col=np.delete(col,i)
        x= np.delete(x,i)

is almost correct. col.shape[0] returns the total length of the array, and the valid indices run from 0 to that length - 1. range already stops one short of its end value, so the loop bounds themselves are fine; written out explicitly, the line is:

for i in range (0, col.shape [0]):

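But since you are removing elements inside the loop, the array gets shorter while the range still runs up to the original length. So if the array started with five elements and you removed one earlier, col no longer has five elements when the loop reaches the last index, and you get the IndexError. Each deletion also shifts the later elements down by one, so a forward loop can skip values. I suggest you loop backward over your column, like this: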

for i in range(col.shape [0]-1, -1, -1):
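For completeness, here is a minimal sketch of the whole thing using the sample arrays from the question. The function names drop_nan_pairs_loop and drop_nan_pairs_mask are made up for this illustration, and I use the builtin float instead of np.float because that alias was removed in NumPy 1.24. The second function is not the loop approach at all: it builds one boolean mask from col and applies it to both arrays, which should give the same result without deleting anything inside a loop.

```python
import numpy as np


def drop_nan_pairs_loop(col, x):
    """Hypothetical helper: delete NaN entries of col and the matching
    entries of x, walking from the last index down to 0."""
    col = np.asarray(col, dtype=float)
    x = np.asarray(x)
    # Going backward means a deletion never shifts an index we still have to visit.
    for i in range(col.shape[0] - 1, -1, -1):
        if np.isnan(col[i]):
            col = np.delete(col, i)
            x = np.delete(x, i)
    return col, x


def drop_nan_pairs_mask(col, x):
    """Hypothetical helper: same result, using one boolean mask for both arrays."""
    col = np.asarray(col, dtype=float)
    x = np.asarray(x)
    keep = ~np.isnan(col)  # True where col holds a real number
    return col[keep], x[keep]


# Sample data taken from the question.
col = [16.5, 14.3, 17.42, np.nan, 13.22, np.nan]
x = [1, 2, 3, 4, 5, 6]

col_clean, x_clean = drop_nan_pairs_mask(col, x)
print(col_clean.tolist())  # [16.5, 14.3, 17.42, 13.22]
print(x_clean.tolist())    # [1, 2, 3, 5]
```

Both helpers assume col can be converted to float and that col and x have the same length. The mask version is usually the simpler choice, since np.delete copies the array on every call.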

2 Comments

Thanks a lot, but this is not working, I'm still getting the same error. Do you have any other solutions please?
Edited, looping backwards should be better in your case, since you change your array.
