My current code is

from numpy import *

def buildRealDataObject(x):
    loc = array(x[0])
    trueClass = x[1]
    evid = ones(len(loc))
    evid[isnan(loc)] = 0
    loc[isnan(loc)] = 0
    return DataObject(location=loc, trueClass=trueClass, evidence=evid)

if trueClasses is None:
    trueClasses = zeros(len(dataset), dtype=int8).tolist()    
realObjects = list(map(lambda x: buildRealDataObject(x), zip(dataset, trueClasses)))

and it is working. What I expect is to create one realObject for each row of the DataFrame dataset, combined with the corresponding entry of trueClasses. I am not really sure why it is working, though, because if I run list(zip(dataset, trueClasses)) I just get something like [(0, 0.0), (1, 0.0)]. The two columns of dataset are called 0 and 1. So my first question is: why is this working, and what is happening here?

However, I think this might still be wrong on some level, because it might only work due to some "clever implicit transformation" on the part of pandas. Also, for the line evid[isnan(loc)] = 0 I now get the error

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

How should I rewrite this code instead?

  • Your code looks mystical, e.g. where is isnan from, and it is not very pythonic. You'd better give some input and tell people what your expected output is. Commented May 25, 2017 at 18:55
  • @zyxue: It's from numpy. Commented May 25, 2017 at 19:15
  • Give minimal executable code so people can copy/paste to see the exception you describe, otherwise people have to guess what dataset is, and what DataObject is. Commented May 26, 2017 at 7:54

1 Answer

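Currently the zip works on columns instead of rows, because iterating over a DataFrame yields its column labels. That is why list(zip(dataset, trueClasses)) pairs the labels 0 and 1 with the first two classes. Use one of the methods from "Pandas convert dataframe to array of tuples" to make the zip work on rows instead of columns. For example, substitute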

zip(dataset, trueClasses)

with

zip(dataset.values, trueClasses)
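
To make the difference concrete, here is a minimal sketch. The two-row DataFrame and the class list are made-up stand-ins for the question's dataset and trueClasses, not the asker's real data.

```python
import pandas as pd

# Hypothetical stand-ins for the question's `dataset` and `trueClasses`.
dataset = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]])
trueClasses = [0, 1]

# Iterating a DataFrame yields its column labels (here 0 and 1),
# so zip pairs each label with a class instead of pairing rows.
by_columns = list(zip(dataset, trueClasses))

# dataset.values is a 2-D NumPy array; iterating it yields one array per row.
by_rows = [(row.tolist(), cls) for row, cls in zip(dataset.values, trueClasses)]

print(by_columns)  # [(0, 0), (1, 1)]
print(by_rows)     # [([1.0, 2.0], 0), ([3.0, 4.0], 1)]
```

Note that zip stops at the shorter input, so zipping over the columns silently drops every class beyond the number of columns.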

Considering this post, if you already have l = list(data_train.values) for some reason, then zip(l, eClass) is faster than zip(dataset.values, trueClasses). However, if you don't, the conversion takes too long to be worth it, at least in my tests.
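
For the isnan TypeError, one likely cause (an assumption, since the question does not show the data) is that the array passed to isnan does not have a float dtype, for example object dtype. The sketch below casts each row to float before testing for NaN. DataObject is replaced by a hypothetical namedtuple because its definition is not shown in the question, and the function and variable names are illustrative.

```python
from collections import namedtuple

import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's DataObject class.
DataObject = namedtuple("DataObject", ["location", "trueClass", "evidence"])


def build_real_data_object(row, true_class):
    # Copy the row as float so np.isnan works even for object-dtype input.
    loc = np.array(row, dtype=float)
    missing = np.isnan(loc)
    # Evidence is 1 where a value was observed and 0 where it was missing.
    evid = np.where(missing, 0.0, 1.0)
    loc[missing] = 0.0
    return DataObject(location=loc, trueClass=true_class, evidence=evid)


# Made-up data with one missing value.
dataset = pd.DataFrame([[1.0, np.nan], [3.0, 4.0]])
true_classes = [0, 0]

real_objects = [
    build_real_data_object(row, cls)
    for row, cls in zip(dataset.values, true_classes)
]
```

Passing the row and the class as two arguments avoids the x[0] / x[1] indexing, and a list comprehension replaces the list(map(lambda ...)) wrapper.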
