Below is the error I get when I try to run my Python code. Is it an issue with my installation? I have 64-bit Python 3.6.0 installed, and I'm sure I installed the 64-bit version of scikit-learn. I also have numpy and scipy installed as prerequisites.

Traceback (most recent call last):
  File "C:/Users/kevinshen/Desktop/Kaggle/GettingStarted/makeSubmission.py", line 1, in <module>
    from sklearn.ensemble import RandomForestClassifier
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\__init__.py", line 57, in <module>
    from .base import clone
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\base.py", line 12, in <module>
    from .utils.fixes import signature
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\__init__.py", line 11, in <module>
    from .validation import (as_float_array,
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\validation.py", line 18, in <module>
    from ..utils.fixes import signature
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\fixes.py", line 406, in <module>
    if np_version < (1, 12, 0):
TypeError: '<' not supported between instances of 'str' and 'int'

Python Code:

from sklearn.ensemble import RandomForestClassifier
from numpy import genfromtxt, savetxt

def main():
    #create the training & test sets, skipping the header row with [1:]
    dataset = genfromtxt(open('Data/train.csv','r'), delimiter=',', dtype='f8')[1:]
    target = [x[0] for x in dataset]
    train = [x[1:] for x in dataset]
    test = genfromtxt(open('Data/test.csv','r'), delimiter=',', dtype='f8')[1:]

    #create and train the random forest
    #multi-core CPUs can use: rf = RandomForestClassifier(n_estimators=100, n_jobs=2)
    rf = RandomForestClassifier(n_estimators=100)
    rf.fit(train, target)
    predicted_probs = [x[1] for x in rf.predict_proba(test)]

    savetxt('Data/submission.csv', predicted_probs, delimiter=',', fmt='%f')

if __name__=="__main__":
    main()
  • Since the first line of your code produces the error, you should not post any more of it. See stackoverflow.com/help/mcve. Commented Jan 5, 2017 at 5:55

1 Answer

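You probably have a numpy 1.12 beta (1.12.0b1) installed. That is why it complains about a comparison between str and int: the last component of the version, 0b1, cannot be parsed as an integer, so it stays a string and is compared with the integer 0.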
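
To illustrate the failure, here is a minimal sketch of a naive version parser. The helper name parse_version is hypothetical and the logic is an assumption about how the old fixes.py turned numpy's version string into a tuple, not a copy of the scikit-learn source. Under Python 3, the resulting tuple cannot be ordered against (1, 12, 0).

```python
def parse_version(version_string):
    """Split on dots; keep any piece that is not a plain integer as a string."""
    parts = []
    for piece in version_string.split("."):
        try:
            parts.append(int(piece))
        except ValueError:
            # e.g. "0b1" from a beta release stays a string
            parts.append(piece)
    return tuple(parts)


beta = parse_version("1.12.0b1")    # last element is the string "0b1"
stable = parse_version("1.11.3")    # all elements are integers

try:
    beta < (1, 12, 0)               # reaches "0b1" < 0
    comparison_failed = False
except TypeError:
    comparison_failed = True
```

A stable release string such as "1.11.3" parses to integers only, so the same comparison works there. That is why the error shows up only with a pre-release numpy.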

The latest version of fixes.py corrects this issue: https://github.com/scikit-learn/scikit-learn/commit/1f278e1c231e6b9b3cf813377819e25e87b6c8b6
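
If you want to confirm that a pre-release numpy is the cause before changing anything, a rough check along these lines may help. The helper looks_like_prerelease is hypothetical and only a heuristic: it flags any version string with a dot-separated component that is not purely numeric.

```python
def looks_like_prerelease(version_string):
    """Return True if any dot-separated component is not purely digits."""
    return any(not piece.isdigit() for piece in version_string.split("."))


beta_flag = looks_like_prerelease("1.12.0b1")   # "0b1" is not all digits
stable_flag = looks_like_prerelease("1.11.3")   # every component is numeric
```

In your own environment you would pass numpy.__version__ to it after importing numpy. If it returns True, installing a final numpy release or a scikit-learn build that contains the fix above should avoid the error.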


2 Comments

Works! Thanks for the thing! I'm totally new to python/kaggle so I had no clue.
It works! I installed sklearn from lfd.uci.edu/~gohlke/pythonlibs/#scikit-learn (py3.6, win32). Hope the whl file gets fixed soon.
