SciKit-Learn Python Package has Error

Question

Below is the traceback I get when I try to run my Python code. Is it an issue with my installation? I have 64-bit Python 3.6.0 installed, and I'm sure I installed the 64-bit version of scikit-learn. I also have numpy and scipy installed as prerequisites.

Traceback (most recent call last):
  File "C:/Users/kevinshen/Desktop/Kaggle/GettingStarted/makeSubmission.py", line 1, in <module>
    from sklearn.ensemble import RandomForestClassifier
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\__init__.py", line 57, in <module>
    from .base import clone
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\base.py", line 12, in <module>
    from .utils.fixes import signature
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\__init__.py", line 11, in <module>
    from .validation import (as_float_array,
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\validation.py", line 18, in <module>
    from ..utils.fixes import signature
  File "C:\Users\kevinshen\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\utils\fixes.py", line 406, in <module>
    if np_version < (1, 12, 0):
TypeError: '<' not supported between instances of 'str' and 'int'

Python Code:

from sklearn.ensemble import RandomForestClassifier
from numpy import genfromtxt, savetxt

def main():
    #create the training & test sets, skipping the header row with [1:]
    dataset = genfromtxt(open('Data/train.csv','r'), delimiter=',', dtype='f8')[1:]
    target = [x[0] for x in dataset]
    train = [x[1:] for x in dataset]
    test = genfromtxt(open('Data/test.csv','r'), delimiter=',', dtype='f8')[1:]

    #create and train the random forest
    #multi-core CPUs can use: rf = RandomForestClassifier(n_estimators=100, n_jobs=2)
    rf = RandomForestClassifier(n_estimators=100)
    rf.fit(train, target)
    predicted_probs = [x[1] for x in rf.predict_proba(test)]

    savetxt('Data/submission.csv', predicted_probs, delimiter=',', fmt='%f')

if __name__=="__main__":
    main()

Since the first line of your code produces the error, you should not post any more of it. See stackoverflow.com/help/mcve.
– Terry Jan Reedy, Commented Jan 5, 2017 at 5:55

gntoni · Accepted Answer · 2017-01-05 05:53:39Z

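You probably have a numpy 1.12 beta installed (1.12.0b1). The last component of that version string, "0b1", is not a plain integer, so it stays a string, and comparing it with the integer 0 in np_version < (1, 12, 0) raises the TypeError about str and int.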

The latest version of fixes.py corrects this issue: https://github.com/scikit-learn/scikit-learn/commit/1f278e1c231e6b9b3cf813377819e25e87b6c8b6
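
To make the failure concrete, here is a minimal sketch of how a version string like "1.12.0b1" can end up as a tuple that mixes ints and strings. The helpers naive_parse and numeric_prefix_parse are hypothetical names written for this illustration; they approximate the kind of parsing involved and are not the actual scikit-learn code or the fix in the linked commit.

```python
import re


def naive_parse(version_string):
    """Hypothetical parser: keep any non-integer piece as a string."""
    parts = []
    for piece in version_string.split("."):
        try:
            parts.append(int(piece))
        except ValueError:
            # "0b1" is not a valid int literal, so it stays a str
            parts.append(piece)
    return tuple(parts)


def numeric_prefix_parse(version_string):
    """Hypothetical alternative: keep only the leading digits of each piece."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first piece with no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


beta = "1.12.0b1"  # version string suggested in the answer above

print(naive_parse(beta))
try:
    naive_parse(beta) < (1, 12, 0)
except TypeError as exc:
    # On Python 3, comparing the str "0b1" with the int 0 is an error
    print("comparison failed:", exc)

# With integer-only components the comparison is well defined
print(numeric_prefix_parse(beta) < (1, 12, 0))
```

To see which numpy version is actually installed, print numpy.__version__ in the same interpreter that raises the error.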


2 Comments

kevin shen Over a year ago

Works! Thanks for the thing! I'm totally new to python/kaggle so I had no clue.

Zing Lee Over a year ago

It works! I installed sklearn from lfd.uci.edu/~gohlke/pythonlibs/#scikit-learn (py3.6, win32). Hope they fix the whl file soon.
