Scikit-learn ValueError when implementing logistic regression in Python

Question

I am new to machine learning and am trying to set up a logistic regression for prediction purposes in Python using scikit-learn. I already set one up with a small, mock dataset, but when expanding this code to work for larger datasets, I run into an issue regarding a ValueError. Here is my code:

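import numpy as np
from sklearn.linear_model import LogisticRegression

# file and file2 refer to the feature and label files (defined elsewhere)
inputData = np.genfromtxt(file, skip_header=1, unpack=True)
print("X array shape: ", inputData.shape)
inputAnswers = np.genfromtxt(file2, skip_header=1, unpack=True)
print("Y array shape: ", inputAnswers.shape)

logreg = LogisticRegression(penalty='l2', C=2.0)
logreg.fit(inputData, inputAnswers)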

The inputData 2D array (matrix) has 149 rows and 231 columns. I'm trying to fit it to the inputAnswers array, which has 149 rows, correctly corresponding to the 149 rows of the inputData array. However, here is the output I receive:

X array shape:  (231, 149)
Y array shape:  (149,)
Traceback (most recent call last):
File "LogRegTry_rawData.py", line 26, in <module>
logreg.fit(inputData, inputAnswers)
File "[path]", line 676, in fit
(X.shape[0], y.shape[0]))
ValueError: X and y have incompatible shapes.
X has 231 samples, but y has 149.

I understand what the error means, but I'm not sure why it appears in this situation or how to fix it. Any help is greatly appreciated. Thank you!

ojy · Accepted Answer · 2014-07-26 00:32:57Z


In shape, the first element is the number of rows and the second is the number of columns. scikit-learn treats each row of X as one sample, so your X has 231 samples but only 149 labels. Try transposing your data: inputData.T
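The transposition suggested here can be sketched as follows. The arrays are synthetic stand-ins for the asker's files (random features and alternating labels, both assumptions made purely for illustration); only the shapes match the question.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data with the same shapes as in the question.
rng = np.random.RandomState(0)
inputData = rng.rand(231, 149)        # features x samples, as loaded in the question
inputAnswers = np.arange(149) % 2     # one binary label per sample

# scikit-learn expects X with shape (n_samples, n_features),
# so the first dimension of X must match the length of y.
X = inputData.T
print("X shape after transpose:", X.shape)

logreg = LogisticRegression(penalty='l2', C=2.0)
logreg.fit(X, inputAnswers)
print("Coefficient shape:", logreg.coef_.shape)
```

After the transpose, X has 149 rows, one per label, and fit no longer raises the shape error. On random data the solver may emit a convergence warning, which is unrelated to the original problem.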


2 Comments

user3847447 Over a year ago

Thank you! I used the np.transpose() function, and this worked. I wonder why np.genfromtxt reads it "inverted," however...

ojy Over a year ago

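unpack=True is what transposes the data: it makes genfromtxt return the array transposed so that each column of the file can be unpacked into its own variable. Drop it (the default is unpack=False) and the array keeps one row per line of the file.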
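The effect of unpack=True can be checked on a small in-memory file. This is a minimal sketch: the text below is a made-up example (one header line, three rows, two columns), not the asker's data.

```python
import io
import numpy as np

# Hypothetical file contents: a header line, then 3 rows (samples) of 2 columns (features).
text = "f1 f2\n1 2\n3 4\n5 6\n"

# Default (unpack=False): one array row per line of the file.
as_stored = np.genfromtxt(io.StringIO(text), skip_header=1)

# unpack=True: the result is transposed, one array row per file column.
unpacked = np.genfromtxt(io.StringIO(text), skip_header=1, unpack=True)

print(as_stored.shape)   # (3, 2)
print(unpacked.shape)    # (2, 3)
```

So for the question's data, loading the feature file without unpack=True should give an array whose rows correspond to the lines of the file, assuming the file stores one sample per line.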
