
I am currently taking a Machine Learning course on Coursera ( https://www.coursera.org/learn/ml-foundations/lecture/6wD6H/visualizing-predictions-of-simple-model-with-matplotlib ). The course uses the GraphLab Create framework for its lectures and assignments. I don't want to use GraphLab; instead, I am using pandas and numpy for the assignments.

In the course, the instructor has created a regression model, and then he shows the prediction using matplotlib:

Build Regression Model

sqft_model = graphlab.linear_regression.create(train_data, target='price', features=['sqft_living'],validation_set=None)

and then the prediction code is as follows:

plt.plot(test_data['sqft_living'],test_data['price'],'.',
        test_data['sqft_living'],sqft_model.predict(test_data),'-')

The result is:

prediction image

In the above image, the blue dots are the test data and the green line is the prediction from the simple regression. I am a complete beginner to programming and Python, and I wanted to use free tools such as pandas and scikit-learn. I used the following to do the same in IPython:

Build Regression Model

from pandas.stats.api import ols
sqft_model = ols(y=train_data['price'], x=train_data['sqft_living'])

But I get the following error when I run the prediction code:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

Thus, I am not able to produce the desired result as done by the instructor (i.e. the image shown above). Can anyone help me out?

Please find below the link to download the data:

https://onedrive.live.com/redir?resid=EDAAD532F68FDF49!1091&authkey=!AKs341lbRnuCt9w&ithint=folder%2cipynb

  • Hi @Drjnker... I get the following error while inputting the prediction code: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). Commented Dec 14, 2015 at 12:24
  • Could the problem come from your train_data? Commented Dec 14, 2015 at 12:35
  • Thanks @Drjnker. I will check. Meanwhile, could you please let me know the meaning of: The truth value of a Series is ambiguous...? Commented Dec 14, 2015 at 12:49
  • Somewhere a Series is being used as a condition, but Python doesn't know whether you mean "the Series isn't empty", "at least one element of the Series is true", or "all elements of the Series are true". That's why it suggests a.empty, a.any(), a.all()... Although it doesn't make much sense in your code (from where I sit). Commented Dec 14, 2015 at 13:19
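
To make the comment above concrete, here is a minimal sketch of how the error arises and how a.any() and a.all() resolve it. The Series values and the helper name truth_value_error are made up for illustration and do not come from the course data.

```python
import pandas as pd

# Made-up prices, only for illustration.
prices = pd.Series([221900.0, 538000.0, 180000.0])


def truth_value_error(series):
    """Return the message raised when a Series is used as a condition, or None."""
    try:
        if series:  # a multi-element Series has no single truth value
            pass
    except ValueError as err:
        return str(err)
    return None


message = truth_value_error(prices)

# Stating the intent explicitly avoids the ambiguity.
has_expensive = (prices > 500000).any()  # is at least one price above 500000?
all_positive = (prices > 0).all()        # are all prices positive?

print(message)
print(has_expensive, all_positive)
```

Any code path that evaluates a whole Series in an `if`, `and`, `or`, or `not` context triggers the same error, which is probably what is happening inside the call that fails here.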

1 Answer


I suspect the issue here is that the pandas OLS model can't understand GraphLab's SArray. Try converting the SFrames train_data and test_data into pandas DataFrames first. The following works for me:

from pandas.stats.api import ols

df_train = train_data.to_dataframe()  # convert the SFrame to a pandas DataFrame
model = ols(y=df_train['price'], x=df_train['sqft_living'])
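
If the goal is only to reproduce the instructor's plot without GraphLab, a one-feature least-squares line can also be fitted with numpy alone. The following is a sketch under stated assumptions, not the course's method: the data is synthetic, and the helper names fit_line, predict, and plot_predictions are hypothetical.

```python
import numpy as np


def fit_line(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)  # highest-degree coefficient first
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at the given x values."""
    return slope * np.asarray(x, dtype=float) + intercept


def plot_predictions(x, y, slope, intercept):
    """Plot the data as dots and the fitted line, as in the lecture."""
    import matplotlib.pyplot as plt  # imported here so fitting works without it
    plt.plot(x, y, '.', x, predict(x, slope, intercept), '-')
    plt.show()


# Synthetic stand-in for the housing data (assumption, not the real dataset).
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 4000, size=200)
price = 280.0 * sqft - 40000.0 + rng.normal(0, 50000, size=200)

# First 150 rows play the role of train_data, the rest of test_data.
slope, intercept = fit_line(sqft[:150], price[:150])
predicted = predict(sqft[150:], slope, intercept)
```

With real pandas DataFrames, the same helpers would be called as fit_line(train_data['sqft_living'], train_data['price']) followed by plot_predictions(test_data['sqft_living'], test_data['price'], slope, intercept), assuming both columns are numeric.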

3 Comments

Hi, thanks for the help. When I run the above command, I get: AttributeError: 'DataFrame' object has no attribute 'to_dataframe'. Am I missing something? Thanks!
This might help, but I don't know what type your data is.
@Drjnker, I have provided a OneDrive link (edited above) to download the data and code files I am using. Could you please check?
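
The AttributeError quoted in the first comment suggests that train_data is already a pandas DataFrame, in which case no conversion is needed. A small guard like the one below handles both cases. It is only a sketch: the helper name as_pandas is hypothetical, and the sample frame is made up.

```python
import pandas as pd


def as_pandas(data):
    """Return data as a pandas DataFrame, converting only when necessary."""
    if isinstance(data, pd.DataFrame):
        return data  # already pandas, nothing to convert
    if hasattr(data, "to_dataframe"):
        return data.to_dataframe()  # e.g. a GraphLab SFrame
    raise TypeError("unsupported data type: %s" % type(data).__name__)


# Made-up frame standing in for train_data.
sample = pd.DataFrame({"sqft_living": [1180, 2570], "price": [221900, 538000]})
df_train = as_pandas(sample)
print(type(df_train).__name__)
```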
