I have a pandas dataframe df like:

A,B,C
1,1,1
0.8,0.6,0.9
0.7,0.5,0.8
0.2,0.4,0.1
0.1,0,0

where each of the three columns holds sorted values in the range [0, 1]. I'm trying to plot a linear regression over the three series. So far I have been able to use scipy.stats as follows:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

xi = np.arange(len(df))

slope, intercept, r_value, p_value, std_err = stats.linregress(xi,df['A'])
line1 = intercept + slope*xi
slope, intercept, r_value, p_value, std_err = stats.linregress(xi,df['B'])
line2 = intercept + slope*xi
slope, intercept, r_value, p_value, std_err = stats.linregress(xi,df['C'])
line3 = intercept + slope*xi

plt.plot(line1,'r-')
plt.plot(line2,'b-')
plt.plot(line3,'g-')

plt.plot(xi,df['A'],'ro')
plt.plot(xi,df['B'],'bo')
plt.plot(xi,df['C'],'go')

obtaining the following plot:

[Image: the three series plotted as points, each with its fitted regression line]

Is it possible to obtain a single linear regression that summarizes the three individual linear regressions within scipy.stats?

    If you want a regression that summarizes the three regressions, why not combine all the data and do linear regression on that data? Commented Jan 19, 2016 at 17:36
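One way to combine the data as the comment suggests is to reshape the frame into long format and fit once. This is only a sketch: it rebuilds the sample data from the question, and the column names "index", "series" and "y" are illustrative choices, not anything from the original post.

```python
import pandas as pd
from scipy import stats

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [1, 0.8, 0.7, 0.2, 0.1],
    "B": [1, 0.6, 0.5, 0.4, 0],
    "C": [1, 0.9, 0.8, 0.1, 0],
})

# Long format: one row per (row position, column name, value) observation.
# reset_index() exposes the default integer index as a column named "index".
long_df = df.reset_index().melt(id_vars="index", var_name="series", value_name="y")

# A single regression over all 15 observations.
fit = stats.linregress(long_df["index"], long_df["y"])
print(fit.slope, fit.intercept)
```

The long format keeps track of which series each point came from, which is handy if you later want to colour the points by series or fit per group.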

1 Answer

Perhaps something like this:

# pd.np was removed in pandas 2.0; use numpy directly (import numpy as np)
x = np.tile(xi, 3)
y = np.r_[df['A'], df['B'], df['C']]

slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
line4 = intercept + slope * xi

plt.plot(line4,'k-')
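Because all three series share the same x values, the pooled least-squares line should have a slope and intercept equal to the averages of the three per-column slopes and intercepts. The sketch below checks this on the sample data from the question; the variable names are illustrative, and plotting is left out to keep it short. Note that this only holds for the slope and intercept: r_value, p_value and std_err of the pooled fit are not averages of the individual ones.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [1, 0.8, 0.7, 0.2, 0.1],
    "B": [1, 0.6, 0.5, 0.4, 0],
    "C": [1, 0.9, 0.8, 0.1, 0],
})
xi = np.arange(len(df))

# One regression per column, as in the question.
fits = [stats.linregress(xi, df[col]) for col in df.columns]

# One pooled regression: repeat x once per column and stack the columns.
x = np.tile(xi, df.shape[1])
y = df.to_numpy().ravel(order="F")  # column-major: all of A, then B, then C
pooled = stats.linregress(x, y)

# Compare the pooled coefficients with the averages of the per-column ones.
mean_slope = np.mean([f.slope for f in fits])
mean_intercept = np.mean([f.intercept for f in fits])
print(np.isclose(pooled.slope, mean_slope))
print(np.isclose(pooled.intercept, mean_intercept))
```

If the series had different x values or different lengths, the two approaches would generally diverge, and pooling the raw points would be the safer choice.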