I've looked through the documentation and still can't figure this out. I want to run a WLS regression with multiple regressors.

statsmodels.api is imported as sm

Example with a single variable:

X = Height
Y = Weight

res = sm.OLS(Y, X).fit()
res.summary()

Say I also have:

X2 = Age

How do I add this to my regression?

    Note, statsmodels does not add an intercept except when using formulas. Commented Aug 24, 2020 at 14:57

2 Answers

You can put them into a pandas DataFrame and select the columns you need (the output also looks nicer this way):

import statsmodels.api as sm
import pandas as pd
import numpy as np

Height = np.random.uniform(0,1,100)
Weight = np.random.uniform(0,1,100)
Age = np.random.uniform(0,30,100)

df = pd.DataFrame({'Height':Height,'Weight':Weight,'Age':Age})

res = sm.OLS(df['Height'],df[['Weight','Age']]).fit()

In [10]: res.summary()
Out[10]: 
<class 'statsmodels.iolib.summary.Summary'>
"""
                                 OLS Regression Results                                
=======================================================================================
Dep. Variable:                 Height   R-squared (uncentered):                   0.700
Model:                            OLS   Adj. R-squared (uncentered):              0.694
Method:                 Least Squares   F-statistic:                              114.3
Date:                Mon, 24 Aug 2020   Prob (F-statistic):                    2.43e-26
Time:                        15:54:30   Log-Likelihood:                         -28.374
No. Observations:                 100   AIC:                                      60.75
Df Residuals:                      98   BIC:                                      65.96
Df Model:                           2                                                  
Covariance Type:            nonrobust                                                  
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Weight         0.1787      0.090      1.988      0.050       0.000       0.357
Age            0.0229      0.003      8.235      0.000       0.017       0.028
==============================================================================
Omnibus:                        2.938   Durbin-Watson:                   1.813
Prob(Omnibus):                  0.230   Jarque-Bera (JB):                2.223
Skew:                          -0.211   Prob(JB):                        0.329
Kurtosis:                       2.404   Cond. No.                         49.7
==============================================================================

I used a second-order polynomial to model how height and age affect a soldier's weight. You can pick up ansur_2_m.csv on my GitHub.

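import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

# Load only the needed columns; np.int64 replaces the abstract np.integer dtype
df = pd.read_csv('ANSUR_2_M.csv', encoding="ISO-8859-1",
                 usecols=['Weightlbs', 'Heightin', 'Age'],
                 dtype={'Weightlbs': np.int64, 'Heightin': np.int64, 'Age': np.int64})
df = df.dropna().reset_index(drop=True)  # reset_index returns a copy, so assign it
df['Heightin2'] = df['Heightin']**2
df['Age2'] = df['Age']**2

# Second-order polynomial in height and age; the formula API adds an intercept
formula = "Weightlbs ~ Heightin+Heightin2+Age+Age2"
model_ols = smf.ols(formula, data=df).fit()
minHeight = df['Heightin'].min()
maxHeight = df['Heightin'].max()
medianAge = df['Age'].median()
print(minHeight, maxHeight, medianAge)

# Prediction grid for a 28-year-old
df2 = pd.DataFrame()
df2['Heightin'] = np.linspace(60, 100, 50)
df2['Heightin2'] = df2['Heightin']**2
df2['Age'] = 28
df2['Age2'] = df2['Age']**2  # square the grid's own Age, not the training data's

# Prediction grid for a 45-year-old
df3 = pd.DataFrame()
df3['Heightin'] = np.linspace(60, 100, 50)
df3['Heightin2'] = df3['Heightin']**2
df3['Age'] = 45
df3['Age2'] = df3['Age']**2

prediction28 = model_ols.predict(df2)
prediction45 = model_ols.predict(df3)

plt.clf()
plt.plot(df2['Heightin'], prediction28, label="Age 28")
plt.plot(df3['Heightin'], prediction45, label="Age 45")
plt.ylabel("Weight lbs")  # call the function; assigning to it would overwrite it
plt.xlabel("Height in")
plt.legend()
plt.show()

print('A 45-year-old soldier is likely to weigh more than a 28-year-old soldier')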
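
As a possible alternative to building the squared columns by hand, the formula interface accepts I(...) terms, so the prediction frame only needs the raw columns. This is a sketch on synthetic data, not the ANSUR file: the value ranges, coefficients, and seed are assumptions for illustration, and only the column names are borrowed from the code above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the soldier data (illustration only).
rng = np.random.default_rng(2)
n = 300
df = pd.DataFrame({
    "Heightin": rng.uniform(60, 78, n),
    "Age": rng.uniform(18, 55, n),
})
df["Weightlbs"] = (5.0 * df["Heightin"] + 1.0 * df["Age"] - 170
                   + rng.normal(0, 15, n))

# I(...) evaluates the expression as plain Python, so no extra columns are needed.
model = smf.ols(
    "Weightlbs ~ Heightin + I(Heightin**2) + Age + I(Age**2)", data=df
).fit()

# The squared terms are rebuilt automatically from the raw columns at predict time.
new = pd.DataFrame({"Heightin": np.linspace(60, 78, 50), "Age": 28})
pred = model.predict(new)
print(pred.head())
```

This avoids the kind of mistake where a squared column in the prediction frame is accidentally computed from the training data.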

1 Comment

See (github.com/dnishimoto/python-deep-learning/blob/master/…). I used visualization and logistic regression to analyze Army height, weight, and BMI data.
