Why does statsmodels include interaction with reference?

I'm trying to fit a Tweedie regression in statsmodels. The model has three categorical predictors with four levels each. To illustrate, here is an example:

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

data = pd.DataFrame({
    'V1': pd.Categorical(['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D']),
    'V2': pd.Categorical(['W', 'X', 'Y', 'Z', 'W', 'X', 'Y', 'Z']),
    'V3': pd.Categorical(['K', 'L', 'M', 'N', 'K', 'L', 'M', 'N']),
    'y': [5.1, 7.3, 6.9, 8.0, 5.4, 7.1, 6.8, 8.2]
})

# Outer double quotes, so the single-quoted reference levels inside the formula parse correctly
formula = "y ~ C(V1, Treatment('A')) + C(V2, Treatment('W')):C(V3, Treatment('K'))"
model = smf.glm(formula, data=data, family=sm.families.Tweedie())
result = model.fit()

print(result.summary())

I used to run this kind of regression in SAS, and SAS does not return the interaction with the reference level. For example, in this case, SAS does not include any interaction with V3(K). Here is the analogous SAS code:

data example;
    input V1 $ V2 $ V3 $ y;
    datalines;
A W K 5.1
B X L 7.3
C Y M 6.9
D Z N 8.0
A W K 5.4
B X L 7.1
C Y M 6.8
D Z N 8.2
;
run;

proc hpgenselect data=example;
    class V1 (ref='A') V2 (ref='W') V3 (ref='K'); 
    model y = V1 V2*V3 / dist=tweedie link=log; 
run;

However, in statsmodels this interaction is included. Does anyone know why this happens, and how to get output similar to SAS (without the interaction with the reference level)?
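One way to see exactly which columns the formula produces is to build the design matrix directly. The sketch below assumes statsmodels is using patsy as its formula backend (the default at the time of the question) and uses a hypothetical full 4 x 4 grid of V2 and V3 levels, not the data above. It compares the interaction-only term (`:`) with main effects plus interaction (`*`).

```python
from itertools import product

from patsy import dmatrix

# Hypothetical data: every combination of the V2 and V3 levels appears once.
pairs = list(product(["W", "X", "Y", "Z"], ["K", "L", "M", "N"]))
grid = {"V2": [p[0] for p in pairs], "V3": [p[1] for p in pairs]}

v2 = "C(V2, Treatment('W'))"
v3 = "C(V3, Treatment('K'))"

# Interaction only, as in the question's formula.
cols_only = dmatrix(f"{v2}:{v3}", grid).design_info.column_names
# Main effects plus interaction.
cols_main = dmatrix(f"{v2} * {v3}", grid).design_info.column_names


def mentions_reference(names):
    # Full-rank coded levels are labelled "[K]" or "[W]";
    # treatment-coded levels are labelled "[T.<level>]".
    return [n for n in names if "[K]" in n or "[W]" in n]


print(len(cols_only), "columns with ':'")
print(len(cols_main), "columns with '*'")
print("reference-level columns with ':':", mentions_reference(cols_only))
print("reference-level columns with '*':", mentions_reference(cols_main))
```

As far as patsy's documented coding rules go, when a term like `a:b` appears without its main effects, one of the factors is given full-rank (all-levels) coding so that the term still spans the same space as `a + b + a:b`; that is where the columns involving a reference level come from. With `*`, both factors use treatment coding and no column carries a reference level. Whether that reproduces the SAS parameterization exactly is something to confirm against the SAS output.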

edited Aug 16, 2024 at 21:23

asked Aug 16, 2024 at 20:11

Felippe Trigueiro


What SAS code did you try? Your example categorical variables are all perfectly correlated, so no model will really work. Do you have some actual example data?

– Tom, Aug 16, 2024 at 21:11

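The point about perfectly correlated predictors can be checked numerically. This is a minimal sketch, assuming patsy is available, that rebuilds the design matrix for the question's formula and data and compares its rank with its number of columns.

```python
import numpy as np
from patsy import dmatrix

# Same predictor values as in the question (y is not needed for the check).
data = {
    "V1": ["A", "B", "C", "D"] * 2,
    "V2": ["W", "X", "Y", "Z"] * 2,
    "V3": ["K", "L", "M", "N"] * 2,
}

rhs = "C(V1, Treatment('A')) + C(V2, Treatment('W')):C(V3, Treatment('K'))"
X = np.asarray(dmatrix(rhs, data))

n_columns = X.shape[1]
rank = np.linalg.matrix_rank(X)

print("columns:", n_columns)
print("rank:", rank)
```

When the rank is smaller than the number of columns, the individual coefficients are not identified, so any comparison of the statsmodels and SAS parameter tables is better done on data where the factors vary independently.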
I edited the question to include the SAS code.

– Felippe Trigueiro, Aug 16, 2024 at 21:23
