
Is there a single test that is significant only when two coefficients in a linear model both differ from 0, and not significant when only one of them differs from 0? One can of course look at the model summary, but that does not give a single p-value for the null hypothesis that at most one of the coefficients differs from 0. I am aware of tests of the joint null that both coefficients are 0 (e.g., car::linearHypothesis()), but such a test can be significant when only one coefficient differs from 0, which is not what I'm interested in.

An example is below: I am looking for a method that is significant for y1 (where both coefficients differ from 0) but not for y2 (where only one does).

set.seed(1234)

dat = MASS::mvrnorm(n = 500,mu = rep(0,4),
                    Sigma = matrix(nrow = 4,byrow = TRUE,
                                    c(1,  .2,  .2,  .3,
                                     .2,   1,  .2,   0,
                                     .2,  .2,   1,   0,
                                     .3,   0,   0,   1) )) |> as.data.frame()

colnames(dat) = c("x1","x2","y1","y2")

summary(lm(y1 ~ x1 + x2,data = dat))

Call:
lm(formula = y1 ~ x1 + x2, data = dat)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.70450 -0.63760 -0.01951  0.68030  2.58319 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.04704    0.04171   1.128   0.2599    
x1           0.12099    0.04271   2.833   0.0048 ** 
x2           0.25624    0.04254   6.024 3.31e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9324 on 497 degrees of freedom
Multiple R-squared:  0.1052,    Adjusted R-squared:  0.1016 
F-statistic: 29.23 on 2 and 497 DF,  p-value: 9.978e-13


summary(lm(y2 ~ x1 + x2,data = dat))
Call:
lm(formula = y2 ~ x1 + x2, data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.4124 -0.5711 -0.0436  0.5844  3.2467 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.036480   0.041036  -0.889    0.374    
x1           0.242923   0.042027   5.780 1.32e-08 ***
x2           0.002013   0.041852   0.048    0.962    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9174 on 497 degrees of freedom
Multiple R-squared:  0.06828,   Adjusted R-squared:  0.06454 
F-statistic: 18.21 on 2 and 497 DF,  p-value: 2.328e-08
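
For reference, the kind of joint test mentioned above (not what I'm after, since it also rejects when only one coefficient differs from 0) would look something like this:

# Joint F-test of H0: x1 = 0 and x2 = 0 (should reject for both y1 and y2 here)
car::linearHypothesis(lm(y1 ~ x1 + x2, data = dat), c("x1 = 0", "x2 = 0"))
car::linearHypothesis(lm(y2 ~ x1 + x2, data = dat), c("x1 = 0", "x2 = 0"))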
Comments:

  • I don't understand what information you think this would give you. Looking at the results, in the first model both variables are significant, in the second only one. If a test behaved exactly as you'd like, it would add zero information, because everything needed to see this is already there. (commented Aug 10 at 20:04)
  • I would like a single p-value for the test. This would allow, for example, for multiple-test correction for this type of hypothesis across multiple outcomes. (commented Aug 10 at 20:17)
  • I'm doubting myself, but I don't see why the maximum of the two p-values wouldn't work. Sure, it's not necessarily uniform under the null hypothesis, but that's normal because the null is composite (see stats.stackexchange.com/q/58929/341520 or statmodeling.stat.columbia.edu/2023/04/14/…). (commented Aug 10 at 21:50)
  • Investigate intersection-union and union-intersection tests. You're after a null that's a union of one-parameter nulls (you're in the null if any of them holds) and an alternative that's the intersection of their complements (you're only in the alternative if all of the one-parameter nulls are false). An intersection-union test rejects if all of the one-parameter tests reject; do each test at $\alpha$. The overall p-value is the lowest $\alpha$ at which you'd reject all the tests, which is indeed the larger of the component one-parameter p-values. NB: this can be conservative. ctd.. (commented Aug 10 at 23:00)
  • ctd... Casella and Berger has a section on both kinds of test. There's an introductory discussion of the intersection-union test here (uploaded by the author). (commented Aug 10 at 23:03)

1 Answer

Based on the excellent tips in the comments, I will try to answer my own question. The goal is a test that is significant only when two variables are both independently associated with the outcome, i.e., when each adds something unique beyond their shared variance. One way to do this is an intersection-union test (IUT): look at both coefficient p-values in the model, and take the larger of the two as the p-value of the test. I am also going to expand this to include the p-values when the variables are entered separately, because I want to rule out collider bias (i.e., each variable must be associated with the outcome regardless of whether the other is included). That gives 4 p-values, but no multiple-test correction is needed across the 4, because the least significant one is taken as the p-value of the test.
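
As a quick check on the data from the question (a sketch, reusing dat from above), taking the larger of the two coefficient p-values already behaves as desired: small for y1, large for y2.

# IUT p-value of a fitted model: the larger of the coefficient p-values
iut_p = function(fit) max(summary(fit)$coefficients[-1, 4])

iut_p(lm(y1 ~ x1 + x2, data = dat))  # ~0.005: both coefficients differ from 0
iut_p(lm(y2 ~ x1 + x2, data = dat))  # ~0.96:  only x1 differs from 0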

Here I'll simulate the distribution of p-values under various nulls, as well as when the alternative is true. We'll see that the null distribution is non-uniform but conservative (the p-values are stochastically larger than uniform). Because of this, BH-FDR correction can still be used (a small sketch is at the end of this answer), but approaches that assume a uniform null distribution (e.g., q-value FDR estimation) are inappropriate.

# First, a basic simulation: draw x1, x2, y1 from a multivariate normal with
# the given correlations, fit the joint and the two single-predictor models,
# and return the largest of the 4 p-values (the IUT p-value)
sim_IUT = function(size, x1_x2, x1_y, x2_y){

  dat = MASS::mvrnorm(n = size, mu = rep(0, 3),
                      Sigma = matrix(nrow = 3, byrow = TRUE,
                                     c(1,     x1_x2, x1_y,
                                       x1_x2, 1,     x2_y,
                                       x1_y,  x2_y,  1   ) )) |> as.data.frame()

  colnames(dat) = c("x1", "x2", "y1")

  m_all = lm(y1 ~ x1 + x2, data = dat)
  m_x1  = lm(y1 ~ x1     , data = dat)
  m_x2  = lm(y1 ~      x2, data = dat)

  # The 4 p-values: both coefficients from the joint model,
  # plus each coefficient from its single-predictor model
  ps = c(summary(m_all)$coefficients[-1, 4],
         summary(m_x1)$coefficients[-1, 4],
         summary(m_x2)$coefficients[-1, 4])

  return(max(ps))
}

Under a 'full' null, where neither variable is associated with the outcome, the p-value distribution is very conservative, with a false-positive rate of only 0.2% at the 0.05 level.

null_ps = sapply(1:10000,
                 function(X){sim_IUT(size = 500, x1_x2 = 0, x1_y = 0, x2_y = 0)})
sum(null_ps < 0.05)/10000
[1] 0.002
hist(null_ps)

[Figure: p-value distribution under the full null (conservative)]

The test also protects against false positives that could arise when only one variable is associated with the outcome, for example via collider bias. This null is still conservative, though with a somewhat unusual p-value distribution, and the false-positive rate is 2.5%.

null_ps2 = sapply(c(1:10000),function(X){sim_IUT(size = 500,x1_x2 = .3,x1_y = .2,x2_y = 0)})
sum(null_ps2 <0.05)/10000
[1] 0.025
hist(null_ps2)

[Figure: p-value distribution under the collider-bias null]

Even so, the p-value distribution looks as expected when the alternative is true.

true_ps = sapply(c(1:10000),function(X){sim_IUT(size = 500,x1_x2 = .3,x1_y = .1,x2_y = .15)})
sum(true_ps <0.05)/10000
[1] 0.1823
hist(true_ps)

[Figure: p-value distribution when the alternative is true]
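
To illustrate the point about FDR correction (a sketch, not part of the simulations above): because the p-values are valid, if conservative, under their nulls, BH correction can be applied across outcomes in the usual way, e.g. on a mix of null and non-null outcomes drawn from the simulations:

# Sketch: BH-FDR across 100 outcomes, half collider-bias nulls and half true
# effects, reusing the simulated p-values from above
mixed_ps = c(null_ps2[1:50], true_ps[1:50])
p_bh     = p.adjust(mixed_ps, method = "BH")
sum(p_bh[1:50]   < 0.05)  # discoveries among the nulls (expect ~0)
sum(p_bh[51:100] < 0.05)  # discoveries among the true effects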
