EDIT: I would like help closing this question. It was originally posted on Cross Validated and should have stayed there, because the answer is more a statistics matter than a programming one, but it was closed there and I was asked to repost it here. I did that, but since the answer is really a statistics one, I'll post it here. If someone would be willing to close the question, I would appreciate it.
I have been trying to do some clustered bootstrapping with lm and then use the result in the sensemakr package. If you're not familiar with it, it's an implementation of some ideas in Cinelli & Hazlett (2020, Journal of the Royal Statistical Society Series B: Statistical Methodology).
The way sensemakr() works is that you run a regression model that produces either an lm or feols-class object, which you can then use in sensemakr().
library(palmerpenguins)
library(sensemakr)
penguin_dat<-penguins
model_lm<-lm(flipper_length_mm ~ sex + body_mass_g, data=penguin_dat)
summary(model_lm)
sensitivity1<-sensemakr(model=model_lm, treatment="body_mass_g", benchmark_covariates="sexmale")
summary(sensitivity1)
I need to find a way to pass models with more complicated error structures to sensemakr, specifically a model with clustered bootstrap standard errors. It currently takes lm models, but I read that you can try to pass feols models from fixest to it if you're willing to use the development version (see: Use sensemakr with fixest feols model (R)). I haven't tried that myself yet, however.
Here are the options I have considered:
1. Pass my previous lm-class object (model_lm in the previous example) to lm.boot() in simpleboot, then use the resulting object in sensemakr. The problem: I can't find any cluster-bootstrap functionality in simpleboot. I am hoping that I have just overlooked it, as this would likely be the simplest solution.
2. Use the wild cluster bootstrap from fwildclusterboot. I'm not sure what object it produces, and it was removed from CRAN due to dependency issues, so I can't install it (or look at the reference manual either).
3. Use a cluster bootstrap package like clusbootglm or lmeresampler and then try to coerce the result to an lm-type object. This is somewhat beyond my R capabilities, but if #1 doesn't work I think it may be the only option.
As an example of what #3 would look like, here is a version using coeftest() from lmtest and vcovBS() from sandwich. I can't figure out how one would coerce this back into an lm object, or even whether that would be a good idea:
library(palmerpenguins)
library(sandwich)
library(sensemakr)
library(lmtest)
penguin_dat<-penguins
model_lm<-lm(flipper_length_mm ~ sex + body_mass_g, data=penguin_dat)
clustered_bootstrap_model_lm<-coeftest(model_lm, vcov = vcovBS, cluster = penguin_dat$species, R = 1000)
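To make the coercion problem concrete, here is a minimal sketch of what that object actually contains. It only assumes the sandwich and lmtest calls already used above; the names vc_boot, ct, est, se_boot and se_ols, the seed, and the formula form of the cluster argument are my own choices for illustration. It shows that the result is a coefficient table, not a fitted model, and that the clustered-bootstrap standard error can be pulled out of it by name. Whether that standard error can validly be fed into a sensitivity analysis is a separate statistical question that this sketch does not answer.

```r
library(palmerpenguins)
library(sandwich)
library(lmtest)

penguin_dat <- penguins
model_lm <- lm(flipper_length_mm ~ sex + body_mass_g, data = penguin_dat)

# Bootstrap draws are random, so fix a seed (arbitrary value) for reproducibility.
set.seed(42)

# Cluster bootstrap covariance matrix, resampling whole species.
# Only three clusters here, so this is purely an illustration.
vc_boot <- vcovBS(model_lm, cluster = ~ species, R = 1000)

# Coefficient table that uses the bootstrap covariance matrix.
ct <- coeftest(model_lm, vcov = vc_boot)

class(ct)            # "coeftest"
inherits(ct, "lm")   # FALSE: this is a table of coefficients, not a model fit

# Point estimate (unchanged from lm) and the clustered-bootstrap standard error.
est     <- ct["body_mass_g", "Estimate"]
se_boot <- ct["body_mass_g", "Std. Error"]

# Conventional OLS standard error for comparison.
se_ols  <- coef(summary(model_lm))["body_mass_g", "Std. Error"]

c(estimate = est, se_bootstrap = se_boot, se_ols = se_ols)
```

The point estimate is identical to the one from lm; only the standard error (and therefore the t-statistic and p-value) changes, which is why there is no obvious way to turn this table back into an lm object.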
The irony is that the authors of sensemakr have also implemented it in Stata, where it's very simple to run a regression with clustered bootstrap standard errors and then use the stored estimates in another command.
sysuse auto2.dta
regress price mpg weight rep78, vce(bootstrap, cluster(foreign) reps(100))
est sto m1
But the Stata implementation doesn't let you pass a stored model through. Instead it makes you run the regression within sensemakr itself, and when you do that, it disallows more complicated error structures.
Does anyone have any ideas on how to do this? I greatly appreciate any advice you can give.
Citations: Cinelli, C., & Hazlett, C. (2020). Making sense of sensitivity: Extending omitted variable bias. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82(1), 39-67. https://doi.org/10.1111/rssb.12348
Cinelli, C., Ferwerda, J., & Hazlett, C. (2020). sensemakr: Sensitivity analysis tools for OLS in R and Stata. Available at SSRN: https://ssrn.com/abstract=3588978 or http://dx.doi.org/10.2139/ssrn.3588978