Since parglm() returns a standard glm
object, it works directly with the sandwich
package for heteroskedasticity-consistent (HC) and cluster-robust
standard errors. The lmtest
package provides coeftest() for displaying results with
alternative covariance matrices.
We simulate a Poisson dataset with 20 clusters of 10 observations each, where a cluster-level random effect induces within-cluster correlation.
The default model-based standard errors assume the Poisson variance equals the mean. They will be too small here because the cluster random effects induce overdispersion.
vcovHC() computes sandwich standard errors that are
robust to misspecification of the variance function. HC3 (the default)
is recommended for small to moderate samples.
coeftest(fit, vcov = vcovHC)
#>
#> z test of coefficients:
#>
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.680296 0.069394 9.8034 < 2.2e-16 ***
#> x1 0.301561 0.078775 3.8281 0.0001291 ***
#> x2 -0.233887 0.058692 -3.9850 6.749e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1vcovCL() accounts for within-cluster correlation, which
is the appropriate correction here.
coeftest(fit, vcov = vcovCL, cluster = ~cluster)
#>
#> z test of coefficients:
#>
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.680296 0.130871 5.1982 2.012e-07 ***
#> x1 0.301561 0.045306 6.6561 2.813e-11 ***
#> x2 -0.233887 0.059090 -3.9581 7.554e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1As expected, the cluster-robust standard errors are larger than the model-based ones, reflecting the extra variability due to the cluster random effects.
The covariance matrices themselves are also available:
vcovHC(fit, type = "HC3")
#> (Intercept) x1 x2
#> (Intercept) 0.0048154924 -1.496148e-03 1.564757e-04
#> x1 -0.0014961477 6.205443e-03 9.089414e-05
#> x2 0.0001564757 9.089414e-05 3.444771e-03
vcovCL(fit, cluster = ~cluster)
#> (Intercept) x1 x2
#> (Intercept) 0.0171271937 0.0006650872 -0.0006143826
#> x1 0.0006650872 0.0020526462 -0.0002594062
#> x2 -0.0006143826 -0.0002594062 0.0034916456model = TRUE (the default in parglm) must
be set so that the model frame is stored, allowing sandwich
to reconstruct the design matrix internally.
The gtsummary
package produces publication-ready regression tables from model objects.
parglm() models are supported directly because they inherit
from glm.
We fit a logistic regression model with parglm() and
pass it to tbl_regression(). Setting
exponentiate = TRUE displays odds ratios with their
confidence intervals.
set.seed(2)
n2 <- 300
x1_b <- rnorm(n2)
x2_b <- rnorm(n2)
y_bin <- rbinom(n2, 1, plogis(0.4 + 0.6 * x1_b - 0.4 * x2_b))
dat_b <- data.frame(y = y_bin, x1 = x1_b, x2 = x2_b)
fit_logistic <- parglm(y ~ x1 + x2, data = dat_b, family = binomial(),
control = parglm.control(nthreads = 1L))
suppressWarnings(
tbl_regression(fit_logistic, exponentiate = TRUE)
)| Characteristic | OR | 95% CI | p-value |
|---|---|---|---|
| x1 | 1.50 | 1.20, 1.90 | <0.001 |
| x2 | 0.73 | 0.57, 0.93 | 0.012 |
| Abbreviations: CI = Confidence Interval, OR = Odds Ratio | |||
tidy_parglm_robust() is a drop-in tidy_fun
for tbl_regression() that replaces the default model-based
standard errors with sandwich estimates. Here we use cluster-robust
standard errors for the Poisson model fitted earlier.
| Characteristic | log(IRR) | 95% CI | p-value |
|---|---|---|---|
| x1 | 0.30 | 0.15, 0.46 | <0.001 |
| x2 | -0.23 | -0.35, -0.12 | <0.001 |
| Abbreviations: CI = Confidence Interval, IRR = Incidence Rate Ratio | |||