Introduction to confoundvis

Subir Hait

2026-03-25

Overview

confoundvis provides a unified, ggplot2-native visualization toolkit for causal sensitivity analysis under unmeasured confounding. It supports multiple frameworks — the Impact Threshold for a Confounding Variable (ITCV; Frank 2000), partial R²-based methods (Cinelli & Hazlett 2020), and the multilevel mITCV extension — and introduces the Sensitivity Love Plot, a novel diagnostic that benchmarks sensitivity thresholds against observed covariate impacts in a format familiar from propensity score analysis.

The confoundsens object

All plotting functions in confoundvis accept a confoundsens object, which stores a sensitivity path \(\theta(\lambda)\) along with optional standard errors, test statistics, and level labels (for multilevel decompositions).

x <- new_confoundsens(
  lambda = seq(0, 0.25, length.out = 20),
  theta  = 0.42 - 1.8 * seq(0, 0.25, length.out = 20),
  se     = rep(0.07, 20)
)
x
#> <confoundsens>
#>   n      : 20 
#>   lambda : [0, 0.25] 
#>   theta  : [-0.03, 0.42] 
#>   level  : <none>
#>   se     : yes 
#>   t      : no

You can also construct a confoundsens object from an existing data frame:

df <- data.frame(
  lambda = seq(0, 0.25, length.out = 20),
  theta  = 0.42 - 1.8 * seq(0, 0.25, length.out = 20),
  se     = rep(0.07, 20),
  level  = rep(c("within", "between"), each = 10)
)
x_ml <- as_confoundsens(df, level = "level")
summary(x_ml)
#> <summary.confoundsens>
#> 
#>    level  n lambda_min lambda_max  theta_min theta_max   theta_0 se_0 t_0
#>   within 10  0.0000000  0.1184211  0.2068421 0.4200000 0.4200000   NA  NA
#>  between 10  0.1315789  0.2500000 -0.0300000 0.1831579 0.1831579   NA  NA

Robustness curve

plot_robustness_curve() visualizes the full sensitivity path \(\theta(\lambda)\) as confounding strength \(\lambda\) grows from zero. This is the functional profile behind the scalar ITCV threshold — it shows not just where the effect crosses zero, but how quickly.

plot_robustness_curve(x, bands = TRUE, points = FALSE) +
  ggplot2::geom_hline(yintercept = 0, linetype = "dotted") +
  ggplot2::labs(title = "Robustness curve: effect path under growing confounding")

For multilevel data, set facet_level = TRUE to see within- and between-cluster paths side by side:

x_ml2 <- new_confoundsens(
  lambda = rep(seq(0, 0.25, length.out = 15), 2),
  theta  = c(0.50 - 2.0 * seq(0, 0.25, length.out = 15),
             0.35 - 1.2 * seq(0, 0.25, length.out = 15)),
  se     = rep(0.06, 30),
  level  = rep(c("within", "between"), each = 15)
)
#> Warning: `x$lambda` is not nondecreasing; plots may look odd.
plot_robustness_curve(x_ml2, facet_level = TRUE, bands = TRUE, points = FALSE) +
  ggplot2::labs(title = "Multilevel robustness curves")
#> Warning: `x$lambda` is not nondecreasing; plots may look odd.

Sensitivity contour plot

plot_sensitivity_contour() draws the ITCV hyperbolic boundary in \((r_{YU}, r_{DU})\) space. Observed covariate benchmarks can be overlaid as labeled points, allowing readers to compare the threshold against the empirical distribution of measured covariate correlations.

benchmarks <- data.frame(
  r_yu  = c(0.12, 0.09, 0.18, 0.05),
  r_du  = c(0.21, 0.14, 0.08, 0.31),
  label = c("SES", "Gender", "Race", "Prior achievement")
)
plot_sensitivity_contour(threshold = 0.025, benchmarks = benchmarks) +
  ggplot2::labs(
    title   = "Sensitivity contour plot",
    subtitle = "Benchmarks: observed covariate correlations"
  )

Sensitivity Love plot

The Sensitivity Love Plot is the key novel contribution of confoundvis. It adapts the Love plot — widely used to assess covariate balance in propensity score analysis — to the sensitivity analysis context. Each observed covariate appears as a point on a horizontal impact axis. The vertical dashed line marks the sensitivity threshold: covariates to its left are weaker than the threshold, while those to its right represent the strength an unmeasured confounder would need to exceed to overturn inference.

covariate_impacts <- data.frame(
  covariate = c("SES", "Gender", "Race", "Prior achievement",
                "School type", "Region", "Age"),
  impact    = c(0.025, 0.008, 0.019, 0.041, 0.011, 0.006, 0.014)
)

plot_sensitivity_love(covariate_impacts, threshold = 0.022) +
  ggplot2::labs(
    title    = "Sensitivity Love plot",
    subtitle = "Impact of observed covariates vs. sensitivity threshold"
  )

Covariates to the right of the dashed line (here, SES and Prior achievement) are stronger than the threshold, indicating that an unmeasured confounder of similar strength could overturn the conclusion.

Local Taylor diagnostic

plot_local_taylor() visualizes the local quadratic approximation of a sensitivity path. The observed path, the tangent (first-order) approximation, and the quadratic (second-order) approximation are distinguished by color and linetype.

d <- seq(0, 0.20, length.out = 40)
taylor_df <- data.frame(
  delta  = rep(d, 3),
  series = rep(c("Observed path", "Tangent (1st order)",
                  "Quadratic (2nd order)"), each = 40),
  value  = c(
    0.4 - 0.7*d - 0.6*d^2,   # concave-down path
    0.4 - 0.7*d,              # tangent
    0.4 - 0.7*d - 0.3*d^2    # local quadratic fit
  )
)

plot_local_taylor(taylor_df) +
  ggplot2::geom_hline(yintercept = 0, linetype = "dotted") +
  ggplot2::labs(
    title    = "Local Taylor diagnostic",
    subtitle = "Concave-down path: tangent overestimates robustness"
  )

The key insight is that when the path is concave-down, the tangent overestimates how far the effect is from zero — the first-order threshold is optimistic. The local quadratic corrects this bias.

Fitting a local quadratic

fit_local_quadratic() estimates the quadratic coefficients directly from numerical sensitivity path data:

path_df <- data.frame(
  delta = seq(0, 0.5, length.out = 100),
  theta = 0.4 - 0.7 * seq(0, 0.5, length.out = 100) -
          0.5 * seq(0, 0.5, length.out = 100)^2
)

fit <- fit_local_quadratic(path_df, local_max_delta = 0.2)
cat("Intercept:", round(fit$intercept, 4), "\n")
#> Intercept: 0.4
cat("Slope    :", round(fit$slope,     4), "\n")
#> Slope    : -0.7
cat("Curvature:", round(fit$quad,      4), "\n")
#> Curvature: -0.5

Simulating curvature regimes

simulate_taylor_demo() generates three synthetic paths with identical baseline and slope but different curvature, useful for teaching and for illustrating why first-order thresholds can mislead:

sims <- simulate_taylor_demo(delta_max = 1, step = 0.05,
                             theta0 = 0.4, slope = -0.5, kappa = 0.4)

# Plot all three on one figure
library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.5.3
paths <- rbind(
  data.frame(delta = sims$linear$lambda,
             theta = sims$linear$theta,  regime = "Linear"),
  data.frame(delta = sims$concave$lambda,
             theta = sims$concave$theta, regime = "Concave-down"),
  data.frame(delta = sims$convex$lambda,
             theta = sims$convex$theta,  regime = "Convex-up")
)

ggplot(paths, aes(x = delta, y = theta, colour = regime, linetype = regime)) +
  geom_line(linewidth = 0.8) +
  geom_hline(yintercept = 0, linetype = "dotted") +
  labs(x = expression(delta), y = expression(theta(delta)),
       title   = "Three curvature regimes",
       colour  = "Regime", linetype = "Regime")

References

Frank, K. A. (2000). Impact of a confounding variable on a regression coefficient. Sociological Methods & Research, 29(2), 147–194.

Cinelli, C., & Hazlett, C. (2020). Making sense of sensitivity: Extending omitted variable bias. Journal of the Royal Statistical Society: Series B, 82(1), 39–67.

Austin, P. C., & Stuart, E. A. (2015). Moving towards best practice when using inverse probability of treatment weighting. Statistics in Medicine, 34(28), 3661–3679.