Help for package CausalSpline

Type:

Package

Title:

Nonlinear Causal Dose-Response Estimation via Splines

Version:

0.1.0

Description:

Estimates nonlinear causal dose-response functions for continuous treatments using spline-based methods under standard causal assumptions (unconfoundedness / ignorability). Implements three identification strategies: Inverse Probability Weighting (IPW) via the generalised propensity score (GPS), G-computation (outcome regression), and a doubly-robust combination. Natural cubic splines and B-splines are supported for both the exposure-response curve f(T) and the propensity nuisance model. Pointwise confidence bands are obtained via the sandwich estimator or nonparametric bootstrap. Also provides fragility diagnostics including pointwise curvature-based fragility, uncertainty-normalised fragility, and regional integration over user-defined treatment intervals. Builds on the framework of Hirano and Imbens (2004) <doi:10.1111/j.1468-0262.2004.00481.x> for continuous treatments and extends it to fully nonparametric spline estimation.

License:

GPL (≥ 3)

Encoding:

UTF-8

RoxygenNote:

7.3.3

Depends:

R (≥ 4.1.0)

Imports:

splines, stats, utils, ggplot2 (≥ 3.4.0), sandwich, boot

Suggests:

testthat (≥ 3.0.0), knitr, rmarkdown, patchwork, cobalt, dplyr

VignetteBuilder:

knitr

Config/testthat/edition:

URL:

https://github.com/causalfragility-lab/CausalSpline

BugReports:

https://github.com/causalfragility-lab/CausalSpline/issues

NeedsCompilation:

Packaged:

2026-03-21 13:00:44 UTC; Subir

Author:

Subir Hait

[aut, cre]

Maintainer:

Subir Hait <haitsubi@msu.edu>

Repository:

CRAN

Date/Publication:

2026-03-25 21:10:07 UTC

CausalSpline: Nonlinear Causal Dose-Response Estimation via Splines

Description

CausalSpline estimates causal dose-response functions E[Y(t)] for continuous treatments under unconfoundedness, without imposing linearity on the treatment effect. The package fills a gap in the causal inference ecosystem: most tools assume a scalar \beta_1 T treatment effect, whereas real policy or health applications frequently exhibit thresholds, diminishing returns, and non-monotone relationships.

Core functions

causal_spline: Main fitting function. Supports IPW, G-computation, and doubly-robust estimation.
gps_model: Fit the generalised propensity score model for continuous treatments.
trim_weights: Winsorise extreme IPW weights.
check_overlap: Diagnose positivity (ESS, weight plots).
dose_response_curve: Extract the curve data frame.
simulate_dose_response: Simulate nonlinear dose-response data for benchmarking.
gradient_curve: Numerical first and second derivatives of the fitted dose-response curve.
fragility_curve: Pointwise fragility with adaptive eps, interpretation zones, and uncertainty normalisation.
region_fragility: Integrated fragility over a treatment interval.

Causal identification

Identification relies on two standard assumptions:

Unconfoundedness (strong ignorability): Y(t) \perp T \mid X for all t.
Positivity (overlap): f(t \mid X = x) > 0 for all t in the support of T and all x in the support of X.

Methods

IPW: The outcome is regressed on a spline of T using GPS-based inverse probability weights. Consistent if the GPS model is correctly specified.
G-computation: The outcome model includes spline(T) + X. The curve is obtained by marginalising over the observed X distribution. Consistent if the outcome model is correctly specified.
Doubly Robust: Combines IPW and g-computation. Consistent if at least one of the two models is correctly specified.

Author(s)

Maintainer: Subir Hait haitsubi@msu.edu (ORCID)

References

Hirano, K., & Imbens, G. W. (2004). The propensity score with continuous treatments. doi:10.1002/0470090456.ch7

Imbens, G. W. (2000). The role of the propensity score in estimating dose-response functions. Biometrika, 87(3), 706-710. doi:10.1093/biomet/87.3.706

Nonlinear causal dose-response estimation via splines

Description

Estimates the causal dose-response function E[Y(t)] for a continuous treatment T under unconfoundedness, using either Inverse Probability Weighting (IPW) with the generalised propensity score (GPS) or G-computation (outcome regression). The exposure-response curve is modelled as a natural cubic spline or B-spline, allowing thresholds, diminishing returns, and other nonlinear patterns to be recovered without parametric assumptions on the functional form.

Usage

causal_spline(
  formula,
  data,
  method = c("ipw", "gcomp", "dr"),
  spline_type = c("ns", "bs"),
  df_exposure = 4L,
  knots_exposure = NULL,
  df_gps = 4L,
  gps_family = c("gaussian", "gamma", "beta"),
  trim_quantiles = c(0.01, 0.99),
  eval_grid = 100L,
  eval_points = NULL,
  se_method = c("sandwich", "bootstrap"),
  boot_reps = 500L,
  conf_level = 0.95,
  verbose = FALSE
)

Arguments

formula

A two-sided formula of the form outcome ~ treatment | covariate1 + covariate2 + .... The vertical bar | separates the treatment variable from the confounders. Example: Y ~ T | X1 + X2 + X3.

data

A data frame containing all variables in formula.

method

Character string. Estimation method: "ipw" (inverse probability weighting via GPS), "gcomp" (g-computation / outcome regression), or "dr" (doubly-robust, combines both). Default "ipw".

spline_type

Character string. Type of spline basis for f(T): "ns" (natural cubic spline, default) or "bs" (B-spline).

df_exposure

Integer. Degrees of freedom for the treatment spline f(T). Default 4.

knots_exposure

Numeric vector of interior knot positions for the treatment spline. If NULL (default), knots are placed at quantiles of the treatment distribution.

df_gps

Integer. Degrees of freedom for the GPS (propensity) spline used in the "ipw" method. Default 4.

gps_family

Character string. Family for the GPS model: "gaussian" (default) for a linear GPS, "gamma", or "beta" for positive / bounded treatments.

trim_quantiles

Numeric vector of length 2 giving lower and upper quantiles for weight trimming. Default c(0.01, 0.99). Set to NULL to skip trimming.

eval_grid

Integer. Number of equally-spaced treatment values at which to evaluate E[Y(t)]. Default 100.

eval_points

Numeric vector of specific treatment values at which to evaluate the curve. Overrides eval_grid if supplied.

se_method

Character string. Method for standard errors: "sandwich" (default, fast) or "bootstrap".

boot_reps

Integer. Number of bootstrap replications when se_method = "bootstrap". Default 500.

conf_level

Numeric. Confidence level for intervals. Default 0.95.

verbose

Logical. Print progress messages. Default FALSE.

Value

An object of class "causal_spline" with elements:

curve: A data frame with columns t (treatment grid), estimate (E[Y(t)]), se, lower, upper.
ate: Estimated average treatment effect over the observed treatment range (scalar).
weights: Numeric vector of final (trimmed, normalised) IPW weights (only for method = "ipw" or "dr").
gps_fit: The fitted GPS model object.
outcome_fit: The fitted outcome model object.
call: The matched call.
method: The estimation method used.
spline_type: Spline type used.
df_exposure: Degrees of freedom for the exposure spline.
data_summary: Summary statistics of the treatment variable.

References

Hirano, K., & Imbens, G. W. (2004). The propensity score with continuous treatments. Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives, 226164, 73-84. doi:10.1002/0470090456.ch7

Flores, C. A., Flores-Lagunes, A., Gonzalez, A., & Neumann, T. C. (2012). Estimating the effects of length of exposure to instruction in a training program: the case of job corps. Review of Economics and Statistics, 94(1), 153-171. doi:10.1162/REST_a_00177

Imbens, G. W. (2000). The role of the propensity score in estimating dose-response functions. Biometrika, 87(3), 706-710. doi:10.1093/biomet/87.3.706

Examples

# Simulate nonlinear dose-response data
set.seed(42)
dat <- simulate_dose_response(n = 200, dgp = "threshold")

# Fit with IPW
fit <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat, method = "ipw",
                     df_exposure = 5)
summary(fit)
plot(fit)

# Fit with G-computation
fit_gc <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat,
                        method = "gcomp", df_exposure = 5)
plot(fit_gc, truth = dat)

Diagnose positivity / overlap for a continuous treatment

Description

Plots the distribution of the treatment variable conditional on covariate strata and returns effective sample size (ESS) and weight diagnostics to assess the positivity (overlap) assumption.

Usage

check_overlap(treatment, weights, plot = TRUE)

Arguments

treatment

Numeric vector of treatment values.

weights

Numeric vector of IPW weights (length must equal length(treatment)).

plot

Logical. If TRUE, returns a ggplot2 diagnostic plot. Default TRUE.

Value

A list with:

ess: Effective sample size: (\sum w_i)^2 / \sum w_i^2.
weight_summary: Five-number summary of the weights.
plot: ggplot2 object (if plot = TRUE).

Examples

dat <- simulate_dose_response(200)
X   <- as.matrix(dat[, c("X1", "X2", "X3")])
gps <- gps_model(dat$T, X)
w   <- trim_weights(abs(1 / stats::residuals(gps$model)), c(0.01, 0.99))
check_overlap(dat$T, w)

Extract the dose-response curve data frame

Description

Convenience function to pull the estimated E[Y(t)] curve from a fitted causal_spline object.

Usage

dose_response_curve(fit)

Arguments

fit

A causal_spline object.

Value

A data frame with columns t, estimate, se, lower, upper.

Examples

dat <- simulate_dose_response(200)
fit <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat)
curve_df <- dose_response_curve(fit)
head(curve_df)

Geometric fragility curve for a CausalSpline fit

Description

Computes a pointwise shape-based fragility measure from the first and second derivatives of the fitted dose-response curve, with five enhancements:

Adaptive eps: stabilisation constant is 0.05 \times \text{median}(|f'(t)|) so interpretation is consistent across datasets.
Interpretation zones: fragility is classified into "low", "moderate", and "high" based on the 50th and 75th percentiles of the pointwise fragility distribution.
Uncertainty-normalised fragility: an additional column F^*(t) = F(t) / \text{SE}(\hat{\mu}(t)) combines shape instability with statistical uncertainty.
Support density: the kernel density of the treatment variable (if t_obs is supplied) is attached, flagging regions with sparse data.
High-fragility flag: logical column high_fragility marks points above the 75th percentile.

Usage

fragility_curve(
  object,
  grid = NULL,
  h = NULL,
  eps = NULL,
  type = c("curvature_ratio", "inverse_slope"),
  t_obs = NULL,
  ...
)

Arguments

object

A fitted causal_spline object.

grid

Optional numeric evaluation grid. If NULL, the fitted grid in object$curve$t is used.

h

Step size for finite differences. Default NULL (auto).

eps

Numeric or NULL. If NULL (default), set adaptively to 0.05 \times \text{median}(|f'(t)|). If numeric, used directly.

type

Character. Fragility definition: "curvature_ratio" (default) or "inverse_slope".

t_obs

Optional numeric vector of observed treatment values (used to compute support density). If NULL, density column is omitted.

...

Ignored.

Value

A data frame of class "fragility_curve" with columns:

t: Treatment grid.
estimate: Fitted E[Y(t)].
se: Standard error of fitted curve.
derivative: First derivative.
second_derivative: Second derivative.
fragility: Pointwise fragility F(t).
fragility_norm: Uncertainty-normalised fragility F^*(t) = F(t) / \text{SE}(\hat{\mu}(t)).
fragility_zone: Factor: "low", "moderate", "high".
high_fragility: Logical: TRUE if above 75th percentile.
support_density: Kernel density of T at each grid point (only if t_obs supplied).
fragility_type: Character. Type used.

Examples

dat <- simulate_dose_response(200, dgp = "threshold")
fit <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat)
fc  <- fragility_curve(fit, t_obs = dat$T)
plot(fc)

Fit the Generalised Propensity Score model

Description

Fits a model for the conditional distribution of a continuous treatment T | X (the generalised propensity score). Covariates are entered linearly by default; spline transformations of the treatment are not needed here since this models the treatment, not the outcome.

Usage

gps_model(
  treatment,
  covariates,
  spline_type = c("ns", "bs"),
  df = 4L,
  family = c("gaussian", "gamma", "beta"),
  verbose = FALSE
)

Arguments

treatment

Numeric vector of treatment values.

covariates

Numeric matrix of confounders.

spline_type

Character. "ns" or "bs" for spline transformations of continuous covariates. Default "ns".

df

Integer. Degrees of freedom for covariate splines. Default 4. Set to NULL to use linear covariates only.

family

Character. Distribution for the GPS model. "gaussian" (default), "gamma", or "beta".

verbose

Logical. Default FALSE.

Value

A list with:

model: Fitted model object (lm or glm).
family: Family used.
r_squared: R-squared of the GPS model (diagnostic).

Examples

dat <- simulate_dose_response(n = 200, dgp = "linear")
X   <- as.matrix(dat[, c("X1", "X2", "X3")])
gps <- gps_model(dat$T, X)
summary(gps$model)

Numerical derivatives of a CausalSpline dose-response curve

Description

Computes first and second numerical derivatives of the fitted dose-response curve E[Y(t)] using central finite differences applied to predict.causal_spline. Useful for identifying regions of rapid change (high first derivative) or inflection / diminishing returns (second derivative changes sign).

Usage

gradient_curve(object, grid = NULL, h = NULL, eps = 1e-06, ...)

Arguments

object

A fitted causal_spline object.

grid

Optional numeric vector of treatment values at which to evaluate the derivatives. If NULL, the fitted evaluation grid stored in object$curve$t is used.

h

Numeric. Step size for finite differences. If NULL, chosen automatically as (t_{max} - t_{min}) / 500.

eps

Small positive constant used by fragility_curve to stabilise division. Default 1e-6.

...

Ignored.

Value

A data frame with columns:

t: Treatment grid values.
estimate: Fitted E[Y(t)].
se: Standard error of E[Y(t)] from the fitted curve.
derivative: First derivative f'(t).
second_derivative: Second derivative f''(t).

Examples

dat <- simulate_dose_response(200, dgp = "threshold")
fit <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat)
gd  <- gradient_curve(fit)
head(gd)

Plot the estimated dose-response curve

Description

Plots the estimated causal dose-response curve E[Y(t)] against t with pointwise confidence bands and an optional rug for the observed treatment distribution.

Usage

## S3 method for class 'causal_spline'
plot(
  x,
  rug = TRUE,
  truth = NULL,
  xlab = "Treatment (T)",
  ylab = "E[Y(t)]",
  title = NULL,
  ...
)

Arguments

x

A causal_spline object.

rug

Logical. Add a treatment distribution rug. Default TRUE.

truth

Optional data frame with columns t and true_effect for overlaying the true dose-response (useful in simulations).

xlab

Character. x-axis label. Default "Treatment (T)".

ylab

Character. y-axis label. Default "E[Y(t)]".

title

Character. Plot title. Default NULL.

...

Ignored.

Value

A ggplot2 object.

Examples

dat <- simulate_dose_response(200, dgp = "threshold")
fit <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat)
plot(fit)

Plot method for fragility_curve objects

Description

Produces a dual-panel plot: the dose-response curve (top) and the fragility curve (bottom), with high-fragility regions shaded. If support_density is present in x (i.e. t_obs was supplied to fragility_curve), a scaled density ribbon is overlaid on the fragility panel to flag low-support regions.

Usage

## S3 method for class 'fragility_curve'
plot(x, ...)

Arguments

x

A fragility_curve data frame (output of fragility_curve).

...

Ignored.

Value

A combined patchwork plot if the patchwork package is installed, otherwise the two panels are printed separately and a list of two ggplot2 objects is returned invisibly.

Examples

dat <- simulate_dose_response(200, dgp = "threshold")
fit <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat)
fc  <- fragility_curve(fit, t_obs = dat$T)
plot(fc)

Predict E[Y(t)] at new treatment values

Description

Predict E[Y(t)] at new treatment values

Usage

## S3 method for class 'causal_spline'
predict(object, newt, se_fit = FALSE, warn_extrap = TRUE, ...)

Arguments

object

A causal_spline object.

newt

Numeric vector of treatment values at which to predict.

se_fit

Logical. Return standard errors? Default FALSE.

warn_extrap

Logical. Warn if any values in newt fall outside the observed treatment range? Default TRUE.

...

Ignored.

Value

A data frame with columns t, estimate, extrapolated, and optionally se, lower, upper. The extrapolated column is TRUE for any newt value outside the observed treatment support.

Examples

dat <- simulate_dose_response(200)
fit <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat)
predict(fit, newt = c(1, 2, 3, 4, 5))

Print method for causal_spline objects

Description

Print method for causal_spline objects

Usage

## S3 method for class 'causal_spline'
print(x, ...)

Arguments

x

A causal_spline object.

...

Ignored.

Value

Invisibly returns the input causal_spline object x, unchanged. The function is called for its side effect of printing a compact summary to the console, showing the estimation method, spline type, degrees of freedom, sample size, treatment range, and number of evaluation points on the dose-response curve.

Regional fragility summary for a CausalSpline fit

Description

Integrates the pointwise fragility curve over a treatment interval [a, b] using the trapezoidal rule. Useful for comparing sensitivity across dose ranges (e.g., low vs. high dose) or summarising instability at a policy-relevant threshold.

Usage

region_fragility(
  object,
  a,
  b,
  grid = NULL,
  h = NULL,
  eps = NULL,
  type = c("curvature_ratio", "inverse_slope"),
  normalize = TRUE,
  t_obs = NULL,
  ...
)

Arguments

object

A fitted causal_spline object.

a

Numeric scalar. Lower bound of the integration interval.

b

Numeric scalar. Upper bound of the integration interval.

grid

Optional numeric evaluation grid within [a, b]. If NULL, a dense grid of 300 points is created automatically.

h

Step size for finite differences. Default NULL (auto).

eps

Adaptive eps passed to fragility_curve. Default NULL (auto).

type

Fragility definition: "curvature_ratio" (default) or "inverse_slope".

normalize

Logical. Divide integral by interval length? Default TRUE.

t_obs

Optional numeric vector of observed treatment values for support density. Default NULL.

...

Ignored.

Value

A list with elements:

interval: Named numeric vector c(a, b) after clamping to the observed support.
type: Fragility type used.
integral_fragility: Trapezoidal integral of fragility over [a, b].
average_fragility: Integral divided by interval length.
normalized: Logical flag.
curve: The full fragility_curve data frame.

Examples

dat <- simulate_dose_response(200, dgp = "threshold")
fit <- causal_spline(Y ~ T | X1 + X2 + X3, data = dat)
region_fragility(fit, a = 2, b = 5)

Simulate nonlinear dose-response data

Description

Generates synthetic observational data with a continuous treatment, confounders, and a nonlinear dose-response outcome. Useful for testing, benchmarking, and illustrating the CausalSpline package.

Usage

simulate_dose_response(
  n = 500L,
  dgp = c("threshold", "diminishing", "nonmonotone", "linear", "sinusoidal"),
  confounding = 0.5,
  sigma_y = 1,
  seed = NULL
)

Arguments

n

Integer. Sample size. Default 500.

dgp

Character string. Data-generating process:

"threshold": Flat below treatment = 3, steep rise above.
"diminishing": Concave relationship with diminishing returns.
"nonmonotone": Inverted-U relationship.
"linear": Standard linear effect (useful as baseline).
"sinusoidal": Oscillatory effect (difficult, high df needed).

confounding

Numeric scalar in [0, 1]. Strength of confounding (correlation between treatment and confounders). Default 0.5.

sigma_y

Numeric. Standard deviation of the outcome noise. Default 1.

seed

Integer or NULL. Random seed. Default NULL.

Value

A data frame with columns:

Y: Observed outcome.
T: Observed (confounded) treatment.
X1, X2, X3: Confounders.
Y0: Potential outcome at T = 0 (for evaluation).
true_effect: f(T) at each observed T value.

Examples

dat <- simulate_dose_response(n = 200, dgp = "threshold", seed = 1)
plot(dat$T, dat$true_effect, type = "l",
     xlab = "Treatment", ylab = "True causal effect")

dat2 <- simulate_dose_response(n = 200, dgp = "nonmonotone",
                                confounding = 0.8, seed = 42)
hist(dat2$T, main = "Treatment distribution", xlab = "T")

Summary method for causal_spline objects

Description

Summary method for causal_spline objects

Usage

## S3 method for class 'causal_spline'
summary(object, ...)

Arguments

object

A causal_spline object.

...

Ignored.

Value

Invisibly returns the input causal_spline object object, unchanged. The function is called for its side effect of printing a detailed summary to the console, including the original call, estimation method, spline configuration, treatment variable statistics, IPW diagnostics (effective sample size and weight range, if applicable), and a table of the estimated dose-response curve at seven representative percentile points (treatment value, point estimate, standard error, and confidence interval bounds).

Trim extreme IPW weights

Description

Winsorises (clips) weights at specified quantiles to reduce variance from extreme propensity scores. This is a standard stabilisation step when using inverse probability weighting for continuous treatments.

Usage

trim_weights(weights, quantiles = c(0.01, 0.99))

Arguments

weights

Numeric vector of (possibly unstabilised) IPW weights.

quantiles

Numeric vector of length 2: lower and upper quantile thresholds. Default c(0.01, 0.99).

Value

Numeric vector of trimmed weights (same length as input).

Examples

w <- rexp(200)
w_trimmed <- trim_weights(w, c(0.01, 0.99))
summary(w_trimmed)

Package {CausalSpline}

CausalSpline: Nonlinear Causal Dose-Response Estimation via Splines

Description

Core functions

Causal identification

Methods

Author(s)

References

See Also

Nonlinear causal dose-response estimation via splines

Description

Usage

Arguments

Value

References

Examples

Diagnose positivity / overlap for a continuous treatment

Description

Usage

Arguments

Value

Examples

Extract the dose-response curve data frame

Description

Usage

Arguments

Value

Examples

Geometric fragility curve for a CausalSpline fit

Description

Usage

Arguments

Value

Examples

Fit the Generalised Propensity Score model

Description

Usage

Arguments

Value

Examples

Numerical derivatives of a CausalSpline dose-response curve

Description

Usage

Arguments

Value

Examples

Plot the estimated dose-response curve

Description

Usage

Arguments

Value

Examples

Plot method for fragility_curve objects

Description

Usage

Arguments

Value

Examples

Predict E[Y(t)] at new treatment values

Description

Usage

Arguments

Value

Examples

Print method for causal_spline objects

Description

Usage

Arguments

Value

Regional fragility summary for a CausalSpline fit

Description

Usage

Arguments

Value

Examples

Simulate nonlinear dose-response data

Description

Usage

Arguments

Value