weightflow

Declarative, pipeable survey weighting in base R — from design weights to calibrated, variance-ready weights.

weightflow builds survey weights by chaining hierarchical adjustments with a tidymodels-style API, and estimates the variance of the resulting estimates with a bootstrap that re-applies the whole recipe on each replicate. It has no hard dependencies (base R, R >= 4.1) and bridges to survey/srvyr for design-based inference.

Installation

# install.packages("remotes")
remotes::install_github("jpferreira33/weightflow")

The idea

A recipe is inert: building it computes nothing. prep() walks the steps in order and estimates the cascade of factors; collect_weights() extracts the final weights. Separating define from apply makes the whole process reproducible and auditable, and it is exactly what lets the bootstrap re-run the entire cascade per replicate.

library(weightflow)

recipe <- weighting_spec(sample_survey, base_weights = pw) |>
  step_unknown_eligibility(unknown = unknown_elig, by = "region") |>
  step_nonresponse(respondent = responded, method = "weighting_class",
                   by = c("region", "sex")) |>
  step_calibrate(method = "raking",
                 margins = list(region = c(table(population$region)),
                                sex    = c(table(population$sex))))

fitted <- prep(recipe)              # estimate the cascade
summary(fitted)                     # per-stage diagnostics + Kish deff
wts    <- collect_weights(fitted)   # data.frame with .weight

What it does

Adjustment steps, applied in the order you pipe them:

| Step | What it does |
| --- | --- |
| step_unknown_eligibility() | Redistribute unknown-eligibility cases among the known ones (person- or household-level via cluster). |
| step_drop_ineligible() | Zero out out-of-scope units. |
| step_select_within() | Within-household selection (unequal prob or equal n_eligible). |
| step_nonresponse() | Weighting classes or propensity (logit / CART / random forest), person- or household-level. |
| step_calibrate() | Raking, post-stratification, linear/GREG; bounded (Deville-Särndal) and integrative (one weight per household) options. |
| step_model_calibration() | Wu-Sitter model calibration with working models for the outcomes. |
| step_trim(), step_trim_weights() | Manual or automatic survey-style trimming, insertable anywhere. |
| step_round(), step_rescale() | Integer rounding and rescaling to a size or total. |
| step_assert() | Quality checkpoint on deff, weight ratio or effective n. |

Eligibility and response accept 0/1 dummy columns or any logical condition.
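
To show what two of these adjustments compute, here is a minimal base R sketch of a weighting-class nonresponse factor followed by raking. It does not call weightflow and is not the package's implementation: the toy data, the margins and the rake() helper are all made up for illustration.

```r
# Toy sample; every name and number here is illustrative.
smp <- data.frame(
  region    = c("N", "N", "N", "N", "S", "S", "S", "S"),
  sex       = c("F", "M", "F", "M", "F", "M", "F", "M"),
  responded = c(1, 1, 0, 1, 1, 0, 1, 1),
  w         = c(10, 10, 10, 10, 20, 20, 20, 20)
)

# Stage 1: weighting-class nonresponse adjustment by region.
# factor = weighted total of all cases / weighted total of respondents.
tot_all   <- c(tapply(smp$w, smp$region, sum))
tot_resp  <- c(tapply(smp$w * smp$responded, smp$region, sum))
nr_factor <- tot_all / tot_resp
# Respondents absorb the weight of nonrespondents, who drop to zero.
smp$w_nr  <- smp$w * smp$responded * unname(nr_factor[smp$region])

# Stage 2: raking (iterative proportional fitting) to known margins.
# Hypothetical helper: cycles over the margins until every ratio is ~1.
rake <- function(data, w, margins, tol = 1e-8, max_iter = 100) {
  for (iter in seq_len(max_iter)) {
    max_gap <- 0
    for (v in names(margins)) {
      current <- c(tapply(w, data[[v]], sum))            # current weighted totals
      ratio   <- margins[[v]][names(current)] / current  # target / current
      max_gap <- max(max_gap, abs(ratio - 1))
      w <- w * unname(ratio[data[[v]]])
    }
    if (max_gap < tol) break
  }
  w
}

resp    <- smp[smp$responded == 1, ]
margins <- list(region = c(N = 50, S = 70), sex = c(F = 60, M = 60))
resp$w_cal <- rake(resp, resp$w_nr, margins)

tapply(resp$w_cal, resp$region, sum)   # matches the region margin
tapply(resp$w_cal, resp$sex, sum)      # matches the sex margin
```

Each stage multiplies the previous weight by a factor, which is the cascade that prep() estimates step by step.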

Diagnostics and reporting: summary() and plot() show the per-stage cascade with the Kish design effect (deff = 1 + CV²) and effective sample size; weight_factors() returns the per-unit, per-step factors; report_weighting() writes a self-contained HTML report — pipeline diagram, variables used, per-stage summaries and per-step visuals — with no graphics device or server required.
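
The Kish quantities follow directly from the weights. A small base R sketch, independent of weightflow (kish_summary() is a hypothetical helper, not a package function); it takes CV² with the divide-by-n variance, so that deff equals n Σw² / (Σw)²:

```r
# Hypothetical helper: Kish design effect and effective sample size.
kish_summary <- function(w) {
  cv2  <- mean((w - mean(w))^2) / mean(w)^2   # squared coefficient of variation
  deff <- 1 + cv2
  c(deff = deff, n_eff = length(w) / deff)
}

res <- kish_summary(c(10, 10, 20, 20, 40))
res   # deff = 1.3, n_eff = 5 / 1.3 (about 3.85)
```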

Variance estimation (see the Variance estimation article):

boot <- bootstrap_weights(recipe, replicates = 500, strata = "region", psu = "psu")
boot_mean(boot, "income")           # estimate, SE and CI
as_svydesign(fitted, ids = "psu", strata = "region")   # survey linearization
collect_replicate_weights(boot)     # replicate weights, ready for srvyr

The bootstrap resamples PSUs within strata (Rao-Wu rescaling bootstrap) and re-applies the recipe on each replicate, so the replicate weights carry the variability of every adjustment.
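
To make the resampling concrete, here is a rough base R sketch of Rao-Wu rescaling factors. It assumes the common variant that draws n_h - 1 PSUs with replacement in each stratum of n_h PSUs; rao_wu_factors() is a hypothetical name, and the package's internals may differ.

```r
# Hypothetical helper: one rescaling factor per PSU and replicate.
rao_wu_factors <- function(strata, psu, replicates = 200) {
  design  <- unique(data.frame(strata = strata, psu = psu))
  factors <- matrix(0, nrow = nrow(design), ncol = replicates)
  for (h in unique(design$strata)) {
    rows <- which(design$strata == h)
    n_h  <- length(rows)
    stopifnot(n_h >= 2)                  # at least two PSUs per stratum
    for (r in seq_len(replicates)) {
      draw <- sample.int(n_h, size = n_h - 1, replace = TRUE)
      hits <- tabulate(draw, nbins = n_h)          # times each PSU was drawn
      factors[rows, r] <- n_h / (n_h - 1) * hits   # Rao-Wu rescaling
    }
  }
  list(design = design, factors = factors)
}

set.seed(1)
strata <- rep(c("A", "B"), each = 6)
psu    <- rep(1:6, each = 2)             # 3 PSUs per stratum, 2 units each
rw     <- rao_wu_factors(strata, psu, replicates = 5)
colSums(rw$factors[rw$design$strata == "A", ])   # each replicate sums to n_h = 3
```

Multiplying each unit's design weight by its PSU's factor gives the replicate base weights; re-running the adjustments on those is what lets the replicate weights reflect every step.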

Example data

weightflow bundles three datasets: population (the frame), sample_survey (take-all roster) and sample_one (multistage select-one design). All three carry stratum, PSU and design weight, so the full pipeline and the variance methods run on them without extra setup.

Extending

apply_step() is the internal S3 generic behind each step. To add an adjustment, define a step_*() constructor (inert) and its apply_step.<class>() method — nothing else changes.
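
The pattern can be sketched with a self-contained toy. Everything below is an assumption for illustration: the generic is a stand-in defined locally, and its arguments, the spec structure and step_cap() are hypothetical, so check the package source for the real apply_step() signature before writing a method.

```r
# Stand-in generic; weightflow's own apply_step() may take other arguments.
apply_step <- function(step, data, weights, ...) UseMethod("apply_step")

# Hypothetical constructor: records parameters only, computes nothing.
step_cap <- function(spec, max_weight) {
  step <- structure(list(max_weight = max_weight),
                    class = c("step_cap", "step"))
  spec$steps <- c(spec$steps, list(step))
  spec
}

# The method does the work when the recipe is applied.
apply_step.step_cap <- function(step, data, weights, ...) {
  pmin(weights, step$max_weight)
}

# Toy spec and a toy apply loop standing in for prep().
toy_spec <- list(data = data.frame(id = 1:4),
                 weights = c(5, 10, 50, 200),
                 steps = list())
toy_spec <- step_cap(toy_spec, max_weight = 40)   # still inert

w <- toy_spec$weights
for (s in toy_spec$steps) w <- apply_step(s, toy_spec$data, w)
w   # 5 10 40 40
```

Because the constructor only stores parameters, the same spec can be re-applied to resampled data, which is what the bootstrap relies on.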

License

MIT © Juan Pablo Ferreira