Declarative, pipeable survey weighting in base R — from design weights to calibrated, variance-ready weights.
weightflow builds survey weights by chaining
hierarchical adjustments with a tidymodels-style API, and
estimates their variances with a bootstrap that re-applies the whole
recipe on each replicate. It has no hard dependencies
(base R, R >= 4.1) and bridges to
survey/srvyr for design-based inference.
# install.packages("remotes")
remotes::install_github("jpferreira33/weightflow")

A recipe is inert: building it computes nothing.
prep() walks the steps in order and estimates the
cascade of factors; collect_weights() extracts the final
weights. Separating define from apply makes the whole
process reproducible and auditable, and it is exactly what lets the
bootstrap re-run the entire cascade per replicate.
library(weightflow)
recipe <- weighting_spec(sample_survey, base_weights = pw) |>
  step_unknown_eligibility(unknown = unknown_elig, by = "region") |>
  step_nonresponse(respondent = responded, method = "weighting_class",
                   by = c("region", "sex")) |>
  step_calibrate(method = "raking",
                 margins = list(region = c(table(population$region)),
                                sex    = c(table(population$sex))))
fitted <- prep(recipe) # estimate the cascade
summary(fitted) # per-stage diagnostics + Kish deff
wts <- collect_weights(fitted) # data.frame with .weight

Adjustment steps, applied in the order you pipe them:
| Step | What it does |
|---|---|
| step_unknown_eligibility() | Redistribute unknown-eligibility cases among the known ones (person- or household-level via cluster). |
| step_drop_ineligible() | Zero out out-of-scope units. |
| step_select_within() | Within-household selection (unequal prob or equal n_eligible). |
| step_nonresponse() | Weighting classes or propensity (logit / CART / random forest), person- or household-level. |
| step_calibrate() | Raking, post-stratification, linear/GREG; bounded (Deville-Särndal) and integrative (one weight per household) options. |
| step_model_calibration() | Wu-Sitter model calibration with working models for the outcomes. |
| step_trim(), step_trim_weights() | Manual or automatic survey-style trimming, insertable anywhere. |
| step_round(), step_rescale() | Integer rounding and rescaling to a size or total. |
| step_assert() | Quality checkpoint on deff, weight ratio or effective n. |

Eligibility and response accept 0/1 dummy columns or any logical condition.
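
To make the "cascade of factors" concrete, here is a minimal base R sketch of two stages from the table: a weighting-class nonresponse adjustment followed by raking. It does not use weightflow and is not its implementation; the toy data, the margins and the helper names (`wtotal`, `rake`) are assumptions made up for illustration.

```r
# Sum of weights by group, returned as a named numeric vector.
wtotal <- function(w, g) vapply(split(w, g), sum, numeric(1))

# Toy sample (hypothetical): design weight pw and a 0/1 response flag.
smp <- data.frame(
  region    = c("N", "N", "N", "S", "S", "S"),
  sex       = c("F", "M", "F", "M", "F", "M"),
  pw        = c(10, 10, 10, 20, 20, 20),
  responded = c(1, 0, 1, 1, 1, 0)
)

# Stage 1: weighting-class nonresponse adjustment within region.
# factor = weighted total of all sampled units / weighted total of respondents
tot_all   <- wtotal(smp$pw, smp$region)
tot_resp  <- wtotal(smp$pw * smp$responded, smp$region)
nr_factor <- unname((tot_all / tot_resp)[smp$region])
w_nr      <- smp$pw * smp$responded * nr_factor  # nonrespondents get weight 0

# Stage 2: raking (iterative proportional fitting) to known margins.
rake <- function(w, data, margins, tol = 1e-10, max_iter = 200) {
  for (iter in seq_len(max_iter)) {
    for (v in names(margins)) {
      current <- wtotal(w, data[[v]])
      ratio   <- margins[[v]][names(current)] / current
      w <- w * unname(ratio[as.character(data[[v]])])
    }
    # Stop once every margin is reproduced within tolerance.
    gap <- vapply(names(margins), function(v) {
      current <- wtotal(w, data[[v]])
      max(abs(current - margins[[v]][names(current)]))
    }, numeric(1))
    if (max(gap) < tol) break
  }
  w
}

resp    <- smp[smp$responded == 1, ]
w_start <- w_nr[smp$responded == 1]
margins <- list(region = c(N = 40, S = 60), sex = c(F = 55, M = 45))
w_final <- rake(w_start, resp, margins)

# The final weight is the design weight times one factor per stage.
cal_factor <- w_final / w_start
```

Each stage contributes a multiplicative factor per unit, which is the kind of per-unit, per-step decomposition that a cascade of adjustments produces.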
Diagnostics and reporting: summary()
and plot() show the per-stage cascade with the Kish
design effect (deff = 1 + CV²) and effective sample size;
weight_factors() returns the per-unit, per-step factors;
report_weighting() writes a self-contained HTML report —
pipeline diagram, variables used, per-stage summaries and per-step
visuals — with no graphics device or server required.
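
The Kish design effect quoted above can be checked by hand. The sketch below is plain base R with made-up weights and hypothetical function names (`kish_deff`, `n_effective`); it is not the package's code, only the formula deff = 1 + CV² written out.

```r
# Kish design effect from weighting: deff = 1 + CV^2, where CV is the
# coefficient of variation of the weights (variance with divisor n).
kish_deff <- function(w) {
  cv2 <- mean((w - mean(w))^2) / mean(w)^2
  1 + cv2
}

# Effective sample size, equal to n / deff.
n_effective <- function(w) sum(w)^2 / sum(w^2)

w <- c(1, 1, 2, 4)           # illustrative weights
deff <- kish_deff(w)
neff <- n_effective(w)
```

With equal weights the CV is zero, so deff is 1 and the effective sample size equals n; the more the weights vary, the smaller the effective sample size.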
Variance estimation (see the Variance estimation article):
boot <- bootstrap_weights(recipe, replicates = 500, strata = "region", psu = "psu")
boot_mean(boot, "income") # estimate, SE and CI
as_svydesign(fitted, ids = "psu", strata = "region") # survey linearization
collect_replicate_weights(boot) # replicate weights, ready for srvyr

The bootstrap resamples PSUs within strata (Rao-Wu rescaling bootstrap) and re-applies the recipe on each replicate, so the replicate weights carry the variability of every adjustment.
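
As a rough illustration of the resampling idea, the following base R sketch draws Rao-Wu rescaling factors: in each stratum with n_h PSUs it samples n_h - 1 PSUs with replacement and multiplies the weights by n_h / (n_h - 1) times the number of times each PSU was drawn. The function name `rao_wu_factors` and the toy data are assumptions, and the sketch skips re-applying the adjustment recipe on each replicate, which is the part weightflow adds.

```r
# Rao-Wu rescaling factors, one column per replicate, one row per unit.
rao_wu_factors <- function(strata, psu, replicates) {
  keys <- unique(data.frame(strata = strata, psu = psu))
  fac  <- matrix(0, nrow = nrow(keys), ncol = replicates)
  for (h in unique(keys$strata)) {
    idx <- which(keys$strata == h)
    n_h <- length(idx)
    stopifnot(n_h >= 2)  # at least two PSUs per stratum are required
    for (r in seq_len(replicates)) {
      draw <- sample.int(n_h, size = n_h - 1, replace = TRUE)
      fac[idx, r] <- n_h / (n_h - 1) * tabulate(draw, nbins = n_h)
    }
  }
  # Expand PSU-level factors to the unit level.
  row_of <- match(paste(strata, psu, sep = "||"),
                  paste(keys$strata, keys$psu, sep = "||"))
  fac[row_of, , drop = FALSE]
}

set.seed(1)
# Toy design (hypothetical): 2 strata, 3 PSUs each, 2 units per PSU.
strata <- rep(c("A", "B"), each = 6)
psu    <- rep(rep(1:3, each = 2), times = 2)
w      <- rep(c(10, 20), each = 6)
y      <- rnorm(12, mean = 50, sd = 10)

fac <- rao_wu_factors(strata, psu, replicates = 200)

# Point estimate and bootstrap standard error of the weighted mean.
theta_hat <- sum(w * y) / sum(w)
theta_rep <- colSums(fac * w * y) / colSums(fac * w)
se_boot   <- sqrt(mean((theta_rep - theta_hat)^2))
```

In a full pipeline the nonresponse and calibration adjustments would be recomputed from the rescaled weights in every column before `theta_rep` is formed.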
Three bundled datasets: population (the frame),
sample_survey (take-all roster) and sample_one
(multistage select-one design). Each includes stratum, PSU and design weight,
so the full pipeline and the variance methods run on them without extra setup.
apply_step() is the internal S3 generic behind each
step. To add an adjustment, define a step_*() constructor
(inert) and its apply_step.<class>() method — nothing
else changes.
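
The constructor-plus-method pattern can be sketched with a stand-in generic. Everything below is hypothetical: `apply_step_demo()`, `step_cap_demo()` and the argument list are invented for illustration, and the real `apply_step()` signature in weightflow may differ.

```r
# Stand-in generic (hypothetical; not weightflow's internal apply_step()).
apply_step_demo <- function(step, weights, data) UseMethod("apply_step_demo")

# Inert constructor: it only records parameters and computes nothing.
step_cap_demo <- function(max_ratio) {
  structure(list(max_ratio = max_ratio),
            class = c("step_cap_demo", "step_demo"))
}

# Method: cap weights at max_ratio times the median weight.
apply_step_demo.step_cap_demo <- function(step, weights, data) {
  cap <- step$max_ratio * median(weights)
  pmin(weights, cap)
}

# A tiny driver that walks the steps in order, dispatching on class.
steps <- list(step_cap_demo(max_ratio = 3))
w <- c(1, 1, 2, 10)
for (s in steps) w <- apply_step_demo(s, w, data = NULL)
```

Because dispatch happens on the step's class, the driver loop never needs to know which adjustments exist.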
General framework
Nonresponse
Calibration
Design effect and trimming
Variance estimation
MIT © Juan Pablo Ferreira