miceDRF provides an imputation method for the mice framework
based on distributional random forests (DRF).
The package extends multiple imputation by chained equations (MICE) with a nonparametric approach that models conditional distributions rather than only conditional means. This allows flexible imputation of data with complex structure, nonlinear effects, and heterogeneous conditional distributions.
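As a rough illustration of why imputing from a conditional distribution differs from imputing a conditional mean, the base-R sketch below contrasts the two on toy heteroscedastic data. The nearest-neighbour donor draw is only a crude stand-in for sampling from an estimated conditional distribution; it is not the DRF algorithm and not how miceDRF is implemented. The helper `draw_from_neighbours` and its parameter `k` are hypothetical names introduced for this example.

```r
set.seed(1)

# Toy data: the spread of y grows with x (heteroscedastic noise)
n <- 2000
x <- runif(n)
y <- 2 * x + rnorm(n, sd = 0.2 + x)

# Hide about 30% of y completely at random
miss <- runif(n) < 0.3
y_obs <- ifelse(miss, NA, y)

# (a) Conditional-mean imputation: fitted values from a linear model
fit <- lm(y_obs ~ x)  # rows with missing y are dropped by default
y_mean <- y_obs
y_mean[miss] <- predict(fit, newdata = data.frame(x = x[miss]))

# (b) Conditional-draw imputation (assumed stand-in, not DRF):
# sample one observed y among the k nearest neighbours in x
draw_from_neighbours <- function(x0, k = 25) {
  obs <- which(!miss)
  nn <- obs[order(abs(x[obs] - x0))[seq_len(k)]]
  y_obs[nn[sample.int(k, 1)]]
}
y_draw <- y_obs
y_draw[miss] <- vapply(x[miss], draw_from_neighbours, numeric(1))

# Compare the spread of the completed variable with the true spread
c(true = sd(y), mean_imputed = sd(y_mean), draw_imputed = sd(y_draw))
```

In this setup the mean-imputed variable tends to have a smaller standard deviation than the original data, because every imputed value sits on the regression line, while the draw-based version stays closer to the true spread.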
The method can be used directly within the standard mice workflow via:
method = "DRF"

Install the development version from GitHub with:
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("KrystynaGrzesiak/miceDRF")

library(mice)
library(miceDRF)
set.seed(123)
# Generate data
n <- 200
d <- 5
X <- matrix(runif(n * d), nrow = n, ncol = d)
# Introduce missing values
pmiss <- 0.2
X.NA <- apply(X, 2, function(x) {
  U <- runif(length(x))
  ifelse(U <= pmiss, NA, x)
})
# Imputation with DRF
imp <- mice(X.NA, m = 1, method = "DRF")
Ximp <- complete(imp)

Näf, J., Scornet, E., and Josse, J. (2024). What is a good imputation under MAR missingness? arXiv preprint arXiv:2403.19196. https://arxiv.org/abs/2403.19196
Cevid, D., Michel, L., Näf, J., Meinshausen, N., and Buehlmann, P. (2022). Distributional random forests: Heterogeneity adjustment and multivariate distributional regression. Journal of Machine Learning Research, 23(333), 1–79.
If you use miceDRF in your research, please cite:
Näf, J., Grzesiak, K., and Scornet, E. (2025). How to rank imputation methods? arXiv preprint arXiv:2507.11297. https://doi.org/10.48550/arXiv.2507.11297