| Type: | Package |
| Title: | Imputation with 'mice' and Distributional Random Forests |
| Version: | 0.1.0 |
| Description: | Provides a custom imputation method for the 'mice' package based on distributional random forests. The package implements the 'mice.impute.DRF' method, which can be used within the standard 'mice' workflow. Missing values are imputed by estimating conditional distributions with distributional random forests and sampling observed responses using forest weights. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| URL: | https://github.com/KrystynaGrzesiak/miceDRF, https://krystynagrzesiak.github.io/miceDRF/ |
| BugReports: | https://github.com/KrystynaGrzesiak/miceDRF/issues |
| Imports: | drf |
| Suggests: | mice, spelling, testthat (≥ 3.0.0) |
| Language: | en-US |
| Config/testthat/edition: | 3 |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | no |
| Packaged: | 2026-05-28 17:24:23 UTC; Krysia |
| Author: | Krystyna Grzesiak |
| Maintainer: | Krystyna Grzesiak <krygrz11@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-06-01 13:50:02 UTC |
Imputation with Distributional Random Forests for 'mice'
Description
Imputes missing values using distributional random forests within the multiple imputation by chained equations framework implemented in the mice package.
Usage
mice.impute.DRF(
y,
ry,
x,
wy = NULL,
min.node.size = 1,
num.features = 10,
num.trees = 10,
...
)
Arguments
y |
Vector to be imputed. |
ry |
Logical vector indicating which elements of y are observed. |
x |
Numeric design matrix with length(y) rows containing the predictors. |
wy |
Logical vector indicating elements of y for which imputations are created. |
min.node.size |
Target minimum number of observations in each tree leaf
in the distributional random forest. Default is 1. |
num.features |
Number of random features to sample at each split.
Default is 10. |
num.trees |
Number of trees in the distributional random forest.
Default is 10. |
... |
Additional arguments passed by mice. |
Details
This function is called internally by mice when the imputation
method is set to "DRF". For each variable with missing values, a
distributional random forest is fitted to the observed values using the
remaining variables as predictors. Missing values are then imputed by
sampling observed responses according to the forest weights.
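The sampling step described above can be illustrated with a minimal base R sketch. It does not call the drf package: the weight matrix w is a hypothetical stand-in for forest weights, with one row per missing entry and one column per observed response, and the names y_obs, w, and draw_one are invented for illustration only.

```r
# Sketch only: `w` is a hand-made stand-in for forest weights, not drf output.
set.seed(1)
y_obs <- c(2.1, 3.4, 5.0, 7.2)  # observed responses

# One row per missing entry, one column per observed response; rows sum to 1.
w <- rbind(c(0.7, 0.2, 0.1, 0.0),
           c(0.0, 0.1, 0.3, 0.6))

# Draw one observed response per row, with probabilities given by that row.
draw_one <- function(p) y_obs[sample.int(length(y_obs), size = 1, prob = p)]
imputed <- apply(w, 1, draw_one)
```

Because every imputed value is drawn from y_obs, imputations always lie in the set of observed responses, and a response with zero weight in a row is never drawn for that row.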
Value
A numeric vector of imputed values for the entries of y
indicated by wy. The vector has length sum(wy) and is
returned to mice to replace the missing values in the current
variable.
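The return contract can be sketched with a toy function that follows the same argument layout. This is not the package's implementation: impute_toy is a hypothetical name, it ignores x and simply resamples observed values, and the fallback wy <- !ry is assumed here from the convention used by the built-in mice methods.

```r
# Hypothetical toy imputer with the same argument layout (illustration only).
impute_toy <- function(y, ry, x, wy = NULL, ...) {
  # Assumption: when wy is not supplied, impute exactly the unobserved entries.
  if (is.null(wy)) wy <- !ry
  y_obs <- y[ry]
  # Return one value per entry flagged by wy, drawn from the observed values.
  y_obs[sample.int(length(y_obs), size = sum(wy), replace = TRUE)]
}

set.seed(42)
y  <- c(1.5, NA, 3.2, NA, 4.8)
ry <- !is.na(y)
x  <- matrix(rnorm(10), nrow = 5)  # 5 rows, matching length(y)
out <- impute_toy(y, ry, x)
```

Here out has length sum(!ry), that is, one imputed value for each missing entry of y.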
References
Näf, J., Scornet, E., and Josse, J. (2024). "What is a good imputation under MAR missingness?" https://arxiv.org/abs/2403.19196.
Cevid, D., Michel, L., Näf, J., Meinshausen, N., and Buehlmann, P. (2022). "Distributional random forests: Heterogeneity adjustment and multivariate distributional regression." Journal of Machine Learning Research, 23(333), 1–79.
Examples
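library(mice)
library(miceDRF)  # makes mice.impute.DRF visible to mice
set.seed(123)
# Simulate a 100 x 10 Gaussian matrix and delete about 30% of entries at random
X <- matrix(rnorm(1000), nrow = 100)
X[runif(length(X)) < 0.3] <- NA
# One imputed data set (m = 1) from a single iteration (maxit = 1)
imp <- mice(X, method = "DRF", m = 1, maxit = 1, printFlag = FALSE)
# Extract the completed data
complete(imp)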
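The forest settings listed under Arguments can presumably be changed from the mice call, since mice forwards extra named arguments to the univariate imputation function. The sketch below assumes that forwarding works for mice.impute.DRF and that the package is installed under the name miceDRF; the values chosen for m, maxit, and num.trees are arbitrary.

```r
library(mice)
library(miceDRF)  # assumed package name, taken from the repository URL

set.seed(123)
X <- matrix(rnorm(1000), nrow = 100)
X[runif(length(X)) < 0.3] <- NA

# Two imputed data sets, two iterations each, with a larger forest.
# num.trees is assumed to be forwarded through mice's `...` to mice.impute.DRF.
imp <- mice(X, method = "DRF", m = 2, maxit = 2,
            num.trees = 50, printFlag = FALSE)

# First and second completed data sets
first  <- complete(imp, 1)
second <- complete(imp, 2)
```

Generating more than one imputed data set lets the variability between imputations be inspected, which a single imputation cannot show.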