A grammar of machine learning workflows for R.
Split, fit, evaluate, assess — four verbs that encode the workflow
from Hastie, Tibshirani & Friedman (The Elements of Statistical
Learning, Ch. 7). The evaluate/assess boundary makes data leakage
inexpressible: ml_evaluate() runs on validation data and
can be called freely; ml_assess() runs on held-out test
data and locks after one use.
# Install from GitHub (current)
remotes::install_github("epagogy/ml", subdir = "r")

# install.packages("ml")
# CRAN submission is under review; the line above will work once accepted.

Requires R >= 4.1.0. Optional backends: xgboost, ranger, glmnet, kknn, e1071, naivebayes, rpart.
library(ml)
s <- ml_split(iris, "Species", seed = 42)
model <- ml_fit(s$train, "Species", seed = 42)
ml_evaluate(model, s$valid) # check performance, tweak, repeat
final <- ml_fit(s$dev, "Species", seed = 42)
ml_assess(final, test = s$test) # final exam: a second call errors

s$dev is train + valid combined, used for the final refit before
assessment. This three-way split (train 60 / valid 20 / test 20) with a
$dev convenience accessor follows the textbook protocol exactly.
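To make the protocol concrete, here is a minimal base-R sketch of a stratified 60/20/20 split with a dev set. This is illustrative only: the function name `stratified_split` is hypothetical and ml_split()'s actual internals may differ.

```r
# Hypothetical sketch of a stratified 60/20/20/dev split in base R.
stratified_split <- function(data, target, seed = 42) {
  set.seed(seed)
  part <- character(nrow(data))
  # Assign proportions within each class so every split is stratified.
  for (rows in split(seq_len(nrow(data)), data[[target]])) {
    labels <- rep(c("train", "valid", "test"),
                  round(length(rows) * c(0.6, 0.2, 0.2)))
    part[rows] <- sample(rep_len(labels, length(rows)))
  }
  out <- split(data, factor(part, levels = c("train", "valid", "test")))
  out$dev <- rbind(out$train, out$valid)  # train + valid for the final refit
  out
}

s <- stratified_split(iris, "Species")
sapply(s[c("train", "valid", "test")], nrow)  # 90 / 30 / 30 for iris
```

Because the assignment happens per class, each partition preserves the class balance of the full dataset.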
| Verb | Purpose |
|---|---|
| ml_split() | Stratified three-way split → $train, $valid, $test, $dev |
| ml_fit() | Train a model (per-fold preprocessing, deterministic seeding) |
| ml_evaluate() | Validation metrics; repeat freely |
| ml_assess() | Test metrics; once, final, locks after use |
These four are the grammar. Everything else extends it:
| Function | Purpose |
|---|---|
| ml_screen() | Algorithm leaderboard |
| ml_tune() | Hyperparameter search |
| ml_stack() | OOF ensemble stacking |
| ml_predict() | Class labels or probabilities |
| ml_explain() | Feature importance |
| ml_compare() | Side-by-side model comparison |
| ml_validate() | Pass/fail deployment gate |
| ml_drift() | Distribution shift detection (KS, chi-squared) |
| ml_calibrate() | Probability calibration (Platt, isotonic) |
| ml_profile() | Dataset summary |
| ml_save() / ml_load() | Serialize to .mlr |
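The two tests the README names for ml_drift(), Kolmogorov-Smirnov for numeric columns and chi-squared for categorical ones, are available in base R. The sketch below shows the idea; the function name `drift_check`, its signature, and the alpha threshold are assumptions, not the package's API.

```r
# Hypothetical drift check: KS test for numeric columns, chi-squared
# for categorical ones. Returns TRUE per column when drift is detected.
drift_check <- function(ref, new, alpha = 0.05) {
  sapply(names(ref), function(col) {
    p <- if (is.numeric(ref[[col]])) {
      suppressWarnings(ks.test(ref[[col]], new[[col]])$p.value)
    } else {
      tab <- rbind(table(ref[[col]]), table(new[[col]]))
      suppressWarnings(chisq.test(tab)$p.value)
    }
    p < alpha  # TRUE means the distributions differ significantly
  })
}

set.seed(1)
ref <- data.frame(x = rnorm(200))
new <- data.frame(x = rnorm(200, mean = 3))  # clearly shifted
drift_check(ref, new)  # x: TRUE
```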
Thirteen algorithm families. engine = "auto" uses the Rust backend when
available; engine = "r" forces the R package backend.
| Algorithm | String | Clf | Reg | Backend |
|---|---|---|---|---|
| Logistic | "logistic" | Y | | nnet |
| Decision Tree | "decision_tree" | Y | Y | rpart |
| Random Forest | "random_forest" | Y | Y | ranger |
| Extra Trees | "extra_trees" | Y | Y | Rust |
| Gradient Boosting | "gradient_boosting" | Y | Y | Rust |
| XGBoost | "xgboost" | Y | Y | xgboost |
| Ridge | "linear" | | Y | glmnet |
| Elastic Net | "elastic_net" | | Y | glmnet |
| SVM | "svm" | Y | Y | e1071 |
| KNN | "knn" | Y | Y | kknn |
| Naive Bayes | "naive_bayes" | Y | | naivebayes |
| AdaBoost | "adaboost" | Y | | Rust |
| Hist. Gradient Boosting | "histgradient" | Y | Y | Rust |
Seeds. seed = NULL auto-generates a
seed and stores it on the result for reproducibility.
seed = 42 gives full deterministic control.
Per-fold preprocessing. Scaling and encoding fit on training folds only, never on validation or test. No information leaks across the split boundary.
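The principle is easy to show in base R: compute scaling statistics on training rows only, then reuse them unchanged on held-out rows. This sketch illustrates the rule the package states; it is not the package's actual preprocessing code.

```r
# Leak-free scaling sketch: statistics come from the training rows only.
train <- iris[1:100, "Sepal.Length", drop = FALSE]
test  <- iris[101:150, "Sepal.Length", drop = FALSE]

mu  <- mean(train$Sepal.Length)  # fit on train only
sdv <- sd(train$Sepal.Length)

train$scaled <- (train$Sepal.Length - mu) / sdv
test$scaled  <- (test$Sepal.Length - mu) / sdv  # reuse the train statistics

round(mean(train$scaled), 10)  # 0: train is centered on its own mean
mean(test$scaled)              # nonzero: test stats never touched the fit
```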
Error messages. Wrong column name?
ml_fit() tells you which columns exist. Wrong algorithm
string? It lists the valid ones. Errors aim to show the fix, not just the failure.
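Base R's match.arg() produces exactly this kind of self-diagnosing error, listing the valid choices when given an unknown one. The sketch below is illustrative; `fit_sketch` is a made-up function, not part of the package.

```r
# Hypothetical sketch of an error that lists the valid algorithm strings,
# using base R's match.arg().
fit_sketch <- function(algorithm = c("logistic", "random_forest", "xgboost")) {
  algorithm <- match.arg(algorithm)
  algorithm
}

fit_sketch("logistic")   # "logistic"
try(fit_sketch("xgbost"))  # errors, naming the valid choices
```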
Roth, S. (2026). A Grammar of Machine Learning Workflows.
doi:10.5281/zenodo.19023838
MIT. Simon Roth, 2026.