Quickstart Guide to ROOT

Installation

# install.packages("devtools")
devtools::install_github("peterliu599/ROOT-R-Package")

What is ROOT?

ROOT (Rashomon set of Optimal Trees) learns interpretable binary weight functions that minimize a user-specified global objective function and are represented as sparse decision trees. Each unit is either included (w = 1) or excluded (w = 0) based on the covariates.
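
To make the idea of a binary weight function concrete, the sketch below hand-codes a one-split rule in base R. The names `toy` and `w_rule` and the 0.5 threshold are illustrative assumptions, not part of the ROOT API. ROOT learns such rules from data rather than taking them as input.

```r
# Toy covariates (hypothetical data, for illustration only)
toy <- data.frame(
  x1 = c(0.2, 0.7, 0.4, 0.9),
  x2 = c(0, 1, 0, 1)
)

# A depth-1 tree written by hand: include a unit (w = 1) when x2 < 0.5,
# exclude it (w = 0) otherwise
w_rule <- function(df) as.integer(df$x2 < 0.5)

w <- w_rule(toy)
w
#> [1] 1 0 1 0
```

Every unit receives exactly one of the two weights, and the rule depends only on the covariates.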

Rather than returning a single solution, ROOT returns a Rashomon set of near-optimal trees and extracts a characteristic tree that summarizes the common patterns across them.
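
One way to picture a set of near-optimal trees is to rank candidate trees by their objective value and keep the best few. The sketch below does this in base R with made-up objective values. The names `obj`, `k`, and `rashomon` are hypothetical, and this is not a description of how ROOT builds its Rashomon set internally.

```r
# Hypothetical objective values for 6 candidate trees (lower is better)
obj <- c(t1 = 0.031, t2 = 0.012, t3 = 0.045,
         t4 = 0.013, t5 = 0.011, t6 = 0.029)

# Keep the k trees with the smallest objective values
k <- 3
rashomon <- names(sort(obj))[seq_len(k)]
rashomon
#> [1] "t5" "t2" "t4"
```

The three retained trees have nearly identical objective values, which is why summarizing what they have in common can be more informative than reporting only the single best one.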

Basic usage

The main function is ROOT(). At a minimum, supply a data frame whose first column is the outcome to minimize and whose remaining columns are covariates.
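
If the outcome is not already the first column, it can be moved there with base R before calling ROOT(). This is a minimal sketch; the data frame `raw` and the variable `outcome` are hypothetical names used only for illustration.

```r
# Hypothetical data frame whose outcome column is not first
raw <- data.frame(
  x1  = c(0.1, 0.5),
  x2  = c(0, 1),
  vsq = c(0.01, 0.08)
)

# Move the outcome column to the front; covariates keep their relative order
outcome <- "vsq"
dat_ordered <- raw[, c(outcome, setdiff(names(raw), outcome))]

names(dat_ordered)   # column order is now: vsq, x1, x2
```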

library(ROOT)
set.seed(123)

# Simulate 80 units with two covariates and a variance-type objective
n    <- 80
half <- n / 2
dat <- data.frame(
  vsq  = c(rnorm(half, mean = 0.01, sd = 0.005),   # low-variance group
           rnorm(half, mean = 0.08, sd = 0.02)),   # high-variance group
  x1   = c(runif(half, 0, 1), runif(half, 0, 1)),
  x2   = c(rep(0, half), rep(1, half))             # x2 = 1 flags high-variance units
)

fit <- ROOT(
  data        = dat,
  num_trees   = 20,
  top_k_trees = TRUE,
  k           = 10,
  seed        = 123
)

Inspecting results

print(fit)      # brief summary
#> ROOT object
#>   Generalizability mode: FALSE 
#> 
#> Summary classifier (f):
#> n= 80 
#> 
#> node), split, n, loss, yval, (yprob)
#>       * denotes terminal node
#> 
#> 1) root 80 40 0 (0.5000000 0.5000000)  
#>   2) x2>=0.5 40  0 0 (1.0000000 0.0000000) *
#>   3) x2< 0.5 40  0 1 (0.0000000 1.0000000) *
summary(fit)    # full summary including Rashomon set details
#> ROOT object
#>   Generalizability mode: FALSE 
#> 
#> Summary classifier (f):
#> n= 80 
#> 
#> node), split, n, loss, yval, (yprob)
#>       * denotes terminal node
#> 
#> 1) root 80 40 0 (0.5000000 0.5000000)  
#>   2) x2>=0.5 40  0 0 (1.0000000 0.0000000) *
#>   3) x2< 0.5 40  0 1 (0.0000000 1.0000000) *
#> 
#> Global objective function:
#>   User-supplied: No (default objective used)
#> 
#> Diagnostics:
#>   Number of trees grown: 20
#>   Rashomon set size: 10
#>   % observations with w_opt == 1: 50.0%
plot(fit)       # visualize the characteristic tree
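
The printed tree can be read as a rule: node 2 (x2 >= 0.5) has yval 0, so those units are excluded, and node 3 (x2 < 0.5) has yval 1, so those units are included. As a sanity check, the sketch below applies that rule by hand to the simulated x2 column in base R. The name `w_hand` is an illustrative assumption, and the code does not touch the fitted object.

```r
# Recreate the x2 covariate from the simulated data above
x2 <- c(rep(0, 40), rep(1, 40))

# The printed tree assigns w = 1 when x2 < 0.5 and w = 0 when x2 >= 0.5
w_hand <- as.integer(x2 < 0.5)

mean(w_hand) * 100   # share of units included, in percent
#> [1] 50
```

The result matches the 50.0% of observations with w_opt == 1 reported in the diagnostics.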

Next steps