| Title: | Decision Tree Analysis for Longitudinal Measurement Data |
| Version: | 1.0.0 |
| Maintainer: | Ryoto Obata <ryoto.obata@gmail.com> |
| Description: | Implements tree-based methods for longitudinal data. The package constructs decision trees that evaluate both the main effect of a covariate and its interaction with time through a weighted splitting criterion. It supports single-tree construction, bootstrap-based multiple-tree selection, and tree visualisation. For methodological details, see Obata and Sugimoto (2026) <doi:10.1007/s11634-025-00665-2>. |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| LazyData: | true |
| NeedsCompilation: | yes |
| RoxygenNote: | 7.2.3 |
| Depends: | R (≥ 3.5.0) |
| Imports: | stats, graphics, partykit, ggparty, ggplot2, lme4, utils |
| Suggests: | testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| Packaged: | 2026-03-22 10:37:44 UTC; jovyan |
| Author: | Ryoto Obata [aut, cre], Tomoyuki Sugimoto [aut] |
| Repository: | CRAN |
| Date/Publication: | 2026-03-26 09:40:02 UTC |
Construction of a Decision Tree for Longitudinal Data
Description
Constructs a single decision tree for longitudinal data. The method evaluates both the main effect of a covariate and its interaction with time, incorporating a weighting mechanism to balance the two effects. Three single-tree construction procedures (ST1, ST2, ST3) are available; see Details. For the underlying methodology, refer to Obata and Sugimoto (2026).
Usage
longitree(
formula,
time,
random,
weight = "w",
data,
alpha = "no",
gamma = "no",
cv = "no",
maxdepth = 5,
minbucket = 5,
minsplit = 20,
xval = 10
)
## S3 method for class 'longitree'
summary(object, ...)
## S3 method for class 'longitree'
print(x, ...)
## S3 method for class 'longitree'
predict(object, ...)
## S3 method for class 'longitree'
plot(x, ...)
Arguments
formula |
A formula specifying the model.
The response variable should be on the left side and covariates on the
right side. Use |
time |
Character string giving the column name of the time variable. All individuals are assumed to be observed at the same time points. |
random |
Character string giving the column name of the random effect (subject identifier). |
weight |
Weight for balancing the main effect of a covariate and
its interaction with time. A value in
|
data |
A data frame containing the variables in formula. |
alpha |
Significance level used as the stopping rule for tree
growth. A smaller value produces a more conservative (smaller) tree.
Specify a numeric value, or "no" (the default) when this procedure is not used. |
gamma |
Complexity parameter for pruning. A larger value prunes
more aggressively, yielding a smaller and simpler tree; a smaller
value retains more branches. Specify a numeric value, or "no" (the default) when this procedure is not used. |
cv |
Set to "yes" to select the final tree via cross-validation (ST1); the default is "no". |
maxdepth |
Maximum depth of the tree (default 5). |
minbucket |
Minimum number of subjects in a terminal node (default 5). |
minsplit |
Minimum number of subjects required to attempt a split (default 20). |
xval |
Number of cross-validation folds (default 10). Used to
compute the cross-validated coefficient of determination
(R^2). |
object |
A longitree object. |
... |
Additional arguments passed to |
x |
A longitree object. |
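The time argument assumes that all subjects are observed at the same time points. A minimal sketch of how that assumption could be checked before fitting, using base R only. The helper name is_balanced and the toy data frame are illustrative and not part of the package.

```r
# Hypothetical helper (not part of the package): TRUE when every subject
# has exactly one row at every time point that occurs in the data.
is_balanced <- function(data, time, random) {
  counts <- table(data[[random]], data[[time]])
  all(counts == 1L)
}

# Toy data: 4 subjects observed at the same 3 time points.
toy <- expand.grid(subject = 1:4, time = 1:3)
toy$y <- seq_len(nrow(toy))

balanced_full <- is_balanced(toy, time = "time", random = "subject")

# Dropping a single row leaves one subject without one time point.
balanced_missing <- is_balanced(toy[-1, ], time = "time", random = "subject")
```

The same call could be made on a real data set by passing the column names that would be supplied as time and random.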
Details
Exactly one of alpha, gamma, or cv must be specified.
Specifying more than one will result in an error. These correspond to the
three single-tree construction procedures:
- ST1 (cv = "yes"): Tree growth, pruning, and final tree selection via cross-validation.
- ST2 (alpha): Tree growth with a significance threshold. No pruning or final tree selection via cross-validation.
- ST3 (gamma): Tree growth followed by pruning with a pre-specified complexity parameter. No final tree selection via cross-validation.
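The "exactly one of alpha, gamma, or cv" rule can be sketched as a small validation helper. This is an illustration only: which_procedure is a hypothetical name, and the package's own argument checking may differ. It assumes that "no" marks an argument as not specified, as the defaults in Usage suggest.

```r
# Hypothetical helper (not part of the package): map the three arguments
# to the procedure label, and fail unless exactly one of them is specified.
which_procedure <- function(alpha = "no", gamma = "no", cv = "no") {
  specified <- c(
    ST1 = identical(cv, "yes"),
    ST2 = !identical(alpha, "no"),
    ST3 = !identical(gamma, "no")
  )
  if (sum(specified) != 1L) {
    stop("Specify exactly one of 'alpha', 'gamma', or 'cv'.")
  }
  names(specified)[specified]
}

proc_cv    <- which_procedure(cv = "yes")
proc_alpha <- which_procedure(alpha = 0.05)
proc_gamma <- which_procedure(gamma = 3)

# Specifying two arguments at once, or none, raises an error.
two_given  <- try(which_procedure(alpha = 0.05, gamma = 3), silent = TRUE)
none_given <- try(which_procedure(), silent = TRUE)
```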
Since the time variable is not used as a splitting variable, each terminal node (leaf) contains the full longitudinal responses for every subject assigned to it, allowing direct evaluation of longitudinal trajectories within each leaf.
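A small base R illustration of this property, assuming covariates that are constant within each subject (baseline covariates). The split rule, column names, and data are made up for the sketch and do not come from the package.

```r
# Toy balanced data: 6 subjects, 4 time points, one baseline covariate x1
# that is constant within each subject.
toy <- expand.grid(time = 1:4, subject = 1:6)
toy$x1 <- c(2, 7, 4, 9, 1, 8)[toy$subject]
toy$y <- toy$x1 + 0.5 * toy$time

# Hypothetical single split on the baseline covariate (illustration only).
toy$node <- ifelse(toy$x1 <= 5, "left", "right")

# Each subject falls in exactly one node ...
nodes_per_subject <- tapply(toy$node, toy$subject,
                            function(z) length(unique(z)))

# ... and brings all of its time points along (0 or 4 rows per cell).
rows_per_subject_node <- table(toy$subject, toy$node)

# Mean trajectory within each node, one value per time point.
mean_traj <- aggregate(y ~ node + time, data = toy, FUN = mean)
```

Because the split never involves time, the per-node mean trajectory covers every time point in both nodes.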
Value
An object of class "longitree". Use
summary.longitree, predict.longitree,
or plot.longitree to inspect the results.
Methods (by generic)
- summary(longitree): Print a brief summary of a longitree object.
- print(longitree): Print method (calls summary).
- predict(longitree): Extract predicted values and terminal node assignments from a longitree object. Returns a data frame with columns predict (predicted values) and terminalnode (terminal node assignments).
- plot(longitree): Plot a longitree object. A convenience wrapper around treeplot.
References
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2
Examples
data(ltreedata)
# ST1: tree construction via cross-validation
result_st1 <- longitree(y ~ ., time = "time", random = "subject",
weight = 0.7, data = ltreedata, cv = "yes")
summary(result_st1)
predict(result_st1)
plot(result_st1)
# ST2: tree growth with a significance threshold
result_st2 <- longitree(y ~ ., time = "time", random = "subject",
weight = 0.1, data = ltreedata, alpha = 0.05)
summary(result_st2)
predict(result_st2)
plot(result_st2)
# ST3: pruning with a complexity parameter
result_st3 <- longitree(y ~ ., time = "time", random = "subject",
weight = "w", data = ltreedata, gamma = 3)
summary(result_st3)
predict(result_st3)
plot(result_st3)
Construction of Multiple Decision Trees for Longitudinal Data
Description
Generates multiple trees from bootstrap samples and evaluates all three-tree combinations based on two criteria: cross-validated prediction error and tree diversification measured by the adjusted Rand index (ARI). Bootstrap sampling is performed at the subject level to preserve longitudinal structure.
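Subject-level bootstrap sampling can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: bootstrap_subjects and the boot_id column are hypothetical names, and the toy data are simulated.

```r
# Hypothetical helper (not the package's internal code): draw `bootsize`
# subjects with replacement and keep every row of each drawn subject.
bootstrap_subjects <- function(data, random, bootsize) {
  ids <- unique(data[[random]])
  # Sample positions rather than values, so a single id is handled safely.
  drawn <- ids[sample.int(length(ids), size = bootsize, replace = TRUE)]
  pieces <- lapply(seq_along(drawn), function(i) {
    rows <- data[data[[random]] == drawn[i], , drop = FALSE]
    rows$boot_id <- i  # new identifier so repeated subjects stay distinct
    rows
  })
  do.call(rbind, pieces)
}

set.seed(1)
toy <- expand.grid(time = 1:3, subject = 1:5)
toy$y <- rnorm(nrow(toy))

boot <- bootstrap_subjects(toy, random = "subject", bootsize = 5)
```

Resampling whole subjects rather than individual rows keeps each trajectory intact, which is what preserves the longitudinal structure.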
Usage
longitrees(
formula,
time,
random,
weight = "w",
data,
alpha = "no",
gamma = "no",
cv = "no",
maxdepth = 5,
minbucket = 5,
minsplit = 20,
xval = 10,
bootsize,
trees = 100,
mins = 40
)
Arguments
formula |
A formula specifying the model.
The response variable should be on the left side and covariates on the
right side. Use |
time |
Character string giving the column name of the time variable. All individuals are assumed to be observed at the same time points. |
random |
Character string giving the column name of the random effect (subject identifier). |
weight |
Weight for balancing the main effect of a covariate and
its interaction with time. A value in
|
data |
A data frame containing the variables in formula. |
alpha |
Significance level used as the stopping rule for tree
growth. A smaller value produces a more conservative (smaller) tree.
Specify a numeric value, or "no" (the default) when this procedure is not used. |
gamma |
Complexity parameter for pruning. A larger value prunes
more aggressively, yielding a smaller and simpler tree; a smaller
value retains more branches. Specify a numeric value, or "no" (the default) when this procedure is not used. |
cv |
Set to "yes" to select the final tree via cross-validation (ST1); the default is "no". |
maxdepth |
Maximum depth of the tree (default 5). |
minbucket |
Minimum number of subjects in a terminal node (default 5). |
minsplit |
Minimum number of subjects required to attempt a split (default 20). |
xval |
Number of cross-validation folds (default 10). Used to
compute the cross-validated coefficient of determination
(R^2). |
bootsize |
Number of subjects in each bootstrap sample. |
trees |
Number of bootstrap trees to grow (default 100). |
mins |
Number of top-ranking candidate three-tree subsets to retain (default 40). |
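The trees argument fixes how many three-tree subsets exist before the top mins candidates are retained. A short base R calculation of that count, together with an explicit listing for a small pool of five trees (the pool size of five is chosen only for illustration):

```r
trees <- 100                         # default number of bootstrap trees
n_combinations <- choose(trees, 3)   # distinct three-tree subsets: 161700

# With a small pool the subsets can be listed explicitly;
# each column of the result holds the indices of one subset.
small_pool <- combn(5, 3)
```

With the defaults, the 40 retained candidates are therefore a very small fraction of all possible subsets.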
Details
See longitree for a description of the three single-tree
construction procedures (ST1, ST2, ST3) corresponding to cv,
alpha, and gamma.
Value
An object of class "longitrees". Pass to
selectionplot to select the optimal three-tree combination.
References
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2
See Also
longitree, selectionplot,
threetrees, treeplot
Sample longitudinal data for decision tree examples
Description
A sample balanced longitudinal dataset with 50 subjects observed at 10 equally spaced time points.
Usage
ltreedata
Format
A data frame with 500 rows and 7 variables:
- y
Response variable (continuous).
- subject
Subject identifier (integer, 1–50).
- time
Time point (integer, 1–10).
- x1
Baseline covariate 1 (integer, 1–10).
- x2
Baseline covariate 2 (integer, 1–10).
- x3
Baseline covariate 3 (integer, 1–6).
- x4
Baseline covariate 4 (integer, 1–12).
Select Optimal Three-Tree Combination
Description
Plots the cross-validated prediction error against the maximum pairwise
adjusted Rand index (ARI) for candidate three-tree subsets, and selects
a subset based on either prediction performance or tree diversification.
The selected combination is indicated by a red point on the plot, which
corresponds to the three trees used in the subsequent
threetrees step.
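For orientation, the adjusted Rand index between two partitions can be computed in base R from their contingency table, as sketched below. It is assumed here that the partitions being compared are the terminal node assignments of two trees; the function name adjusted_rand_index and the example vectors are illustrative and not taken from the package.

```r
# Adjusted Rand index between two label vectors of equal length.
# Not defined when both partitions consist of a single cluster.
adjusted_rand_index <- function(a, b) {
  tab <- table(a, b)
  n <- sum(tab)
  sum_cells <- sum(choose(tab, 2))
  sum_rows  <- sum(choose(rowSums(tab), 2))
  sum_cols  <- sum(choose(colSums(tab), 2))
  expected  <- sum_rows * sum_cols / choose(n, 2)
  max_index <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (max_index - expected)
}

# Hypothetical node assignments of six subjects under three trees.
node_a <- c(1, 1, 2, 2, 3, 3)
node_b <- c(3, 3, 1, 1, 2, 2)   # same partition as node_a, relabelled
node_c <- c(1, 2, 1, 2, 1, 2)
assignments <- list(node_a, node_b, node_c)

# Maximum pairwise ARI over the three pairs of trees.
pairs <- combn(3, 2)
pairwise <- apply(pairs, 2, function(p) {
  adjusted_rand_index(assignments[[p[1]]], assignments[[p[2]]])
})
max_ari <- max(pairwise)
```

A maximum pairwise ARI close to 1 indicates that at least two of the three trees partition the subjects almost identically, so lower values correspond to more diversified subsets.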
Usage
selectionplot(longitrees, metric, nth)
Arguments
longitrees |
A longitrees object. |
metric |
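Criterion used for selection: prediction performance or tree diversification (the package example uses metric = "PE"). |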
nth |
Rank of the tree subset to select (1 = best). |
Value
An object of class "selectionplot". Pass to
threetrees to refit and evaluate the selected trees.
References
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2
Fit and Evaluate Three Selected Trees
Description
Refits the three trees selected by selectionplot on their
original bootstrap samples.
Usage
threetrees(x, selection)
## S3 method for class 'threetrees'
summary(object, ...)
## S3 method for class 'threetrees'
print(x, ...)
## S3 method for class 'threetrees'
predict(object, tree = 1, ...)
## S3 method for class 'threetrees'
plot(x, tree = 1, ...)
Arguments
x |
A longitrees object (for threetrees) or a threetrees object (for the print and plot methods). |
selection |
A selectionplot object. |
object |
A threetrees object. |
... |
Additional arguments passed to |
tree |
Integer 1, 2, or 3 selecting which tree to predict from or plot (default 1). |
Value
An object of class "threetrees". Use
summary.threetrees, predict.threetrees,
or plot.threetrees to inspect the results.
Methods (by generic)
- summary(threetrees): Print a brief summary of a threetrees object.
- print(threetrees): Print method (calls summary).
- predict(threetrees): Extract predicted values and terminal node assignments from a threetrees object. Returns a data frame with columns predict (predicted values) and terminalnode (terminal node assignments).
- plot(threetrees): Plot one of the three trees. A convenience wrapper around treeplot.
References
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2
See Also
longitrees, selectionplot,
treeplot
Examples
data(ltreedata)
set.seed(10)
trees_res <- longitrees(y ~ ., time = "time", random = "subject",
weight = 0.5, data = ltreedata, alpha = 0.01,
bootsize = 50, mins = 40)
sel <- selectionplot(trees_res, metric = "PE", nth = 1)
tt <- threetrees(trees_res, selection = sel)
summary(tt)
predict(tt, tree = 1)
predict(tt, tree = 2)
predict(tt, tree = 3)
plot(tt, tree = 1)
plot(tt, tree = 2)
plot(tt, tree = 3)
Decision Tree Plot Visualisation for Longitudinal Data
Description
Visualises the structure of a decision tree for longitudinal
data. Built on ggparty. Each split node displays the node
number, split variable, p-value, and weight w. Each
terminal node displays the node number, sample size N, and the
intercept (\hat\beta_0) and slope (\hat\beta_1) from a
linear mixed-effects model fitted within that node. Individual
longitudinal trajectories are shown as dashed lines; the predicted
values (average at each time point) are shown as solid lines, with the
response variable on the vertical axis and time on the horizontal axis.
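The quantities shown in a terminal node can be approximated outside the package as follows. This is a rough, dependency-free sketch on simulated data: ordinary least squares via lm stands in for the linear mixed-effects model named above, so it illustrates the idea rather than reproducing the package's fit.

```r
set.seed(2)

# Simulated subjects of one hypothetical terminal node:
# 8 subjects, 5 time points, subject-specific shifts around a common line.
node <- expand.grid(time = 1:5, subject = 1:8)
subject_shift <- rnorm(8)
node$y <- 1 + 0.5 * node$time + subject_shift[node$subject] +
  rnorm(nrow(node), sd = 0.1)

# Solid line in the plot: average response at each time point.
mean_by_time <- tapply(node$y, node$time, mean)

# Stand-in for the within-node intercept and slope. The description refers
# to a linear mixed-effects model; least squares is used here only to keep
# the sketch free of extra packages.
fit <- lm(y ~ time, data = node)
beta_hat <- coef(fit)
```

Here beta_hat holds two values, named "(Intercept)" and "time", which play the roles of the intercept and slope reported in each terminal node.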
Usage
treeplot(
x,
tree = NULL,
snsize = 50,
spsize = 5,
plotsize = 80,
linesize1 = 0.3,
linesize2 = 1,
tnsize = 60
)
Arguments
x |
A longitree or threetrees object. |
tree |
Integer 1, 2, or 3 selecting which tree to plot when x is a threetrees object. |
snsize |
Split-node label size (default 50). |
spsize |
Split-point label size (default 5). |
plotsize |
Overall plot size (default 80). |
linesize1 |
Branch line width (default 0.3). |
linesize2 |
Main line width (default 1). |
tnsize |
Terminal-node label size (default 60). |
Value
A ggplot2/ggparty object.
References
Obata, R. and Sugimoto, T. (2026). A decision tree analysis for longitudinal measurement data and its applications. Advances in Data Analysis and Classification. doi:10.1007/s11634-025-00665-2