When analyzing masked series system data, a fundamental question
precedes parameter estimation: which parametric family should we use for
the component lifetimes? The maskedcauses package provides
three models of increasing generality — exponential, homogeneous
Weibull, and heterogeneous Weibull — each nested in the next. This
hierarchy forms a natural scaffold for model selection via the
likelihood ratio test (LRT).
We develop a two-step testing cascade that moves top-down from the most general model, simplifying only when the data cannot distinguish between levels. We demonstrate the cascade on worked examples and then assess the power of the shape heterogeneity test through Monte Carlo simulation.
The three models form a strict nesting chain: \[ \underbrace{\text{Exponential}}_{\lambda_1, \ldots, \lambda_m} \;\subset\; \underbrace{\text{Hom.\ Weibull}}_{k,\, \beta_1, \ldots, \beta_m} \;\subset\; \underbrace{\text{Het.\ Weibull}}_{k_1, \beta_1, \ldots, k_m, \beta_m} \]
| Model | Constructor | Parameters | Count |
|---|---|---|---|
| Exponential | exp_series_md_c1_c2_c3() | \(\lambda_1, \ldots, \lambda_m\) | \(m\) |
| Hom. Weibull | wei_series_homogeneous_md_c1_c2_c3() | \(k, \beta_1, \ldots, \beta_m\) | \(m + 1\) |
| Het. Weibull | wei_series_md_c1_c2_c3() | \(k_1, \beta_1, \ldots, k_m, \beta_m\) | \(2m\) |
Each nesting defines a hypothesis test:
Step 1: Shape heterogeneity. Does the data support different shapes for different components? \[ H_0: k_1 = k_2 = \cdots = k_m \quad \text{(homogeneous)} \quad\text{vs}\quad H_1: \text{shapes unrestricted (heterogeneous)} \] The LRT statistic is \(\Lambda = -2[\ell(\hat\theta_{\text{hom}}) - \ell(\hat\theta_{\text{het}})]\), which under \(H_0\) follows \(\chi^2_{m-1}\) asymptotically.
Step 2: Aging. If the homogeneous model is retained, does the common shape differ from 1 (exponential)? \[ H_0: k = 1 \quad \text{(exponential)} \quad\text{vs}\quad H_1: k \neq 1 \quad \text{(homogeneous Weibull)} \] The LRT statistic follows \(\chi^2_1\) under \(H_0\).
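Both steps reduce to the same computation: difference the maximized log-likelihoods and compare to the appropriate chi-squared quantile. A minimal base-R helper makes the mechanics explicit (the function name is ours, not part of the package):

```r
# Generic likelihood ratio test between two nested fits.
# ll_restricted: log-likelihood of the smaller (null) model
# ll_full:       log-likelihood of the larger model
# df:            difference in parameter counts
lrt_test <- function(ll_restricted, ll_full, df) {
  stat <- -2 * (ll_restricted - ll_full)
  list(stat = stat, df = df,
       p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

# Step 1 with the (rounded) log-likelihoods reported later in this vignette:
lrt_test(-1698, -1697, df = 4)$stat  # 2 (the vignette reports 1.36 from unrounded fits)
```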
The nesting chain reflects increasing physical realism:
Exponential \(\to\) Homogeneous Weibull tests whether components age (\(k > 1\), increasing hazard) or exhibit infant mortality (\(k < 1\), decreasing hazard). Rejecting \(k = 1\) means the constant-hazard assumption is inadequate.
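The three hazard regimes are easy to check directly from the Weibull hazard \(h(t) = (k/\beta)(t/\beta)^{k-1}\); a standalone base-R illustration (not package code):

```r
# Weibull hazard function
weibull_hazard <- function(t, k, beta) (k / beta) * (t / beta)^(k - 1)

t <- c(100, 200)
weibull_hazard(t, k = 0.8, beta = 500)  # decreasing hazard: infant mortality
weibull_hazard(t, k = 1.0, beta = 500)  # constant hazard:   exponential
weibull_hazard(t, k = 1.5, beta = 500)  # increasing hazard: wear-out
```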
Homogeneous \(\to\) Heterogeneous Weibull tests whether components have different aging patterns. Rejecting equal shapes indicates at least one component has a fundamentally different failure mechanism — for example, an electronic component with infant mortality alongside a bearing with wear-out.
We recommend a top-down approach: start with the most general model and simplify step by step, stopping as soon as a restriction is rejected. This approach controls the risk of prematurely simplifying the model.
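In code, the cascade is just two sequential tests. A schematic version (function and argument names are ours; it takes the maximized log-likelihoods as inputs):

```r
# Top-down LRT cascade over the three nested models.
select_model <- function(ll_exp, ll_hom, ll_het, m, alpha = 0.05) {
  # Step 1: heterogeneous vs homogeneous (df = m - 1)
  p1 <- pchisq(-2 * (ll_hom - ll_het), df = m - 1, lower.tail = FALSE)
  if (p1 < alpha) return("heterogeneous Weibull")
  # Step 2: homogeneous vs exponential (df = 1)
  p2 <- pchisq(-2 * (ll_exp - ll_hom), df = 1, lower.tail = FALSE)
  if (p2 < alpha) return("homogeneous Weibull")
  "exponential"
}

# With the (rounded) log-likelihoods from the first worked example below:
select_model(-1728, -1698, -1697, m = 5)  # "homogeneous Weibull"
```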
We generate data from a 5-component homogeneous Weibull system — the “middle” model in the chain — and show that the LRT cascade correctly identifies it. (The simulated data frame df used in the fits below is drawn with the package's rdata() sampler, in the same way as the heterogeneous example that follows.)
m <- 5
theta_hom <- c(k = 1.5, beta1 = 300, beta2 = 400, beta3 = 500,
beta4 = 600, beta5 = 700)
model_hom <- wei_series_homogeneous_md_c1_c2_c3()
# Right-censoring time at the 80th percentile of the system lifetime
beta_sys <- wei_series_system_scale(theta_hom[1], theta_hom[-1])
tau <- qweibull(0.8, shape = theta_hom[1], scale = beta_sys)
cat("System scale:", round(beta_sys, 1), "\n")
#> System scale: 152.9
cat("Right-censoring time (q = 0.8):", round(tau, 1), "\n")
#> Right-censoring time (q = 0.8): 210

# Exponential (m params)
model_exp <- exp_series_md_c1_c2_c3()
fit_e <- fit(model_exp)(df, par = rep(0.002, m), method = "Nelder-Mead")
# Homogeneous Weibull (m+1 params)
fit_h <- fit(model_hom)(df, par = c(1, rep(500, m)),
method = "Nelder-Mead", control = list(maxit = 5000))
# Heterogeneous Weibull (2m params) — warm-start from hom fit
model_het <- wei_series_md_c1_c2_c3()
het_start <- as.numeric(rbind(rep(fit_h$par[1], m), fit_h$par[-1]))
fit_w <- fit(model_het)(df, par = het_start,
method = "Nelder-Mead", control = list(maxit = 10000))

np_exp <- m
np_hom <- m + 1
np_het <- 2 * m
n_obs <- nrow(df)
# LRT: heterogeneous vs homogeneous
LRT_shape <- -2 * (fit_h$loglik - fit_w$loglik)
df_shape <- np_het - np_hom # m - 1 = 4
p_shape <- pchisq(LRT_shape, df = df_shape, lower.tail = FALSE)
# LRT: homogeneous vs exponential
LRT_aging <- -2 * (fit_e$loglik - fit_h$loglik)
df_aging <- np_hom - np_exp # 1
p_aging <- pchisq(LRT_aging, df = df_aging, lower.tail = FALSE)
# AIC and BIC
aic_val <- function(ll, k) -2 * ll + 2 * k
bic_val <- function(ll, k, n) -2 * ll + log(n) * k
comparison <- data.frame(
Model = c("Exponential", "Hom. Weibull", "Het. Weibull"),
Params = c(np_exp, np_hom, np_het),
LogLik = c(fit_e$loglik, fit_h$loglik, fit_w$loglik),
AIC = c(aic_val(fit_e$loglik, np_exp),
aic_val(fit_h$loglik, np_hom),
aic_val(fit_w$loglik, np_het)),
BIC = c(bic_val(fit_e$loglik, np_exp, n_obs),
bic_val(fit_h$loglik, np_hom, n_obs),
bic_val(fit_w$loglik, np_het, n_obs))
)
knitr::kable(comparison, digits = 2,
caption = "Model comparison (true model: homogeneous Weibull)",
col.names = c("Model", "# Params", "Log-Lik", "AIC", "BIC"))

| Model | # Params | Log-Lik | AIC | BIC |
|---|---|---|---|---|
| Exponential | 5 | -1728 | 3465 | 3484 |
| Hom. Weibull | 6 | -1698 | 3407 | 3430 |
| Het. Weibull | 10 | -1697 | 3414 | 3451 |
cat("Step 1: Het vs Hom (shape heterogeneity)\n")
#> Step 1: Het vs Hom (shape heterogeneity)
cat(sprintf(" LRT = %.2f, df = %d, p = %.4f\n", LRT_shape, df_shape, p_shape))
#> LRT = 1.36, df = 4, p = 0.8514
if (p_shape < 0.05) {
cat(" Decision: Reject homogeneous -> adopt heterogeneous model\n\n")
} else {
cat(" Decision: Fail to reject -> retain homogeneous model\n\n")
}
#> Decision: Fail to reject -> retain homogeneous model
cat("Step 2: Hom vs Exp (aging)\n")
#> Step 2: Hom vs Exp (aging)
cat(sprintf(" LRT = %.2f, df = %d, p = %.4f\n", LRT_aging, df_aging, p_aging))
#> LRT = 59.58, df = 1, p = 0.0000
if (p_aging < 0.05) {
cat(" Decision: Reject exponential -> adopt homogeneous model\n")
} else {
cat(" Decision: Fail to reject -> retain exponential model\n")
}
#> Decision: Reject exponential -> adopt homogeneous model

Since the true model is homogeneous Weibull with \(k = 1.5\), we expect step 1 to fail to reject (retaining the homogeneous model) and step 2 to reject the exponential model. The cascade reaches exactly this conclusion, and AIC and BIC agree, both attaining their smallest values at the homogeneous model.
We now generate data from a system where components have meaningfully different shapes, and show that the LRT detects this.
# Shapes span DFR to strong IFR
theta_het_true <- c(0.8, 300, # DFR (electronics)
1.0, 400, # CFR (random)
1.5, 500, # mild IFR (seals)
2.0, 600, # moderate IFR (bearings)
2.5, 700) # strong IFR (wear-out)
model_het <- wei_series_md_c1_c2_c3()
gen_het <- rdata(model_het)
set.seed(7231)
df_het <- gen_het(theta_het_true, n = 300, p = 0.2,
observe = observe_right_censor(tau = tau))
cat("Observation types:\n")
#> Observation types:
print(table(df_het$omega))
#>
#> exact right
#> 239 61

fit_e2 <- fit(model_exp)(df_het, par = rep(0.002, m), method = "Nelder-Mead")
fit_h2 <- fit(model_hom)(df_het, par = c(1, rep(500, m)),
method = "Nelder-Mead", control = list(maxit = 5000))
het_start2 <- as.numeric(rbind(rep(fit_h2$par[1], m), fit_h2$par[-1]))
fit_w2 <- fit(model_het)(df_het, par = het_start2,
method = "Nelder-Mead", control = list(maxit = 10000))

# LRT: heterogeneous vs homogeneous
LRT_shape2 <- -2 * (fit_h2$loglik - fit_w2$loglik)
p_shape2 <- pchisq(LRT_shape2, df = m - 1, lower.tail = FALSE)
# LRT: homogeneous vs exponential
LRT_aging2 <- -2 * (fit_e2$loglik - fit_h2$loglik)
p_aging2 <- pchisq(LRT_aging2, df = 1, lower.tail = FALSE)
comparison2 <- data.frame(
Model = c("Exponential", "Hom. Weibull", "Het. Weibull"),
Params = c(np_exp, np_hom, np_het),
LogLik = c(fit_e2$loglik, fit_h2$loglik, fit_w2$loglik),
AIC = c(aic_val(fit_e2$loglik, np_exp),
aic_val(fit_h2$loglik, np_hom),
aic_val(fit_w2$loglik, np_het)),
BIC = c(bic_val(fit_e2$loglik, np_exp, nrow(df_het)),
bic_val(fit_h2$loglik, np_hom, nrow(df_het)),
bic_val(fit_w2$loglik, np_het, nrow(df_het)))
)
knitr::kable(comparison2, digits = 2,
caption = "Model comparison (true model: heterogeneous Weibull)",
col.names = c("Model", "# Params", "Log-Lik", "AIC", "BIC"))

| Model | # Params | Log-Lik | AIC | BIC |
|---|---|---|---|---|
| Exponential | 5 | -1617 | 3244 | 3263 |
| Hom. Weibull | 6 | -1617 | 3246 | 3268 |
| Het. Weibull | 10 | -1538 | 3095 | 3132 |
cat(sprintf("\nStep 1: Het vs Hom — LRT = %.2f, df = %d, p = %.4f\n",
LRT_shape2, m - 1, p_shape2))
#>
#> Step 1: Het vs Hom — LRT = 158.43, df = 4, p = 0.0000
cat(sprintf("Step 2: Hom vs Exp — LRT = %.2f, df = %d, p = %.4f\n",
LRT_aging2, 1, p_aging2))
#> Step 2: Hom vs Exp — LRT = 0.29, df = 1, p = 0.5902

With true shapes spanning 0.8 to 2.5, the cascade correctly rejects the homogeneous model at step 1 and adopts the heterogeneous model.
Note that the homogeneous vs exponential test (step 2) may fail to reject even though component aging is present. This is because the pooled shape estimate averages across DFR (\(k < 1\)) and IFR (\(k > 1\)) components, yielding \(\hat{k} \approx 1\) — which is close to exponential. The top-down cascade avoids this trap by testing heterogeneity first.
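This averaging effect can be seen directly. The local log-log slope of the system cumulative hazard, \(k_{\text{eff}}(t) = \sum_j k_j w_j(t)\) with weights \(w_j(t) \propto (t/\beta_j)^{k_j}\), is a weighted mean of the component shapes. Evaluating it at a typical failure time for the shapes and scales above (a standalone check, not package code):

```r
shapes <- c(0.8, 1.0, 1.5, 2.0, 2.5)
scales <- c(300, 400, 500, 600, 700)

# Effective shape: d log H(t) / d log t for H(t) = sum_j (t / beta_j)^k_j
effective_shape <- function(t, k, beta) {
  w <- (t / beta)^k
  sum(k * w) / sum(w)
}

effective_shape(150, shapes, scales)  # ~1.05: close to exponential
```

Despite individual shapes ranging from 0.8 to 2.5, the system behaves almost like an exponential at typical failure times, which is why the pooled test can be fooled.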
The worked examples show the cascade on individual datasets. We now assess the statistical power of the shape heterogeneity test (step 1) as a function of how different the component shapes actually are.
We parameterize shape heterogeneity by the coefficient of variation (CV) of the shape parameters. All \(m = 5\) shapes are set to \(k_j = \bar{k}(1 + \text{cv} \cdot z_j)\) where \(\bar{k} = 1.5\) and \(z_1, \ldots, z_m\) are fixed offsets with mean 0 and standard deviation 1.
| CV | Meaning |
|---|---|
| 0% | All shapes equal — homogeneous model is correct |
| 5–10% | Mild heterogeneity — hard to detect |
| 20–30% | Moderate heterogeneity — detectable with adequate \(n\) |
| 50% | Strong heterogeneity — reliably detected |
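For example, at cv = 0.20 the construction yields shapes whose sample coefficient of variation is exactly 20% (this mirrors the simulation code below; the snippet is self-contained base R):

```r
base_k <- 1.5
offsets <- c(-2, -1, 0, 1, 2)
offsets <- offsets / sd(offsets)   # rescale to mean 0, sd 1
shapes <- base_k * (1 + 0.20 * offsets)
round(shapes, 3)
#> [1] 1.121 1.310 1.500 1.690 1.879
sd(shapes) / mean(shapes)          # 0.2
```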
Other settings: \(n = 500\) observations, \(p = 0.2\) (masking probability), right-censoring at the 80th percentile of the system lifetime under the homogeneous baseline, \(\alpha = 0.05\).
set.seed(42)
m <- 5
base_k <- 1.5
base_scales <- c(300, 400, 500, 600, 700)
cv_levels <- c(0, 0.05, 0.10, 0.20, 0.30, 0.50)
n_reps <- 200
n_obs <- 500
p_mask <- 0.2
alpha <- 0.05
# Fixed offsets: mean 0, sd 1
offsets <- c(-2, -1, 0, 1, 2)
offsets <- offsets / sd(offsets)
# Right-censoring time (based on homogeneous baseline at CV = 0)
beta_sys <- wei_series_system_scale(base_k, base_scales)
tau <- qweibull(0.8, shape = base_k, scale = beta_sys)
# Models and solvers
model_exp <- exp_series_md_c1_c2_c3()
model_hom <- wei_series_homogeneous_md_c1_c2_c3()
model_het <- wei_series_md_c1_c2_c3()
solver_exp <- fit(model_exp)
solver_hom <- fit(model_hom)
solver_het <- fit(model_het)
gen_het <- rdata(model_het)
np_exp <- m
np_hom <- m + 1
np_het <- 2 * m

results_list <- vector("list", length(cv_levels) * n_reps)
idx <- 1
for (cv in cv_levels) {
shapes <- base_k * (1 + cv * offsets)
theta <- as.numeric(rbind(shapes, base_scales))
for (r in seq_len(n_reps)) {
df_r <- gen_het(theta, n = n_obs, p = p_mask,
observe = observe_right_censor(tau = tau))
row <- list(cv = cv, rep = r,
reject_het_vs_hom = NA, reject_hom_vs_exp = NA,
ll_exp = NA, ll_hom = NA, ll_het = NA,
aic_exp = NA, aic_hom = NA, aic_het = NA,
bic_exp = NA, bic_hom = NA, bic_het = NA,
converged_all = FALSE)
tryCatch({
fe <- solver_exp(df_r, par = rep(0.002, m), method = "Nelder-Mead")
# Two-phase for hom: Nelder-Mead warm-up -> L-BFGS-B refinement
fh_nm <- solver_hom(df_r, par = c(1, rep(500, m)),
method = "Nelder-Mead",
control = list(maxit = 5000))
fh <- solver_hom(df_r, par = fh_nm$par,
method = "L-BFGS-B", lower = rep(1e-6, np_hom))
# Two-phase for het: warm-start from hom, Nelder-Mead -> L-BFGS-B
het_start <- as.numeric(rbind(rep(fh$par[1], m), fh$par[-1]))
fw_nm <- solver_het(df_r, par = het_start,
method = "Nelder-Mead",
control = list(maxit = 10000))
fw <- solver_het(df_r, par = fw_nm$par,
method = "L-BFGS-B", lower = rep(1e-6, np_het))
if (fe$converged && fh$converged && fw$converged) {
row$ll_exp <- fe$loglik
row$ll_hom <- fh$loglik
row$ll_het <- fw$loglik
row$aic_exp <- -2 * fe$loglik + 2 * np_exp
row$aic_hom <- -2 * fh$loglik + 2 * np_hom
row$aic_het <- -2 * fw$loglik + 2 * np_het
row$bic_exp <- -2 * fe$loglik + log(n_obs) * np_exp
row$bic_hom <- -2 * fh$loglik + log(n_obs) * np_hom
row$bic_het <- -2 * fw$loglik + log(n_obs) * np_het
LRT_het <- -2 * (fh$loglik - fw$loglik)
LRT_hom <- -2 * (fe$loglik - fh$loglik)
row$reject_het_vs_hom <- pchisq(LRT_het, df = m - 1,
lower.tail = FALSE) < alpha
row$reject_hom_vs_exp <- pchisq(LRT_hom, df = 1,
lower.tail = FALSE) < alpha
row$converged_all <- TRUE
}
}, error = function(e) NULL)
results_list[[idx]] <- row
idx <- idx + 1
}
cat(sprintf("CV = %.0f%% complete\n", cv * 100))
}
power_study <- do.call(rbind, lapply(results_list, as.data.frame))

# mc_data holds the study results used for the summaries below
ps <- mc_data$power_study
ps_conv <- ps[ps$converged_all, ]
rej_summary <- do.call(rbind, lapply(split(ps_conv, ps_conv$cv), function(d) {
data.frame(
CV_pct = d$cv[1] * 100,
n_converged = nrow(d),
reject_het_vs_hom = mean(d$reject_het_vs_hom),
reject_hom_vs_exp = mean(d$reject_hom_vs_exp)
)
}))
knitr::kable(rej_summary, digits = 3, row.names = FALSE,
caption = sprintf(
"Rejection rates at alpha = %.2f (n = %d, m = %d)",
mc_data$config$alpha, mc_data$config$n, mc_data$config$m),
col.names = c("Shape CV (%)", "Converged",
"Reject Hom (Het vs Hom)",
"Reject Exp (Hom vs Exp)"))

| Shape CV (%) | Converged | Reject Hom (Het vs Hom) | Reject Exp (Hom vs Exp) |
|---|---|---|---|
| 0 | 200 | 0.045 | 1.00 |
| 5 | 200 | 0.050 | 1.00 |
| 10 | 200 | 0.145 | 1.00 |
| 20 | 200 | 0.350 | 1.00 |
| 30 | 200 | 0.860 | 0.78 |
| 50 | 200 | 1.000 | 0.99 |
plot(rej_summary$CV_pct, rej_summary$reject_het_vs_hom,
type = "b", pch = 19, col = "steelblue", lwd = 2,
xlab = "Coefficient of Variation of Shapes (%)",
ylab = "Rejection Rate",
ylim = c(0, 1),
main = "Power of the LRT Cascade")
lines(rej_summary$CV_pct, rej_summary$reject_hom_vs_exp,
type = "b", pch = 17, col = "firebrick", lwd = 2, lty = 2)
abline(h = mc_data$config$alpha, lty = 3, col = "gray50")
legend("right",
c("Het vs Hom (shape heterogeneity)",
"Hom vs Exp (aging)",
expression(alpha == 0.05)),
col = c("steelblue", "firebrick", "gray50"),
lty = c(1, 2, 3), pch = c(19, 17, NA), lwd = 2, cex = 0.85)
grid()

At CV = 0% (no heterogeneity), the rejection rate of the shape heterogeneity test should be near the nominal level \(\alpha = 0.05\) — the Type I error rate. As CV increases, power rises, reaching essentially 1 by CV = 50%.
The aging test (hom vs exp) maintains high rejection rates across all CV levels, because the baseline shape \(\bar{k} = 1.5\) is far from 1 regardless of how much the individual shapes vary around it.
We also evaluate how often each information criterion selects the correct model. At CV = 0, the correct model is homogeneous Weibull. At CV > 0, it is heterogeneous Weibull.
aic_correct <- do.call(rbind, lapply(split(ps_conv, ps_conv$cv), function(d) {
cv <- d$cv[1]
aic_choice <- ifelse(d$aic_het < d$aic_hom & d$aic_het < d$aic_exp, "het",
ifelse(d$aic_hom < d$aic_exp, "hom", "exp"))
bic_choice <- ifelse(d$bic_het < d$bic_hom & d$bic_het < d$bic_exp, "het",
ifelse(d$bic_hom < d$bic_exp, "hom", "exp"))
correct_model <- if (cv == 0) "hom" else "het"
data.frame(
CV_pct = cv * 100,
AIC_correct = mean(aic_choice == correct_model),
BIC_correct = mean(bic_choice == correct_model)
)
}))
knitr::kable(aic_correct, digits = 3, row.names = FALSE,
caption = "Proportion selecting the correct model",
col.names = c("Shape CV (%)", "AIC Correct", "BIC Correct"))

| Shape CV (%) | AIC Correct | BIC Correct |
|---|---|---|
| 0 | 0.925 | 1.000 |
| 5 | 0.080 | 0.000 |
| 10 | 0.210 | 0.000 |
| 20 | 0.445 | 0.025 |
| 30 | 0.920 | 0.405 |
| 50 | 1.000 | 1.000 |
Based on the Monte Carlo study and general theory, we offer the following recommendations for practitioners:
| Shape CV | LRT Power | Recommended Approach |
|---|---|---|
| 0% (identical) | \(\sim\alpha\) (nominal) | Homogeneous Weibull — no benefit to extra parameters |
| \(< 10\%\) | Low (\(< 20\%\)) | Prefer homogeneous — heterogeneity is undetectable |
| 10–25% | Moderate | Run the cascade; decision depends on \(n\) and censoring |
| \(> 25\%\) | High (\(> 80\%\)) | Heterogeneous Weibull — shape differences are reliably detected |
The aging test is almost always decisive. When the true common shape is far from 1, even moderate sample sizes (\(n \geq 200\)) overwhelmingly reject the exponential model. The choice between exponential and Weibull is rarely ambiguous.
The heterogeneity test is the crux. Power depends strongly on how different the component shapes are. With \(n = 500\) and \(m = 5\), the test has good power for CV \(\geq 25\%\) but is nearly blind to CV \(< 10\%\). Practitioners should therefore weigh the plausible magnitude of shape differences against the sample size, the masking probability, and the severity of censoring before committing to the heterogeneous model.
For extensive sensitivity analysis varying \(n\), \(m\), \(p\), and censoring schemes, see the companion paper on model selection for masked series systems.