Package Ecosystem: Reliability Analysis with Masked Failure Data

The Problem

In reliability engineering, a series system fails when any of its \(m\) components fails. When the system is observed in the field, we often know when it failed but not which component caused the failure. Instead, we observe a candidate set — a subset of components that plausibly contains the true cause. This is masked failure data.

The statistical challenge is to estimate component-level lifetime parameters from system-level observations where the failure cause is partially or fully masked and the failure time may be censored (right, left, or interval).
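To make the data structure concrete, here is a minimal base R sketch that simulates masked, right-censored observations from a three-component exponential series system. It does not use any package from this ecosystem; the rates, the censoring time `tau`, and the masking probability `p_mask` are illustrative assumptions.

```r
# Simulate masked failure data for a 3-component exponential series system.
# All names and values here are illustrative, not package API.
set.seed(1)
n      <- 200
rates  <- c(0.5, 1.0, 1.5)  # hypothetical component failure rates
m      <- length(rates)
tau    <- 1.2               # right-censoring time
p_mask <- 0.3               # chance a non-failed component joins the candidate set

# Component lifetimes: one row per system, one column per component
lifetimes <- matrix(rexp(n * m, rate = rep(rates, each = n)), nrow = n)
sys_time  <- apply(lifetimes, 1, min)        # the system fails at the first failure
cause     <- apply(lifetimes, 1, which.min)  # latent: hidden from the analyst

observed <- sys_time <= tau                  # FALSE means right-censored at tau
t_obs    <- pmin(sys_time, tau)

# Candidate sets as a logical matrix: the true cause is always included,
# every other component is included independently with probability p_mask.
cand <- matrix(runif(n * m) < p_mask, nrow = n)
cand[cbind(seq_len(n), cause)] <- TRUE
cand[!observed, ] <- FALSE                   # censored systems have no candidate set

head(data.frame(t = round(t_obs, 3), observed, cand))
```

The analyst sees only `t_obs`, `observed`, and `cand`; the `cause` column is what the masking hides.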

This vignette describes a family of R packages that address this problem at different levels of generality, from foundational distribution algebra through closed-form maximum likelihood estimation to fully general numerical inference.

Package Ecosystem

The packages form a layered dependency graph. Solid arrows indicate Imports dependencies; the dashed arrow indicates a Suggests dependency used for cross-validation testing.

    algebraic.dist ─────────────────────────┐
         │                                  │
         ▼                                  │
    algebraic.mle                           │
         │                                  │
         ├──────────────┐                   │
         ▼              ▼                   ▼
  compositional.mle  likelihood.model   generics
                        │       │          │
                   ┌────┘       └────┐     │
                   ▼                 ▼     │
                flexhaz        maskedcauses
                   │                 │
                   ▼                 │
                serieshaz            │
                   │                 │
                   └──────┐  ┌──────┘
                          ▼  ▼
                        maskedhaz

All packages are available on r-universe. Packages marked with [CRAN] below are published on CRAN; the remainder are targeting CRAN submission.

Foundation: Distribution and MLE Infrastructure

Four packages provide the algebraic and inferential foundation that the reliability-specific packages build on.

algebraic.dist [CRAN] — An algebra over probability distributions. Compose, sample, and auto-simplify distributions (normal, exponential, multivariate normal, empirical) using S3 generics. Provides the distribution abstraction used throughout the ecosystem.

algebraic.mle [CRAN] — An algebra over maximum likelihood estimators. Unified interface for mle, mle_numerical, mle_boot, and rmap_mle objects with delta-method and bootstrap inference. The MLE result class used by fitting functions across the ecosystem.

likelihood.model [CRAN] — A Fisherian likelihood framework. Defines the core generics — loglik, score, hess_loglik, fim, observed_info, rdata, assumptions — that all likelihood models in this ecosystem implement. Contribution-based models support heterogeneous observation types (exact, right-censored, left-censored, interval-censored) in a single likelihood.

compositional.mle [CRAN] — Composable MLE solvers as first-class functions. Sequential chaining (%>>%), parallel racing (%|%), and random restarts turn solver construction into an algebra. Useful for building robust optimization strategies for the multi-modal likelihoods that arise in masked data problems.
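The idea behind chaining and restarts can be illustrated without the package's operators. The sketch below uses only `stats::optim`; the objective `obj` is a made-up multi-modal stand-in for a negative log-likelihood, and `chain()` is a hypothetical helper, not the compositional.mle API.

```r
# Stand-in objective with several local minima (illustrative only)
obj <- function(x) sum(x^2) / 10 + sum(sin(3 * x))

# Sequential chaining: a derivative-free search, then a gradient-based polish
chain <- function(start) {
  coarse <- optim(start, obj, method = "Nelder-Mead")
  optim(coarse$par, obj, method = "BFGS")
}

# Random restarts with a race: run the chain from several starts, keep the best
set.seed(7)
starts <- replicate(20, runif(2, -5, 5), simplify = FALSE)
fits   <- lapply(starts, chain)
values <- vapply(fits, function(f) f$value, numeric(1))
best   <- fits[[which.min(values)]]

best$par
best$value
```

Comparing `values` across starts shows how many distinct local optima the solver lands in, which is a quick diagnostic for multi-modality.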

Hazard-Based Components: flexhaz

flexhaz — Define a lifetime distribution by specifying its hazard function \(h(t; \boldsymbol{\theta})\) and get the survival function, density, CDF, quantiles, random sampling, and MLE automatically. The dfr_dist class (“dynamic failure rate distribution”) is the building block for component-level models.

Built-in constructors include dfr_exponential(), dfr_weibull(), dfr_gompertz(), and dfr_loglogistic(), but any user-defined hazard function works. flexhaz implements the likelihood_model interface from likelihood.model, so every dfr_dist object can be used directly with loglik(), score(), fit(), and friends.
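The relationship that makes this possible is \(S(t) = \exp\left(-\int_0^t h(u)\,du\right)\), with density \(f(t) = h(t)\,S(t)\). The following base R sketch shows the idea with numerical integration; `dist_from_hazard()` is a hypothetical helper written for this illustration and is not how flexhaz is necessarily implemented.

```r
# Build survival, CDF, and density from a hazard function alone.
# dist_from_hazard() is a hypothetical helper, not part of flexhaz.
dist_from_hazard <- function(h) {
  # Cumulative hazard H(t): integral of h from 0 to t, one value per element of t
  H <- function(t) vapply(t, function(ti) integrate(h, 0, ti)$value, numeric(1))
  list(
    hazard   = h,
    survival = function(t) exp(-H(t)),
    cdf      = function(t) 1 - exp(-H(t)),
    density  = function(t) h(t) * exp(-H(t))
  )
}

# Weibull hazard with shape k and scale beta: h(t) = (k / beta) * (t / beta)^(k - 1)
k <- 2
beta <- 3
d <- dist_from_hazard(function(t) (k / beta) * (t / beta)^(k - 1))

# Compare against base R's Weibull functions at a few time points
t <- c(0.5, 1, 2, 4)
all.equal(d$survival(t), pweibull(t, shape = k, scale = beta, lower.tail = FALSE))
all.equal(d$density(t), dweibull(t, shape = k, scale = beta))
```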

Series System Topology: serieshaz

serieshaz — Compose dfr_dist components from flexhaz into a series system. For a series system of \(m\) components, the system hazard is the sum of component hazards:

\[h_{\text{sys}}(t) = \sum_{j=1}^{m} h_j(t; \boldsymbol{\theta}_j)\]

and the system survival is the product of component survivals:

\[S_{\text{sys}}(t) = \prod_{j=1}^{m} S_j(t; \boldsymbol{\theta}_j).\]
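Both identities assume independent component lifetimes, and they are two views of the same fact: exponentiating the negative integral of the summed hazard gives the product of survivals. A quick numerical check in base R, with component parameters chosen purely for illustration:

```r
# Three components: two Weibull and one exponential (illustrative parameters)
haz <- list(
  function(t) (2 / 3) * (t / 3),      # Weibull hazard, shape 2, scale 3
  function(t) (3 / 5) * (t / 5)^2,    # Weibull hazard, shape 3, scale 5
  function(t) rep(0.1, length(t))     # exponential hazard, rate 0.1
)
surv <- list(
  function(t) pweibull(t, shape = 2, scale = 3, lower.tail = FALSE),
  function(t) pweibull(t, shape = 3, scale = 5, lower.tail = FALSE),
  function(t) pexp(t, rate = 0.1, lower.tail = FALSE)
)

# System hazard is the sum; system survival is the product
h_sys <- function(t) Reduce(`+`, lapply(haz, function(h) h(t)))
S_sys <- function(t) Reduce(`*`, lapply(surv, function(S) S(t)))

# exp(-integral of h_sys) should agree with the product of survivals
t <- c(0.5, 1, 2, 4)
S_from_hazard <- vapply(
  t, function(ti) exp(-integrate(h_sys, 0, ti)$value), numeric(1)
)
all.equal(S_from_hazard, S_sys(t))
```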

The dfr_dist_series class inherits all flexhaz methods (density, survival, quantiles, sampling, MLE) and adds a parameter layout system that maps a flat parameter vector to per-component slices — enabling standard optimizers to work on the full system likelihood.
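A parameter layout of this kind can be sketched in a few lines of base R. The names `n_par`, `layout`, and `h_sys` below are hypothetical and only illustrate the mapping from a flat vector to per-component slices; serieshaz's actual representation may differ.

```r
# Hypothetical layout: component 1 is Weibull (shape, scale), component 2 is
# exponential (rate), component 3 is Weibull (shape, scale).
n_par  <- c(2, 1, 2)
layout <- split(seq_len(sum(n_par)), rep(seq_along(n_par), times = n_par))

# Per-component hazards, each taking its own parameter slice p
weibull_haz <- function(t, p) (p[1] / p[2]) * (t / p[2])^(p[1] - 1)
exp_haz     <- function(t, p) rep(p[1], length(t))
haz <- list(weibull_haz, exp_haz, weibull_haz)

# System hazard as a function of the flat vector an optimizer would see
h_sys <- function(t, theta) {
  Reduce(`+`, Map(function(h, idx) h(t, theta[idx]), haz, layout))
}

theta <- c(2, 3, 0.1, 3, 5)  # flat parameter vector
layout                       # index slices: 1:2, 3, 4:5
h_sys(1, theta)
```

Because `h_sys()` takes the flat vector directly, it can be handed to a general-purpose optimizer without the optimizer knowing anything about component boundaries.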

This Package: maskedcauses

maskedcauses (this package) provides maximum likelihood estimation for masked series system data under the standard masking conditions (C1, C2, C3), using closed-form likelihood expressions where they are available. Three likelihood models form a natural nesting chain:

| Model | Parameters | Constructor |
|---|---|---|
| Exponential series | \(m\) rates: \((\lambda_1, \ldots, \lambda_m)\) | exp_series_md_c1_c2_c3() |
| Homogeneous Weibull series | \(m + 1\): \((k, \beta_1, \ldots, \beta_m)\) | wei_series_homogeneous_md_c1_c2_c3() |
| Heterogeneous Weibull series | \(2m\): \((k_1, \beta_1, \ldots, k_m, \beta_m)\) | wei_series_md_c1_c2_c3() |

Each nesting level can be tested with a likelihood ratio test (see vignette("model_selection") for a complete treatment).
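The degrees of freedom follow from the parameter counts in the table: fixing the shared shape at \(k = 1\) reduces the homogeneous Weibull model to the exponential model (one restriction), and forcing \(k_1 = \cdots = k_m\) reduces the heterogeneous model to the homogeneous one (\(m - 1\) restrictions). A minimal sketch of the test arithmetic follows; `lrt_nested()` is a hypothetical helper, and the log-likelihood values are placeholders that only demonstrate the call, not results from any fit.

```r
# lrt_nested() is a hypothetical helper, not part of maskedcauses.
lrt_nested <- function(loglik_null, loglik_alt, df) {
  stat <- 2 * (loglik_alt - loglik_null)
  c(statistic = stat,
    df        = df,
    p.value   = pchisq(stat, df = df, lower.tail = FALSE))
}

m <- 3
df_exp_vs_hom <- (m + 1) - m      # exponential vs homogeneous Weibull
df_hom_vs_het <- 2 * m - (m + 1)  # homogeneous vs heterogeneous Weibull

# Placeholder log-likelihoods, purely to show the call
lrt_nested(loglik_null = -250.0, loglik_alt = -247.5, df = df_hom_vs_het)
```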

The key advantage of maskedcauses is analytical tractability: the exponential model has closed-form log-likelihood, score, and Hessian for all four observation types (exact, right, left, interval). The Weibull models have analytical expressions for exact and right-censored observations, with efficient closed-form or numerical expressions for left and interval censoring.
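To illustrate what closed form means here, consider the exponential model with exact and right-censored observations only. The sketch assumes the usual reading of the masking conditions: the candidate set always contains the failed component (C1), its probability is the same whichever of its members failed (C2), and the masking mechanism does not depend on the lifetime parameters (C3). Under those assumptions an exact observation at time \(t_i\) with candidate set \(c_i\) contributes \(-t_i \sum_j \lambda_j + \log \sum_{j \in c_i} \lambda_j\) to the log-likelihood, and a right-censored one contributes \(-t_i \sum_j \lambda_j\). The code is plain base R written for this vignette, not the package's implementation, and the simulation settings are illustrative.

```r
# Simulate masked, right-censored data (illustrative settings)
set.seed(42)
n <- 500
rates <- c(0.5, 1.0, 1.5)
m <- length(rates)
tau <- 1.0
lifetimes <- matrix(rexp(n * m, rate = rep(rates, each = n)), nrow = n)
cause <- apply(lifetimes, 1, which.min)
t_sys <- apply(lifetimes, 1, min)
delta <- t_sys <= tau                 # TRUE if the failure was observed
t_obs <- pmin(t_sys, tau)

# Candidate sets as a 0/1 matrix; the true cause is always included
cand <- matrix(as.numeric(runif(n * m) < 0.4), nrow = n)
cand[cbind(seq_len(n), cause)] <- 1

# Closed-form log-likelihood for exact and right-censored observations
loglik <- function(lambda) {
  s <- drop(cand %*% lambda)          # sum of candidate rates per system
  -sum(t_obs) * sum(lambda) + sum(log(s[delta]))
}

# Closed-form score: d loglik / d lambda_j
score <- function(lambda) {
  s <- drop(cand %*% lambda)
  -sum(t_obs) + colSums(cand[delta, , drop = FALSE] / s[delta])
}

# Maximize with the analytical gradient; rates are constrained to be positive
fit <- optim(rep(1, m), loglik, gr = score, method = "L-BFGS-B",
             lower = 1e-6, control = list(fnscale = -1))
fit$par
```

Supplying the analytical score avoids finite-difference gradients inside the optimizer, which is the practical benefit of the closed-form expressions.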

maskedcauses also provides the masked data infrastructure used across the ecosystem: composable observation functors (observe_right_censor, observe_left_censor, observe_periodic, observe_mixture), candidate set generation under C1/C2/C3 conditions, and the series_md S3 class hierarchy.

General Masked Inference: maskedhaz

maskedhaz — When component lifetimes follow distributions beyond exponential and Weibull — Gompertz, log-logistic, or arbitrary user-defined hazard functions — maskedhaz provides MLE for the masked series system likelihood using numerical methods.

maskedhaz builds on serieshaz for system construction and maskedcauses for masked data infrastructure. It uses numerical integration (stats::integrate) for left and interval-censored contributions and numerical differentiation (numDeriv::jacobian) for the Hessian.

The trade-off is clear: maskedhaz handles any component lifetime distribution that flexhaz can represent, but it is slower and relies on numerical rather than analytical derivatives. For exponential and Weibull components, maskedcauses is the faster and more precise choice.

Cross-validation tests in maskedhaz verify that its numerical results match the analytical results from maskedcauses for the exponential special case, providing confidence in both implementations.
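The flavor of such a check can be shown for a single interval-censored observation. The sketch assumes the contribution for a failure in \((a, b]\) with candidate set \(c\) is \(\int_a^b \sum_{j \in c} h_j(t)\, S_{\text{sys}}(t)\, dt\), which for exponential components has the closed form \(\frac{\lambda_c}{\lambda_{\text{sys}}}\left(e^{-\lambda_{\text{sys}} a} - e^{-\lambda_{\text{sys}} b}\right)\), where \(\lambda_c\) is the sum of the candidate rates and \(\lambda_{\text{sys}}\) the sum of all rates. The rates, interval, and candidate set below are illustrative, and the code is base R rather than either package's test suite.

```r
rates <- c(0.5, 1.0, 1.5)        # illustrative component rates
cand  <- c(TRUE, FALSE, TRUE)    # candidate set {1, 3}
a <- 0.2
b <- 0.9

lam_sys <- sum(rates)
lam_c   <- sum(rates[cand])

# Numerical route: integrate (candidate hazard) * (system survival) over (a, b]
integrand   <- function(t) lam_c * exp(-lam_sys * t)
numeric_val <- integrate(integrand, a, b)$value

# Analytical route: closed form for exponential components
analytic_val <- (lam_c / lam_sys) * (exp(-lam_sys * a) - exp(-lam_sys * b))

all.equal(numeric_val, analytic_val)
```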

Choosing the Right Package

| You need… | Use |
|---|---|
| Component lifetimes are exponential | maskedcauses::exp_series_md_c1_c2_c3() |
| Component lifetimes are Weibull (shared shape) | maskedcauses::wei_series_homogeneous_md_c1_c2_c3() |
| Component lifetimes are Weibull (different shapes) | maskedcauses::wei_series_md_c1_c2_c3() |
| Model selection across the Weibull chain | maskedcauses (see vignette("model_selection")) |
| Component lifetimes are Gompertz, log-logistic, or custom | maskedhaz with flexhaz components |
| Series system distribution (no masking) | serieshaz with flexhaz components |
| Single-component DFR distribution | flexhaz |
| MLE algebra (delta method, bootstrap) | algebraic.mle |
| Composing optimization strategies | compositional.mle |