Model Structure

Overview

The bml package implements Bayesian multiple-membership multilevel models (MMMM) with parameterizable weight functions. These models are designed for data structures where:

Lower-level units (members) can belong to multiple higher-level units (groups)
Groups are composed of multiple members
Group outcomes depend on the combined influence of their constituent members

This structure is common across many empirical settings: political parties participating in multiple coalition governments, individuals participating in multiple teams, spatial units influenced by multiple neighbors, or students in peer networks, where each student is influenced by multiple peers.

The key innovation of the extended MMMM is that it allows researchers to model rather than impose the aggregation mechanism through which member-level effects combine to influence group-level outcomes.

Notation and Data Structure

Let:

\(i = 1, \ldots, N\) index groups (the primary unit of analysis)
\(k = 1, \ldots, K\) index members (lower-level units)
\(j = 1, \ldots, J\) index nesting units (higher-level contexts)
\(p[i]\) denote the set of members belonging to group \(i\)
\(n_i = |p[i]|\) denote the number of members in group \(i\)

Example (Coalition Governments): Groups are governments (\(i\)), members are political parties (\(k\)), and nesting units are countries (\(j\)). The function \(p[i]\) returns all parties in government \(i\), and \(n_i\) is coalition size.

The outcome \(y_i^{(g)}\) is measured at the group level, where superscript \((g)\) denotes the level of measurement. In what follows, we present the model for Gaussian (normal) outcomes, though the framework extends to other distributions (see below).

The Extended Multiple-Membership Multilevel Model

Full Model Specification

The extended MMMM decomposes the group-level outcome into three additive components, each corresponding to a level at which explanatory factors operate:

\[\begin{equation} y_i^{(g)} = \theta_i^{(g)} + \theta_{j[i]}^{(c)} + \theta_i^{(p)} \tag{1} \end{equation}\]

where:

\(\theta_i^{(g)}\) denotes the group-level effect associated with government \(i\).
\(\theta_{j[i]}^{(c)}\) denotes the hierarchical nesting-level effect associated with country \(j[i]\).
\(\theta_i^{(p)}\) denotes the aggregated member-level effect, summarizing party-level contributions within government \(i\).

Each component includes both systematic effects (observed covariates) and random effects (unobserved heterogeneity).

Group-Level Effect

The group-level effect models the direct influence of group-level covariates and group-specific unobserved factors:

\[\begin{equation} \theta_i^{(g)} = \mathbf{\beta}^{(g)} \cdot \mathbf{x}_i^{(g)} + u_i^{(g)}, \quad u_i^{(g)} \overset{\text{iid}}{\sim} N(0, \sigma_{u^{(g)}}^2) \tag{2} \end{equation}\]

where:

\(\mathbf{x}_i^{(g)} = [1, x_{1i}^{(g)}, \ldots, x_{Gi}^{(g)}]\) is a vector of group-level covariates
\(\mathbf{\beta}^{(g)} = [\beta_0^{(g)}, \beta_1^{(g)}, \ldots, \beta_G^{(g)}]\) are the corresponding regression coefficients
\(u_i^{(g)}\) are independently and identically distributed (i.i.d.) random effects
\(\sigma_{u^{(g)}}^2\) captures unobserved heterogeneity at the group level

Example: For government survival, \(\mathbf{x}_i^{(g)}\) might include majority status, ideological heterogeneity, or economic conditions.

Hierarchical Nesting-Level Effect

When groups are nested within higher-level contexts, the nesting-level effect accounts for hierarchical dependencies:

\[\begin{equation} \theta_j^{(c)} = \mathbf{\beta}^{(c)} \cdot \mathbf{x}_j^{(c)} + u_j^{(c)}, \quad u_j^{(c)} \overset{\text{iid}}{\sim} N(0, \sigma_{u^{(c)}}^2) \tag{3} \end{equation}\]

where:

\(\mathbf{x}_j^{(c)} = [x_{1j}^{(c)}, \ldots, x_{Cj}^{(c)}]\) are nesting-level covariates
\(\mathbf{\beta}^{(c)} = [\beta_1^{(c)}, \ldots, \beta_C^{(c)}]\) are the corresponding coefficients
\(u_j^{(c)}\) are i.i.d. random effects
\(\sigma_{u^{(c)}}^2\) captures unobserved heterogeneity at the nesting level

Including \(u_j^{(c)}\) allows groups within the same nesting unit to be more similar than groups across nesting units, even after conditioning on observed covariates.

Example: Governments are nested within countries. Country-level variables might include electoral system or investiture requirements. The random effect captures country-specific factors affecting all governments in that country.

Note: Multiple hm() blocks can be specified to accommodate cross-classified structures (e.g., groups nested within multiple hierarchical contexts).

Aggregated Member-Level Effect

The core innovation of the extended MMMM is modeling how member-level effects aggregate to influence group outcomes. Individual member effects are first specified, then aggregated via a weighted sum:

\[\begin{equation} \theta_i^{(p)} = \sum_{k \in p[i]} w_{ik} \left( \mathbf{\beta}^{(p)} \cdot \mathbf{x}_{ik}^{(p)} + u_k^{(p)} \right) \tag{4} \end{equation}\]

This can be decomposed into systematic and random components:

\[\begin{equation} \theta_i^{(p)} = \underbrace{\mathbf{\beta}^{(p)} \sum_{k \in p[i]} w_{ik} \mathbf{x}_{ik}^{(p)}}_{\text{Systematic component}} + \underbrace{\sum_{k \in p[i]} w_{ik} u_k^{(p)}}_{\text{Random component}}, \quad u_k^{(p)} \overset{\text{iid}}{\sim} N(0, \sigma_{u^{(p)}}^2) \tag{5} \end{equation}\]

where:

\(\mathbf{x}_{ik}^{(p)} = [x_{1ik}^{(p)}, \ldots, x_{Pik}^{(p)}]\) are member-level covariates
\(\mathbf{\beta}^{(p)} = [\beta_1^{(p)}, \ldots, \beta_P^{(p)}]\) are structural regression coefficients measuring the effect of member attributes on group outcomes
\(w_{ik}\) are aggregation weights determining how member \(k\)’s contribution enters group \(i\)’s outcome
\(u_k^{(p)}\) are i.i.d. member-specific random effects
\(\sigma_{u^{(p)}}^2\) captures unobserved heterogeneity at the member level

Interpretation of Weights: The weights \(w_{ik}\) define the micro-macro link — how individual member characteristics combine to produce group-level attributes. Since the structural effect \(\mathbf{\beta}^{(p)}\) is constant, the weights serve to aggregate the member-level covariate \(\mathbf{x}_{ik}^{(p)}\) into a group-level construct \(\sum_{k \in p[i]} w_{ik} \mathbf{x}_{ik}^{(p)}\).

Covariance Structure: Including member-specific random effects \(u_k^{(p)}\) induces covariance among groups sharing members. The covariance between groups \(i\) and \(i'\) is:

\[\begin{equation} \text{Cov}(y_i^{(g)}, y_{i'}^{(g)}) = \sigma_{u^{(p)}}^2 \sum_k w_{ik} w_{i'k} \tag{6} \end{equation}\]

This increases with the degree of member overlap and accounts for dependencies across groups involving recurring members.

Example: In coalition governments, \(\mathbf{x}_{ik}^{(p)}\) might include party cohesion or organizational structure. The random effect \(u_k^{(p)}\) captures party-specific unobserved factors that affect all governments in which party \(k\) participates.

Parameterizable Weight Functions

Conventional MMMM: Fixed Weights

The conventional MMMM requires weights to be specified a priori. The most common choice is equal weighting (arithmetic mean):

\[\begin{equation} w_{ik} = \frac{1}{n_i} \tag{7} \end{equation}\]

This assumes all members contribute equally to group outcomes, regardless of their attributes or context.

Limitations: Equal weighting can be theoretically implausible. In coalition governments, for instance, larger parties, parties holding the prime ministership, or ideologically central parties may exert disproportionate influence. In spatial analysis, nearby neighbors may matter more than distant ones. Imposing equal weights when they vary in practice introduces aggregation bias.

Extended MMMM: Parameterizing Weights

The extended MMMM allows weights to be modeled as functions of covariates and estimated parameters. For example,

\[\begin{equation} w_{ik}(\mathbf{x}_{ik}^{(w)}, n_i) = \frac{1}{1 + (n_i - 1) \exp\left(-\mathbf{\beta}^{(w)} \cdot \mathbf{x}_{ik}^{(w)}\right)} \tag{8} \end{equation}\]

subject to the normalization constraint \(\sum_{k \in p[i]} w_{ik} = 1\) for all \(i\).

In bml, this is specified via:

fn(w ~ 1 / (1 + (n - 1) * exp(-(b0 + b1 * x1 + b2 * x2))))

where:

\(\mathbf{x}_{ik}^{(w)} = [1, x_{1ik}^{(w)}, \ldots, x_{Wik}^{(w)}]\) are weight covariates
\(\mathbf{\beta}^{(w)} = [\beta_0^{(w)}, \beta_1^{(w)}, \ldots, \beta_W^{(w)}]\) are weight function parameters to be estimated

Functional Form: Equation (8) generalizes the logistic function by centering it at \(1/n_i\) when \(\mathbf{\beta}^{(w)} \cdot \mathbf{x}_{ik}^{(w)} = 0\). This sets equal weighting as the baseline: absent systematic differences, each member receives weight \(1/n_i\). When \(\mathbf{\beta}^{(w)} \neq 0\), weights deviate from this baseline, reflecting heterogeneous member influence.

Properties:

Bounded: \(w_{ik} \in [0, 1]\)
Baseline-centered: When \(\mathbf{\beta}^{(w)} = 0\), weights reduce to \(w_{ik} = 1/n_i\)
Monotonic: Weights increase/decrease smoothly with the linear predictor
Normalized: Constraint ensures \(\sum_k w_{ik} = 1\) (unless normalization is not desired)

Interpretation of Weight Parameters: While \(\mathbf{\beta}^{(p)}\) measure structural effects of member attributes on group outcomes, \(\mathbf{\beta}^{(w)}\) measure aggregation effects—how member attributes determine their relative importance in the weighted sum. Together, they reveal both what matters (structural effects) and how much different members’ characteristics matter (aggregation effects).

Example: Let \(x_{ik}^{(p)}\) = party cohesion and \(x_{ik}^{(w)}\) = indicator for holding the prime ministership in a government survival model.

If \(\beta^{(w)} = 0\): Prime minister’s cohesion matters as much as coalition partners’ cohesion (equal weights)
If \(\beta^{(w)} > 0\): Prime minister’s cohesion carries more weight
If \(\beta^{(w)} < 0\): Coalition partners’ cohesion carries more weight

Testing Alternative Aggregation Mechanisms

Weight function regression enables empirical tests of substantive hypotheses about aggregation. Group-level functions such as min(), max(), mean(), and sum() can be incorporated directly into weight formulas, and standard mathematical transformations are supported. For example:

1. Mean vs. Extremes: Does the group outcome depend on the average member attribute or on extreme values (minimum/maximum)?

\[w_{ik} = \beta_1^{(w)} \cdot x_{ik}^{(w)} + (1 - \beta_1^{(w)}) \cdot \mathbb{1}\left[x_{ik}^{(w)} = \min_k x_{ik}^{(w)}\right]\]

In bml:

fn(
  w ~ b1 * x + (1 - b1) * (x == min(x)),
  c = TRUE
)

The parameter \(\beta_1^{(w)}\) controls the mixture between aggregation mechanisms. When \(\beta_1^{(w)} = 1\), weights are proportional to \(x_{ik}^{(w)}\) (mean-like aggregation); when \(\beta_1^{(w)} = 0\), all weight is assigned to the member with the minimum value. Intermediate values yield a convex combination of both. Replacing min() with max() tests whether the highest-valued member dominates. To preserve this interpretation, \(\beta_1^{(w)}\) should be constrained to \([0,1]\), e.g., via priors = list(“b1~dunif(0,1)”).

2. Mean vs. Sum: Are member effects contextual (mean aggregation: \(\sum_k w_{ik} = 1\)) or autonomous (sum aggregation: \(\sum_k w_{ik} = n_i\))?

\[w_{ik} = \frac{1}{1 + (n_i - 1) \exp\left(-\beta_0^{(w)}\right)}\]

In bml:

fn(
  w ~ 1 / (1 + (n - 1) * exp(-b0)),
  c = FALSE
)

When \(\beta_0^{(w)} \approx 0\), the weight evaluates to \(1/n_i\), yielding mean aggregation (\(\sum_k w_{ik} = 1\)). As \(\beta_0^{(w)}\) grows large, the weight approaches 1 for each member, yielding sum aggregation (\(\sum_k w_{ik} = n_i\)). Under mean aggregation, a one-unit change in \(x_{ik}^{(p)}\) affects the outcome by \(\beta^{(p)}\) only if all members change. Under sum aggregation, a one-unit change by a single member already produces an effect of \(\beta^{(p)}\).

3. Asymmetric Influence: Do certain members (e.g., larger, more central, more powerful) carry disproportionate weight?

\[ w_{ik} = \frac{1}{1 + (n_i - 1) \exp\left(-(\beta_0^{(w)} + \beta_1^{(w)} x_{1ik}^{(w)} + \beta_2^{(w)} x_{2ik}^{(w)})\right)} \]

In bml:

fn(
  w ~ 1 / (1 + (n - 1) * exp(-(b0 + b1 * x1 + b2 * x2))),
  c = TRUE
)

This weight function tests whether member attributes \(x_1, x_2\) (e.g., size, organization, power) generate disproportionate influence via \(\beta_1^{(w)}, \beta_2^{(w)}\).

Generalized Outcomes

The model structure extends to non-Gaussian outcomes via generalized linear and survival models:

Binomial (Logistic Regression):

\[\begin{equation} \text{logit}\left(P(y_i^{(g)} = 1)\right) = \theta_i^{(g)} + \theta_{j[i]}^{(c)} + \theta_i^{(p)} \tag{9} \end{equation}\]

In bml, specify family = "Binomial":

bml(y ~ 1 + x1 + mm(...) + hm(...), family = "Binomial", data = d)

Weibull Survival Model:

\[\begin{equation} \log(t_i) = \theta_i^{(g)} + \theta_{j[i]}^{(c)} + \theta_i^{(p)} + \sigma \epsilon_i \tag{10} \end{equation}\]

where \(t_i\) is survival time, \(\epsilon_i\) follows an extreme value distribution, and \(\sigma\) is the scale parameter related to the Weibull shape parameter.

In bml, specify family = "Weibull" with a Surv() outcome:

bml(Surv(time, event) ~ 1 + x1 + mm(...) + hm(...), family = "Weibull", data = d)

Cox Proportional Hazards Model:

\[\begin{equation} h(t_i) = h_0(t) \exp\left(\theta_i^{(g)} + \theta_{j[i]}^{(c)} + \theta_i^{(p)}\right) \tag{11} \end{equation}\]

where \(h_0(t)\) is the baseline hazard function, estimated non-parametrically or as a piecewise constant function.

In bml, specify family = "Cox". The optional cox_intervals argument controls the number of intervals for the piecewise constant baseline hazard:

bml(Surv(time, event) ~ 1 + x1 + mm(...) + hm(...), family = "Cox", cox_intervals = 10, data = d)

All outcome types support the full multilevel and multiple-membership structure, including parameterizable weight functions.

Variance Decomposition and Intraclass Correlation

The multilevel structure partitions total variance into additive components corresponding to each level in the model. For a three-level model with group, nesting, and member levels:

\[\begin{equation} \text{Var}(y_i^{(g)}) = \sigma_{u^{(g)}}^2 + \sigma_{u^{(c)}}^2 + \sigma_{u^{(p)}}^2 \sum_{k \in p[i]} w_{ik}^2 \tag{12} \end{equation}\]

Note that the member-level variance component is scaled by the sum of squared weights. With equal weights (\(w_{ik} = 1/n_i\)), this simplifies to \(\sigma_{u^{(p)}}^2 / n_i\), so the effective member-level variance shrinks as group size increases. With unequal weights, members with larger weights contribute more to the variance.

Intraclass Correlation Coefficients (ICC)

The intraclass correlation coefficient (ICC) for a given level is the proportion of total variance attributable to that level. Intuitively, the ICC answers: “What fraction of the total variation in the outcome is due to differences between units at this level?”

For example, an ICC of 0.30 for the country level means that 30% of the outcome variation can be attributed to between-country differences, while the remaining 70% arises from variation at other levels (e.g., between groups within countries, or between members).

\[\begin{align} \rho^{(g)} &= \frac{\sigma_{u^{(g)}}^2}{\sigma_{u^{(g)}}^2 + \sigma_{u^{(c)}}^2 + \sigma_{u^{(p)}}^2 \overline{w^2}} \tag{13} \\ \rho^{(c)} &= \frac{\sigma_{u^{(c)}}^2}{\sigma_{u^{(g)}}^2 + \sigma_{u^{(c)}}^2 + \sigma_{u^{(p)}}^2 \overline{w^2}} \tag{14} \\ \rho^{(p)} &= \frac{\sigma_{u^{(p)}}^2 \overline{w^2}}{\sigma_{u^{(g)}}^2 + \sigma_{u^{(c)}}^2 + \sigma_{u^{(p)}}^2 \overline{w^2}} \tag{15} \end{align}\]

where \(\overline{w^2} = \frac{1}{N} \sum_{i=1}^{N} \sum_{k \in p[i]} w_{ik}^2\) is the average sum of squared weights across groups.

Since the model is estimated via MCMC, the ICCs inherit full posterior distributions. The varDecomp() function computes ICCs per posterior draw and summarizes them, properly propagating uncertainty:

m1 <- bml(
  Surv(dur_wkb, event_wkb) ~ 1 + majority +
    mm(id = id(pid, gid), vars = vars(cohesion), fn = fn(w ~ 1/n), RE = TRUE) +
    hm(id = id(cid), type = "RE"),
  family = "Weibull",
  data = coalgov
)

varDecomp(m1)
varDecomp(m1, uncertainty = "ci")

The uncertainty argument controls the reported uncertainty measure: "sd" (posterior standard deviation, the default), "mad" (median absolute deviation), or "ci" (95% credible intervals).

Family-specific handling: For Binomial models, which have no residual \(\sigma\), the latent logistic residual variance \(\pi^2/3 \approx 3.29\) is used. For Cox models, which also lack a residual variance, ICCs are computed among the non-residual components only.

Extensions

Multiple MM Blocks

The bml package allows multiple mm() blocks with distinct aggregation mechanisms. This is useful when different member-level covariates aggregate through different processes:

\[\begin{equation} \theta_i^{(p)} = \sum_{b=1}^B \sum_{k \in p[i]} w_{ik}^{(b)} \left( \mathbf{\beta}^{(p,b)} \cdot \mathbf{x}_{ik}^{(p,b)} + u_k^{(p,b)} \right) \tag{16} \end{equation}\]

where \(b\) indexes MM blocks, each with its own weight function \(w_{ik}^{(b)} = f(\mathbf{x}_{ik}^{(w,b)}, n_i)\).

Example: Model the effect of the cohesion of the party with the largest seat share while the party random effects are specified with equal weights:

mm(id = id(pid, gid), vars = vars(cohesion), fn = fn(w ~ pseat == max(pseat)), RE = FALSE) +
mm(id = id(pid, gid), vars = NULL, fn = fn(w ~ 1/n), RE = TRUE)

Important: For any given id(pid, gid) combination, only one mm() block may include RE = TRUE within the same model.

Autoregressive Random Effects

Member random effects may evolve over time rather than remain constant. The ar = TRUE option specifies a random walk process:

\[\begin{equation} u_{ik}^{(p)} \sim \begin{cases} N(0, \sigma_{u^{(p)}}^2) & \text{if } g[i,k] = \emptyset \\ N(u_{g[i,k],k}^{(p)}, \sigma_{u^{(p)}}^2) & \text{if } g[i,k] \neq \emptyset \end{cases} \tag{17} \end{equation}\]

where \(g[i,k]\) returns the most recent previous group in which member \(k\) participated. For \(g[i,k] = \emptyset\) (first participation), the effect is drawn from the prior.

Interpretation: Autoregressive effects capture dynamics where a member’s unobserved heterogeneity changes across successive group participations. The variance \(\sigma_{u^{(p)}}^2\) now represents the innovation variance governing temporal change.

Example: A party’s propensity to destabilize governments may evolve as its leadership, organization, or electoral fortunes change.

Usage:

mm(id = id(pid, gid), vars = vars(cohesion), fn = fn(w ~ 1/n), RE = TRUE, ar = TRUE)

Opposition Members

When analyzing group outcomes influenced by both members and non-members (e.g., coalition and opposition parties), separate mm() blocks can model their distinct contributions:

\[\begin{equation} \theta_i^{(p)} = \sum_{k \in p[i]} w_{ik}^{(g)} \left( \mathbf{\beta}^{(g)} \cdot \mathbf{x}_{ik}^{(g)} + u_k^{(g)} \right) + \sum_{k \notin p[i]} w_{ik}^{(o)} \left( \mathbf{\beta}^{(o)} \cdot \mathbf{x}_{ik}^{(o)} + u_k^{(o)} \right) \tag{18} \end{equation}\]

where superscripts \((g)\) and \((o)\) denote member and non-member effects, respectively. This allows both groups to have different structural effects and different aggregation mechanisms.

In bml, this is specified using two mm() blocks with different mmid variables — one for members (\(k \in p[i]\)) and one for non-members (\(k \notin p[i]\)):

mm(id = id(pid_g, gid), vars = vars(x1, x2), fn = fn(w ~ 1/n), RE = TRUE) +
mm(id = id(pid_o, gid), vars = vars(x1, x2), fn = fn(w ~ 1/n), RE = TRUE)

Data format: Since the two mm() blocks use different member ID variables (pid_g and pid_o), the data should be in a stacked long format where each member gets its own row, with NA in the mmid column it does not belong to:

gid  pid_g  pid_o  x1    x2    dur_wkb  event_wkb
1    1      NA     0.5   0.3   365      0
1    2      NA     0.4   0.2   365      0
1    NA     3      0.3   0.1   365      0
1    NA     4      0.7   0.5   365      0

Government parties have pid_g filled and pid_o = NA; opposition parties have pid_o filled and pid_g = NA. Group-level variables (e.g., dur_wkb, event_wkb) are repeated across all rows within a group, as in the standard long format.

Comparison with Conventional MMMM

The conventional MMMM (as implemented in brms or MLwiN) is a special case:

\[\begin{equation} y_i^{(g)} = \mathbf{\beta}^{(g)} \cdot \mathbf{x}_i^{(g)} + \mathbf{\beta}^{(p)} \cdot \bar{\mathbf{x}}_{i}^{(p)} + \sum_{k \in p[i]} w_{ik} u_k^{(p)} + u_i^{(g)} \end{equation}\]

where \(\bar{\mathbf{x}}_{i}^{(p)} = \sum_k w_{ik} \mathbf{x}_{ik}^{(p)}\) is pre-aggregated using imposed weights \(w_{ik}\).

The extended MMMM generalizes this by:

Aggregating both systematic and random components within the model
Endogenizing weights as functions of covariates with estimable parameters
Enabling multiple aggregation mechanisms via multiple mm() blocks

\[\begin{equation} y_i^{(g)} = \mathbf{\beta}^{(g)} \cdot \mathbf{x}_i^{(g)} + \sum_{k \in p[i]} w_{ik} \left( \mathbf{\beta}^{(p)} \cdot \mathbf{x}_{ik}^{(p)} + u_k^{(p)} \right) + u_i^{(g)}, \quad w_{ik} = f(\mathbf{x}_{ik}^{(w)}, n_i) \end{equation}\]

Model Assumptions

The extended MMMM assumes:

Independence of random components across levels: \(u_i^{(g)} \perp u_j^{(c)} \perp u_k^{(p)}\). If violated, variance estimates may be biased. More complex covariance structures can be accommodated if needed.
Random effects are uncorrelated with covariates (random effects assumption). However, including within-cluster means of covariates or using within-between specifications isolates within-cluster effects, yielding identical point estimates to fixed effect models.
Correct functional form for weight function. The logistic-type function in Equation (8) is flexible, but other forms can be specified using the fn() syntax.
Proper model specification: Misspecifying a weight covariate as a structural covariate (or vice versa) yields null effects for the misspecified variable while correctly identifying properly specified ones, provided there is no omitted variable bias.

Estimation

Model parameters are estimated using Bayesian Markov chain Monte Carlo (MCMC) via JAGS because:

The likelihood has no closed-form solution due to the parameterized weight function
Bayesian estimation captures uncertainty in variance estimates, yielding credible intervals even when maximum likelihood methods report zero variance
Weakly informative priors can be specified for all parameters to regularize estimates in small samples

The bml package provides a user-friendly interface to JAGS with automated prior specification, convergence diagnostics, and post-estimation tools.

Further Information

A comprehensive presentation of the model is provided in Rosche (2026).