A Bayesian Framework for Latent Compliance Modeling in Cluster Randomized Trials with One-Sided Noncompliance
Pith reviewed 2026-05-18 16:01 UTC · model grok-4.3
The pith
A Bayesian latent mixture model estimates ITT and CACE effects within implementation types for cluster randomized trials with noncompliance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By using a latent mixture model to summarize heterogeneity in cluster-level implementation based on baseline characteristics and observed implementation measures, and linking these latent implementation types to individual compliance and outcomes through a joint model, the framework enables estimation of finite- and super-population intent-to-treat and CACE estimands, both marginally and within latent implementation types.
What carries the argument
Latent mixture model for cluster implementation heterogeneity connected via a joint model to individual compliance and outcomes.
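To make the mixture step concrete, here is a minimal sketch of how a cluster's latent type probabilities follow from a mixture model. It is a deliberate simplification and not the paper's model: it assumes a single continuous implementation metric, known mixture parameters, and no link to compliance or outcomes. The function names and the numeric values are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def type_membership(metric, weights, means, sds):
    """Posterior probability that a cluster belongs to each latent type.

    Hypothetical simplification: one implementation metric and known mixture
    parameters; the paper's model is multivariate and fitted jointly.
    """
    unnorm = [w * normal_pdf(metric, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Two illustrative types: low implementers (mean 0.2) and high implementers (mean 0.8).
probs = type_membership(0.7, weights=[0.5, 0.5], means=[0.2, 0.8], sds=[0.1, 0.1])
print(probs)
```

In a full Bayesian fit these probabilities would be recomputed at every posterior draw, and the type indicator would be sampled rather than fixed.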
If this is right
- Causal effects can be estimated specifically within different latent cluster implementation types.
- Individual compliance in control clusters can be imputed using the joint model.
- Both finite-population and super-population versions of ITT and CACE become available.
- Analysis of the METRIcAL trial yields insights into effect variation beyond standard marginal ITT results.
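The consequences listed above can be illustrated with a small sketch that computes finite-population ITT and CACE within each latent type from one completed dataset, that is, one posterior draw in which missing potential outcomes, compliance, and types have already been imputed. The record keys (`s`, `d1`, `y1`, `y0`) and the toy values are assumptions for illustration only.

```python
from collections import defaultdict

def within_type_effects(units):
    """Finite-population ITT and CACE per latent type for ONE completed dataset.

    Each unit is a dict with hypothetical keys:
      s  : latent implementation type of the unit's cluster
      d1 : 1 if the individual would comply under treatment, else 0
      y1 : outcome under assignment to treatment (observed or imputed)
      y0 : outcome under assignment to control (observed or imputed)
    """
    by_type = defaultdict(list)
    for u in units:
        by_type[u["s"]].append(u)
    effects = {}
    for k, members in by_type.items():
        diffs = [u["y1"] - u["y0"] for u in members]
        complier_diffs = [u["y1"] - u["y0"] for u in members if u["d1"] == 1]
        itt = sum(diffs) / len(diffs)
        # CACE is undefined when a type contains no compliers.
        cace = sum(complier_diffs) / len(complier_diffs) if complier_diffs else float("nan")
        effects[k] = {"itt": itt, "cace": cace}
    return effects

completed = [
    {"s": "high", "d1": 1, "y1": 5.0, "y0": 3.0},
    {"s": "high", "d1": 0, "y1": 2.0, "y0": 2.0},
    {"s": "low", "d1": 1, "y1": 4.0, "y0": 3.0},
    {"s": "low", "d1": 0, "y1": 1.0, "y0": 1.0},
]
effects = within_type_effects(completed)
print(effects)
```

Repeating this over all posterior draws and summarizing the resulting values would give posterior distributions for the within-type estimands.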
Where Pith is reading between the lines
- The approach may extend to two-sided noncompliance if the mixture model is adjusted for bidirectional deviations.
- Identifying clusters likely to fall into high-implementation types could guide targeted support during trial rollout.
- Similar latent modeling of partial observability could apply to other multilevel settings with interference.
Load-bearing premise
The latent mixture model correctly captures how baseline characteristics and observed implementation measures relate to the unobserved compliance behaviors in control clusters.
What would settle it
If external data on compliance in control clusters shows large discrepancies from the model's imputations, or if estimated effects within latent types prove highly sensitive to the number of mixture components chosen, the framework's ability to deliver reliable within-type estimates would be challenged.
Original abstract
In pragmatic cluster randomized controlled trials (PCRCTs), healthcare providers are randomized while both providers and patients may deviate from the assigned intervention. In many PCRCTs, cluster-level implementation is measured using multiple continuous metrics, while individual compliance is recorded as a binary indicator. Standard complier average causal effect (CACE) estimands focus on individual-level compliance and do not account for heterogeneity in implementation across clusters. When intervention uptake is shaped by both provider- and patient-level processes, it is of scientific interest to characterize how effects vary across these sources of compliance. We propose a Bayesian framework for PCRCTs with one-sided binary noncompliance at the individual level and one-sided partial compliance at the cluster level. The method uses a latent mixture model to summarize heterogeneity in cluster-level implementation based on baseline characteristics and observed implementation measures, and links these latent implementation types to individual compliance and outcomes through a joint model. Because compliance is only observed in treated clusters, the model imputes unobserved compliance behavior for clusters and individuals assigned to control. The framework enables estimation of finite- and super-population intent-to-treat (ITT) and CACE estimands, both marginally and within latent implementation types. We apply the method to the METRIcAL trial, a pragmatic cluster randomized study evaluating a personalized music intervention for nursing home residents with dementia. The analysis illustrates how accounting for implementation heterogeneity and individual compliance can provide insights beyond standard ITT analyses.
Keywords: Causal inference; Principal stratification; Complier average causal effect; Cluster randomized trials; Noncompliance; Bayesian methods; Latent variable models; Interference.
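As a reading aid, the within-type estimands named in the abstract can be written out as follows. This is a sketch under assumed notation, not the paper's exact definitions: i indexes clusters, j indexes individuals, S_i is the latent implementation type, D_ij(w) is individual compliance under assignment w, and Y_ij(w, d) is the potential outcome. Cluster-level partial compliance is left out of the notation for brevity.

```latex
% Assumed notation; the paper's own definitions may differ in detail.
\begin{align*}
D_{ij}(0) &= 0 \quad \text{(one-sided noncompliance)} \\
\mathrm{ITT}^{\mathrm{fp}}_k &= \frac{1}{N_k} \sum_{i:\,S_i = k} \sum_{j=1}^{n_i}
  \bigl\{ Y_{ij}(1, D_{ij}(1)) - Y_{ij}(0, 0) \bigr\},
  \qquad N_k = \sum_{i:\,S_i = k} n_i \\
\mathrm{CACE}^{\mathrm{fp}}_k &= \frac{\sum_{i:\,S_i = k} \sum_{j=1}^{n_i} D_{ij}(1)\,
  \bigl\{ Y_{ij}(1, 1) - Y_{ij}(0, 0) \bigr\}}
  {\sum_{i:\,S_i = k} \sum_{j=1}^{n_i} D_{ij}(1)} \\
\mathrm{CACE}^{\mathrm{sp}}_k &= \mathbb{E}\bigl[ Y_{ij}(1, 1) - Y_{ij}(0, 0)
  \mid S_i = k,\ D_{ij}(1) = 1 \bigr]
\end{align*}
```

The finite-population versions average over the units actually in the trial, while the super-population version is an expectation under the fitted model. The marginal estimands drop the conditioning on S_i = k.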
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a Bayesian latent mixture modeling framework for pragmatic cluster randomized trials with one-sided noncompliance at both the cluster (partial compliance via continuous implementation metrics) and individual (binary compliance) levels. A latent mixture summarizes cluster-level implementation heterogeneity conditional on baselines and observed metrics (available only in treated clusters); these latent types are then linked through a joint model to individual compliance probabilities and outcomes. The approach imputes compliance for control clusters and individuals to enable estimation of finite- and super-population ITT and CACE estimands, both marginally and stratified by latent implementation type. The method is applied to the METRIcAL trial of a personalized music intervention for nursing-home residents with dementia.
Significance. If the core modeling assumptions hold, the framework provides a principled way to move beyond standard ITT or marginal CACE analyses in pragmatic CRTs by recovering implementation-stratified effects. This is potentially valuable for trials in which both provider-level uptake and patient-level compliance shape intervention delivery, as it can generate more actionable insights about heterogeneity without requiring direct observation of compliance under control.
major comments (2)
- [§3.1–3.2] The identification of within-type CACE estimands rests on the assumption that the regression parameters relating latent implementation types to individual compliance probabilities (estimated exclusively from treated clusters) are transportable to control clusters. Because implementation metrics are observed only under treatment, any unmeasured cluster-level confounders that affect both implementation and compliance will bias the imputed compliance indicators and the resulting within-type CACE; the manuscript does not provide a formal sensitivity analysis or bounding approach for this extrapolation.
- [§4.1, Eq. (12)–(15)] The joint likelihood factors the latent-type distribution, compliance model, and outcome model, but the paper does not demonstrate that the finite-population and super-population versions of the within-type CACE are separately identified under the stated priors when the number of latent types is estimated from the data rather than fixed a priori.
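One way the sensitivity analysis requested in the first major comment might look is sketched below. It assumes a probit compliance model and adds a hypothetical shift `delta` to the linear predictor when the treated-arm model is applied to control clusters. Both the parameterization and the values are assumptions for illustration; neither the manuscript nor the report specifies them.

```python
from statistics import NormalDist

def compliance_prob(linear_predictor, delta=0.0):
    """Probit compliance probability with a hypothetical sensitivity shift.

    delta = 0 transports the treated-arm compliance model to control clusters
    unchanged; a nonzero delta mimics unmeasured cluster-level confounding.
    This parameterization is an assumption, not the authors' specification.
    """
    return NormalDist().cdf(linear_predictor + delta)

def sensitivity_curve(linear_predictor, deltas):
    """Imputed compliance probability for each value of the sensitivity shift."""
    return {d: compliance_prob(linear_predictor, d) for d in deltas}

curve = sensitivity_curve(0.0, deltas=[-0.5, 0.0, 0.5])
for d, p in curve.items():
    print(f"delta={d:+.1f}  P(comply)={p:.3f}")
```

Rerunning the within-type CACE computation across a grid of `delta` values would show how far the transportability assumption can be relaxed before the conclusions change.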
minor comments (2)
- [Table 1] The column headers for the implementation metrics are not fully aligned with the variable descriptions in the text; adding a footnote clarifying the scaling of the continuous metrics would improve readability.
- [§5.3] The posterior predictive checks for the control-arm compliance imputation are only summarized graphically; reporting numerical calibration metrics (e.g., coverage of imputed compliance probabilities against any available external validation) would strengthen the results section.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments on our manuscript. We address each major comment below and outline revisions that will strengthen the presentation and robustness of the proposed framework.
Point-by-point responses
- Referee: [§3.1–3.2] The identification of within-type CACE estimands rests on the assumption that the regression parameters relating latent implementation types to individual compliance probabilities (estimated exclusively from treated clusters) are transportable to control clusters. Because implementation metrics are observed only under treatment, any unmeasured cluster-level confounders that affect both implementation and compliance will bias the imputed compliance indicators and the resulting within-type CACE; the manuscript does not provide a formal sensitivity analysis or bounding approach for this extrapolation.
  Authors: We agree that transportability of the compliance-model parameters across treatment arms is a central identifying assumption. This assumption is standard in principal stratification settings with one-sided noncompliance and is analogous to conditional ignorability given the observed covariates and latent types. We will add a formal sensitivity analysis in the revised manuscript (new subsection in Section 5) that introduces a sensitivity parameter governing the strength of unmeasured cluster-level confounding and reports how the within-type CACE estimates change under a range of plausible values. We will also include a brief bounding exercise following the approach of Ding and VanderWeele (2016) adapted to the latent-type setting.
  Revision: yes
- Referee: [§4.1, Eq. (12)–(15)] The joint likelihood factors the latent-type distribution, compliance model, and outcome model, but the paper does not demonstrate that the finite-population and super-population versions of the within-type CACE are separately identified under the stated priors when the number of latent types is estimated from the data rather than fixed a priori.
  Authors: We appreciate the referee highlighting this point. In the current implementation, the number of latent types K is selected via WAIC/LOO-CV on the treated clusters and then treated as fixed for the joint posterior; conditional on K, the finite-population and super-population within-type CACEs are identified through the joint posterior under the stated priors. We will revise Section 4.1 to explicitly state this conditioning, provide a short identification argument for both estimands given K, and add a brief simulation check confirming that the two versions remain distinguishable in finite samples when K is data-driven. If the referee prefers a fully Bayesian treatment of unknown K (e.g., via a Dirichlet process), we are prepared to explore that extension as well.
  Revision: partial
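The selection of K by WAIC mentioned in the second response can be sketched as follows. The code uses the standard WAIC formula on the deviance scale (log pointwise predictive density minus the summed posterior variance of the pointwise log-likelihood). The input layout `log_lik[s][i]` and the helper `select_k` are assumptions for illustration, and at least two posterior draws are required for the variance term.

```python
import math
from statistics import variance

def waic(log_lik):
    """WAIC on the deviance scale from pointwise log-likelihood draws.

    log_lik[s][i] is the log-likelihood of observation i under posterior draw s.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0
    p_waic = 0.0
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # Log of the posterior-mean likelihood, computed stably via log-sum-exp.
        m = max(column)
        lppd += m + math.log(sum(math.exp(v - m) for v in column) / n_draws)
        # Effective number of parameters: posterior variance of the log-likelihood.
        p_waic += variance(column)
    return -2.0 * (lppd - p_waic)

def select_k(log_lik_by_k):
    """Return the candidate K with the smallest WAIC, plus all scores."""
    scores = {k: waic(ll) for k, ll in log_lik_by_k.items()}
    return min(scores, key=scores.get), scores

# Toy draws: two posterior draws, two observations, for K = 2 and K = 3.
toy = {
    2: [[-1.0, -2.0], [-1.0, -2.0]],
    3: [[-3.0, -3.0], [-3.0, -3.0]],
}
best_k, scores = select_k(toy)
print(best_k, scores)
```

Because K is then held fixed, the posterior for the within-type estimands does not reflect uncertainty about the number of types, which is the conditioning the authors propose to state explicitly.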
Circularity Check
No circularity: framework defines new joint model and estimands from first principles
Full rationale
The paper specifies a Bayesian latent mixture model with explicit likelihood components for observed implementation metrics in treated clusters, binary compliance (observed only under treatment), and outcomes. Finite- and super-population ITT and CACE estimands (marginal and within latent types) are defined as posterior functionals of the joint posterior; they are not algebraically identical to any fitted parameter or input summary by construction. Imputation for control clusters follows directly from the conditional distribution under the stated model assumptions rather than from a tautological re-expression of treated-cluster fits. No load-bearing step reduces to a self-citation, ansatz smuggled via citation, or renaming of a known result. The derivation chain is therefore self-contained against external benchmarks.
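The imputation step described above can be illustrated with a stripped-down data-augmentation sketch for one individual in a control cluster. It assumes a two-component normal outcome model with a common standard deviation and ignores covariates, cluster random effects, and type-specific parameters, all of which the paper's joint model would include. All names and values are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def complier_posterior(y, prior_p, mean_complier, mean_never, sd):
    """P(individual would comply | control outcome y), by Bayes' rule.

    prior_p is the model-based compliance probability; the two means are the
    control-arm outcome means for would-be compliers and never-takers.
    """
    num = prior_p * normal_pdf(y, mean_complier, sd)
    den = num + (1.0 - prior_p) * normal_pdf(y, mean_never, sd)
    return num / den

def impute_compliance(y, prior_p, mean_complier, mean_never, sd, rng):
    """Draw one imputed compliance indicator for a control individual."""
    post = complier_posterior(y, prior_p, mean_complier, mean_never, sd)
    return 1 if rng.random() < post else 0

rng = random.Random(0)
post = complier_posterior(y=2.8, prior_p=0.4, mean_complier=3.0, mean_never=1.0, sd=1.0)
draw = impute_compliance(2.8, 0.4, 3.0, 1.0, 1.0, rng)
print(post, draw)
```

The observed control outcome updates the prior compliance probability, so the imputed indicator is informed by the data rather than being a restatement of the treated-cluster fit.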
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: One-sided noncompliance at the individual level and one-sided partial compliance at the cluster level
- domain assumption: Compliance behavior can be imputed for control clusters and individuals using the latent mixture structure
invented entities (1)
- latent implementation types: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We propose a Bayesian method for PCRCTs with one-sided binary noncompliance at the individual level and one-sided partial compliance at the cluster level. Our Bayesian model classifies clusters into latent compliance strata based on pre-treatment characteristics, partial compliance status, and individual outcomes.
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Because compliance is only observed in the treatment arm, the method imputes unobserved compliance for control clusters and the individuals within them.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.