Mixture of Finite Mixtures Model for Basket Trial

Guanyu Hu; Junxian Geng; Ruitao Lin; Tianjian Zhou

arxiv: 2011.04135 · v2 · submitted 2020-11-09 · 📊 stat.ME · stat.AP

Pith reviewed 2026-05-24 14:21 UTC · model grok-4.3

keywords: basket trial, mixture of finite mixtures, Bayesian hierarchical model, clustering, shrinkage estimation, oncology, vemurafenib

The pith

A two-step procedure, MFM clustering followed by within-cluster BHM shrinkage, balances pooled and stratified analysis in basket trials; the MFM step supplies a consistent estimate of the number of clusters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Basket trials enroll cohorts of different cancer subtypes that share one molecular target, but standard Bayesian hierarchical models risk over-shrinkage when the exchangeability assumption fails across all cohorts. The paper addresses this by first treating the cohorts as a clustering problem and using the mixture of finite mixtures model to recover the number of groups with a consistency guarantee. Only after the clusters are identified does the method apply Bayesian hierarchical shrinkage within each cluster. This produces treatment-effect estimates that borrow strength among similar cohorts while leaving dissimilar ones unpooled. The approach is evaluated in simulations that vary cluster structure and applied to data from the vemurafenib basket trial.

Core claim

The mixture of finite mixtures model supplies a consistent estimator for the unknown number of clusters among cohorts; once those clusters are obtained, Bayesian hierarchical modeling can be applied under the exchangeability assumption only inside each cluster, thereby avoiding the over-shrinkage that occurs when the model assumes all cohorts are exchangeable.

What carries the argument

Mixture of finite mixtures (MFM) model, which is used to group cohorts that share similar treatment effects and to deliver a consistent estimate of the cluster count before within-cluster shrinkage is performed.
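
To make the clustering prior concrete, here is a minimal sketch of the partition distribution induced by an MFM in the standard formulation of Miller and Harrison (2018): a partition with t blocks of sizes n_1, ..., n_t has probability V_n(t) times the product of rising factorials γ^(n_i). The choices below are assumptions for illustration, not details taken from this summary: K − 1 ~ Poisson(λ), symmetric Dirichlet(γ) weights, a truncated sum over K, and all function names.

```python
import math

def log_rising(x, n):
    """log of the rising factorial x (x+1) ... (x+n-1)."""
    return math.lgamma(x + n) - math.lgamma(x)

def log_falling(k, t):
    """log of the falling factorial k (k-1) ... (k-t+1); -inf when k < t."""
    if k < t:
        return -math.inf
    return math.lgamma(k + 1) - math.lgamma(k - t + 1)

def log_V(n, t, gamma=1.0, lam=1.0, k_max=200):
    """log V_n(t) = log sum_k k_(t) / (gamma k)^(n) * p_K(k), truncated at k_max.

    Assumption: K - 1 ~ Poisson(lam), so p_K(k) = exp(-lam) lam^(k-1) / (k-1)!.
    """
    terms = []
    for k in range(1, k_max + 1):
        log_pk = -lam + (k - 1) * math.log(lam) - math.lgamma(k)
        terms.append(log_falling(k, t) - log_rising(gamma * k, n) + log_pk)
    m = max(terms)
    return m + math.log(sum(math.exp(x - m) for x in terms))

def log_partition_prob(block_sizes, gamma=1.0, lam=1.0):
    """log prior probability of one specific partition with the given block sizes."""
    n, t = sum(block_sizes), len(block_sizes)
    return log_V(n, t, gamma, lam) + sum(log_rising(gamma, s) for s in block_sizes)

def seating_weights(block_sizes, gamma=1.0, lam=1.0):
    """Prior probabilities that a new cohort joins each existing block or opens a new one.

    The last entry of the returned list is the probability of a new block.
    """
    n, t = sum(block_sizes) + 1, len(block_sizes)
    new = gamma * math.exp(log_V(n, t + 1, gamma, lam) - log_V(n, t, gamma, lam))
    weights = [s + gamma for s in block_sizes] + [new]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    # Three cohorts already grouped as {2, 1}; where does a fourth go a priori?
    print(seating_weights([2, 1]))
```

In a full sampler these prior weights would be multiplied by each block's marginal likelihood for the cohort being reassigned; that step depends on the outcome model and is omitted here.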

If this is right

Standard BHM applied to the entire set of cohorts is replaced by BHM applied only inside recovered clusters, reducing the chance that dissimilar cohorts pull each other's estimates toward a common mean.
The procedure sits between full pooling of all data and completely separate analysis of each cohort.
Because the MFM step is consistent for the number of clusters, the final shrinkage estimates inherit the usual posterior contraction properties of BHM inside each correctly identified group.
Application to the vemurafenib trial produces cluster-specific estimates that differ from both the fully pooled and the fully stratified results.
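
The contrast between the three analyses can be sketched on toy data. The block below is a simplified stand-in, not the paper's BHM: it shrinks each cohort's response rate toward its cluster's pooled rate using a conjugate Beta prior with a fixed prior sample size. The data, the cluster labels, and the `prior_strength` parameter are all hypothetical.

```python
def pooled_rate(y, n):
    """Single response rate from pooling every cohort."""
    return sum(y) / sum(n)

def stratified_rates(y, n):
    """Separate response rate for each cohort, no borrowing."""
    return [yi / ni for yi, ni in zip(y, n)]

def within_cluster_shrinkage(y, n, labels, prior_strength=5.0):
    """Shrink each cohort toward the pooled rate of its own cluster.

    Stand-in for within-cluster BHM: the posterior mean under a
    Beta(m * p_c, m * (1 - p_c)) prior, where p_c is the cluster's pooled
    rate and m = prior_strength. (If p_c is exactly 0 or 1 the prior is
    degenerate, but the formula below is still well defined.)
    """
    totals = {}
    for yi, ni, c in zip(y, n, labels):
        ys, ns = totals.get(c, (0, 0))
        totals[c] = (ys + yi, ns + ni)
    estimates = []
    for yi, ni, c in zip(y, n, labels):
        p_c = totals[c][0] / totals[c][1]
        estimates.append((yi + prior_strength * p_c) / (ni + prior_strength))
    return estimates

if __name__ == "__main__":
    # Hypothetical responders and cohort sizes: three low-response cohorts, three higher.
    y = [1, 2, 1, 8, 9, 7]
    n = [20, 20, 20, 20, 20, 20]
    labels = [0, 0, 0, 1, 1, 1]  # assumed output of a clustering step
    print("pooled:    ", round(pooled_rate(y, n), 3))
    print("stratified:", [round(p, 3) for p in stratified_rates(y, n)])
    print("clustered: ", [round(p, 3) for p in within_cluster_shrinkage(y, n, labels)])
```

With a single cluster label for every cohort the stand-in shrinks toward the fully pooled rate, and with one label per cohort it reproduces the stratified rates, so it spans the two extremes the procedure is meant to sit between.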

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the MFM clustering step succeeds on real basket-trial data, the same two-step logic could be tested on other multi-arm oncology designs that also exhibit partial exchangeability.
The method's performance will be sensitive to the prior placed on the number of clusters inside the MFM model; modest changes in that prior could alter the recovered partition.
When the true number of clusters is one, the procedure should behave like standard BHM; when the true number equals the number of cohorts, it should behave like stratified analysis.

Load-bearing premise

The data-generating process really consists of a finite number of exchangeable clusters whose membership can be recovered by the MFM step even when sample sizes per cohort are small.

What would settle it

A simulation experiment in which the MFM step returns an incorrect number of clusters when the true cluster structure is known would show that the two-step procedure does not reliably separate cohorts before shrinkage.
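
Such an experiment needs a way to score each simulated run. A minimal sketch, assuming the true and estimated cohort-to-cluster labels are available as plain lists: it checks whether the estimated number of clusters matches the truth and computes the adjusted Rand index, one of the recovery metrics named in the referee report below. The function names and example labels are hypothetical.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(truth, estimate):
    """Adjusted Rand index between two labelings of the same cohorts.

    Equals 1.0 when the partitions agree up to relabeling and is close to 0
    for agreement no better than chance.
    """
    if len(truth) != len(estimate):
        raise ValueError("labelings must have the same length")
    n = len(truth)
    if n < 2:
        return 1.0
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, estimate)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(estimate).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Both labelings are trivial (for example, a single cluster each).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

def same_number_of_clusters(truth, estimate):
    """True when the estimated partition has the correct number of clusters."""
    return len(set(truth)) == len(set(estimate))

if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]      # hypothetical true partition of six cohorts
    estimate = [0, 0, 1, 1, 1, 1]   # hypothetical clustering output, one cohort misplaced
    print(same_number_of_clusters(truth, estimate), adjusted_rand_index(truth, estimate))
```

Averaging both quantities over many simulated trials gives the proportion of runs with the correct K and the mean recovery score.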

Figures

Figures reproduced from arXiv: 2011.04135 by Guanyu Hu, Junxian Geng, Ruitao Lin, Tianjian Zhou.

Original abstract

With the recent paradigm shift from cytotoxic drugs to new generation of target therapy and immuno-oncology therapy during oncology drug developments, patients with various cancer (sub)types may be eligible to participate in a basket trial if they have the same molecular target. Bayesian hierarchical modeling (BHM) are widely used in basket trial data analysis, where they adaptively borrow information among different cohorts (subtypes) rather than fully pool the data together or doing stratified analysis based on each cohort. Those approaches, however, may have the risk of over shrinkage estimation because of the invalidated exchangeable assumption. We propose a two-step procedure to find the balance between pooled and stratified analysis. In the first step, we treat it as a clustering problem by grouping cohorts into clusters that share the similar treatment effect. In the second step, we use shrinkage estimator from BHM to estimate treatment effects for cohorts within each cluster under exchangeable assumption. For clustering part, we adapt the mixture of finite mixtures (MFM) approach to have consistent estimate of the number of clusters. We investigate the performance of our proposed method in simulation studies and apply this method to Vemurafenib basket trial data analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes a two-step procedure for basket trial data analysis. Cohorts are first clustered via an adapted mixture of finite mixtures (MFM) model intended to produce a consistent estimate of the number of clusters; Bayesian hierarchical modeling (BHM) shrinkage is then applied only within the resulting clusters under the exchangeability assumption. The approach is evaluated in simulation studies and illustrated on the Vemurafenib basket trial.

Significance. If the MFM step reliably recovers the correct number of clusters and cohort assignments in the small-sample regime typical of basket trials, the procedure would provide a data-driven compromise between full pooling and fully stratified analysis, mitigating the over-shrinkage risk of standard BHM when exchangeability fails across all cohorts.

major comments (3)

[Abstract / clustering step] Abstract and clustering-step description: the claim that the MFM is adapted 'to have consistent estimate of the number of clusters' is stated without specifying the modification, citing the relevant consistency theorem, or demonstrating that finite-sample recovery remains reliable when the number of cohorts is small (e.g., 5–10) and per-cohort sample sizes are modest (e.g., 10–30), the regime in which basket-trial data are typically collected. Because the subsequent within-cluster BHM step presupposes correctly identified exchangeable groups, misclustering would produce either over- or under-shrinkage and undermine the central claim.
[Simulation studies] Simulation studies (as referenced in the abstract): no quantitative results are supplied on cluster-recovery metrics (adjusted Rand index, proportion of correctly estimated K, misassignment rates), bias or MSE of the final effect estimates, or direct comparisons against standard BHM and stratified estimators under the small-n, moderate-separation conditions that define the target setting. Without these diagnostics it is impossible to verify that the two-step procedure achieves the claimed balance.
[Real-data application] Real-data application: the Vemurafenib analysis should report the estimated number of clusters, the cohort-to-cluster assignments, and side-by-side numerical comparison of the resulting posterior means and intervals with those obtained from a single full BHM; absent this information the practical advantage of the procedure cannot be assessed.

minor comments (1)

[Abstract] The abstract would benefit from a single sentence summarizing the key quantitative findings (e.g., cluster-recovery rate or MSE improvement) rather than merely stating that simulations were performed.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We address each of the major comments below and indicate the revisions we plan to make.

Point-by-point responses

Referee: [Abstract / clustering step] Abstract and clustering-step description: the claim that the MFM is adapted 'to have consistent estimate of the number of clusters' is stated without specifying the modification, citing the relevant consistency theorem, or demonstrating that finite-sample recovery remains reliable when the number of cohorts is small (e.g., 5–10) and per-cohort sample sizes are modest (e.g., 10–30), the regime in which basket-trial data are typically collected. Because the subsequent within-cluster BHM step presupposes correctly identified exchangeable groups, misclustering would produce either over- or under-shrinkage and undermine the central claim.

Authors: We agree that additional details are needed to support the claim regarding the adaptation of the MFM model. In the revised manuscript, we will explicitly describe the modification made to the standard MFM approach to ensure consistency in estimating the number of clusters, cite the appropriate consistency theorem, and include new simulation results evaluating cluster recovery performance in the small-sample regime typical of basket trials (5–10 cohorts, 10–30 patients per cohort).

revision: yes

Referee: [Simulation studies] Simulation studies (as referenced in the abstract): no quantitative results are supplied on cluster-recovery metrics (adjusted Rand index, proportion of correctly estimated K, misassignment rates), bias or MSE of the final effect estimates, or direct comparisons against standard BHM and stratified estimators under the small-n, moderate-separation conditions that define the target setting. Without these diagnostics it is impossible to verify that the two-step procedure achieves the claimed balance.

Authors: We acknowledge the need for more detailed quantitative evaluations in the simulation studies. The revised version will include cluster-recovery metrics such as the adjusted Rand index, the proportion of simulations where K is correctly estimated, and misassignment rates. We will also report bias and MSE for the final effect estimates and provide direct comparisons with standard BHM and stratified estimators under the relevant small-n conditions.

revision: yes

Referee: [Real-data application] Real-data application: the Vemurafenib analysis should report the estimated number of clusters, the cohort-to-cluster assignments, and side-by-side numerical comparison of the resulting posterior means and intervals with those obtained from a single full BHM; absent this information the practical advantage of the procedure cannot be assessed.

Authors: We agree that the real-data application section would benefit from these additional details. In the revision, we will report the estimated number of clusters and the specific cohort-to-cluster assignments from the MFM step for the Vemurafenib trial. We will also include a side-by-side comparison of the posterior means and credible intervals from our two-step procedure with those from a standard full BHM analysis.

revision: yes

Circularity Check

0 steps flagged

No circularity: two-step MFM-BHM is an algorithmic combination of standard methods

full rationale

The paper describes a two-step procedure that first adapts the mixture of finite mixtures (MFM) model to cluster cohorts and then applies Bayesian hierarchical modeling (BHM) shrinkage within clusters. This is presented as a methodological proposal relying on existing MFM and BHM formulations. No equations reduce a claimed prediction or result to a fitted parameter by construction, no load-bearing self-citations justify core premises, and no uniqueness theorems or ansatzes are smuggled in. The adaptation for 'consistent estimate of the number of clusters' is stated as a modification of standard MFM theory rather than derived from the paper's own outputs. The procedure is self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The method rests on standard mixture-model and hierarchical-model assumptions plus the domain claim that treatment effects are piecewise exchangeable; no new entities are postulated and the only free parameters are those internal to MFM and BHM.

free parameters (2)

MFM concentration parameter
Controls the prior on the number of clusters and is typically chosen or estimated from data.
BHM variance components
Hyperparameters that govern the degree of shrinkage within each cluster.
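
How a variance component governs shrinkage can be seen in the textbook normal-normal hierarchy, written out below as a worked formula. This is an illustrative simplification: the summary does not give the paper's exact BHM specification, and the symbols y_j (observed effect in cohort j), s_j^2 (its sampling variance), \mu_c and \tau_c^2 (mean and between-cohort variance of cluster c) are notation introduced here.

```latex
\begin{aligned}
y_j \mid \theta_j &\sim \mathcal{N}(\theta_j,\; s_j^2), \qquad
\theta_j \mid \mu_c, \tau_c^2 \sim \mathcal{N}(\mu_c,\; \tau_c^2)
\quad \text{for cohort } j \text{ in cluster } c, \\[4pt]
\mathbb{E}\!\left[\theta_j \mid y_j, \mu_c, \tau_c^2\right]
&= (1 - B_j)\, y_j + B_j\, \mu_c,
\qquad B_j = \frac{s_j^2}{s_j^2 + \tau_c^2}.
\end{aligned}
```

As \tau_c^2 \to 0 the factor B_j tends to 1 and every cohort in the cluster is pulled to \mu_c (full pooling within the cluster); as \tau_c^2 \to \infty it tends to 0 and the estimate reverts to the cohort's own y_j (stratified analysis).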

axioms (1)

domain assumption: Exchangeability holds within each recovered cluster
Invoked to justify applying standard BHM after the clustering step.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · 1 internal anchor

[1] Berry, S. M., K. R. Broglio, S. Groshen, and D. A. Berry (2013). Bayesian hierarchical modeling of patient subpopulations: efficient designs of phase II oncology clinical trials. Clinical Trials 10(5), 720–734.

[2] Blackwell, D. and J. B. MacQueen (1973). Ferguson distributions via Pólya urn schemes. The Annals of Statistics 1(2), 353–355.

[3] Chu, Y. and Y. Yuan (2018). A Bayesian basket trial design using a calibrated Bayesian hierarchical model. Clinical Trials 15(2), 149–158.

[4] Dahl, D. B. (2006). Model-based clustering for expression data via a Dirichlet process mixture model. Bayesian Inference for Gene Expression and Proteomics 4, 201–218.

[5] Eisenhauer, E. A., P. Therasse, J. Bogaerts, L. H. Schwartz, D. Sargent, R. Ford, J. Dancey, S. Arbuck, S. Gwyther, M. Mooney, et al. (2009). New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). European Journal of Cancer 45(2), 228–247.

[6] Geng, J., A. Bhattacharya, and D. Pati (2019). Probabilistic community detection with unknown number of communities. Journal of the American Statistical Association 114(526), 893–905.

[7] Geng, J. and E. H. Slate (2020). Discovery among binary biomarkers in heterogeneous populations. In Statistical Modeling in Biomedical Research, pp. 213–232. Springer.

[8] Guha, A., N. Ho, and X. Nguyen (2019). On posterior contraction of parameters and interpretability in Bayesian mixture modeling. arXiv preprint arXiv:1901.05078.

[9] Hyman, D. M., I. Puzanov, V. Subbiah, J. E. Faris, I. Chau, J.-Y. Blay, J. Wolf, N. S. Raje, E. L. Diamond, A. Hollebecque, et al. (2015). Vemurafenib in multiple nonmelanoma cancers with BRAF V600 mutations. New England Journal of Medicine 373(8), 726–736.

[10] Lin, N. U., E. Q. Lee, H. Aoyama, I. J. Barani, D. P. Barboriak, B. G. Baumert, M. Bendszus, P. D. Brown, D. R. Camidge, S. M. Chang, et al. (2015). Response assessment criteria for brain metastases: proposal from the RANO group. The Lancet Oncology 16(6), e270–e278.

[11] Miller, J. W. and M. T. Harrison (2018). Mixture models with a prior on the number of components. Journal of the American Statistical Association 113(521), 340–356.

[12] Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics 9(2), 249–265.

[13] Neuenschwander, B., S. Wandel, S. Roychoudhury, and S. Bailey (2016). Robust exchangeability designs for early phase clinical trials with multiple strata. Pharmaceutical Statistics 15(2), 123–134.

[14] Thall, P. F., J. K. Wathen, B. N. Bekele, R. E. Champlin, L. H. Baker, and R. S. Benjamin (2003). Hierarchical Bayesian approaches to phase II trials in diseases with multiple subtypes. Statistics in Medicine 22(5), 763–780.

[15] Woodcock, J. and L. M. LaVange (2017). Master protocols to study multiple therapies, multiple diseases, or both. New England Journal of Medicine 377(1), 62–70.

[16] Xu, Y., P. Müller, A. M. Tsimberidou, and D. Berry (2019). A nonparametric Bayesian basket trial design. Biometrical Journal 61(5), 1160–1174.

[17] Yin, F., G. Hu, and W. Shen (2020). Analysis of professional basketball field goal attempts via a Bayesian matrix clustering approach. arXiv preprint arXiv:2010.08495.
