
arxiv: 2605.15847 · v1 · pith:5JIWX75Fnew · submitted 2026-05-15 · 📊 stat.ME · stat.CO

Bayesian Inference for Non-Conjugate Distance Dependent Chinese Restaurant Process Models

Pith reviewed 2026-05-20 16:11 UTC · model grok-4.3

keywords distance dependent Chinese restaurant process · reversible jump MCMC · Bayesian inference · non-conjugate likelihoods · clustering · moment matching · trans-dimensional sampling

The pith

Reversible jump MCMC enables Bayesian inference for distance-dependent Chinese restaurant processes with non-conjugate likelihoods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a reversible jump Markov chain Monte Carlo method for Bayesian inference in distance-dependent Chinese restaurant process models when cluster parameters lack conjugacy to the likelihood. Updating cluster assignments then changes the dimension of the parameter space, which defeats standard samplers. The authors introduce birth and death proposals including prior-based, independence, and data-driven moment-matching versions that target high-posterior regions, plus a posterior-resampling step for fixed-dimension moves. They demonstrate the approach on both discrete and continuous data examples. A sympathetic reader would care because the ddCRP prior already incorporates pairwise distance covariates into flexible clustering, yet non-conjugate likelihoods had previously blocked routine use of this prior.

Core claim

The authors develop a reversible jump Markov chain Monte Carlo framework to address the challenge of trans-dimensional parameter spaces in non-conjugate ddCRP models. They introduce and compare proposal strategies for birth and death moves, such as prior-based, independence, and data-driven moment-matching proposals that target high posterior density regions. For fixed-dimensional moves, they propose a posterior resampling strategy to improve acceptance rates. Simulations and an application to Old Faithful eruption durations demonstrate the effectiveness of moment-matched proposals.

What carries the argument

Reversible jump MCMC framework using birth and death moves with moment-matching proposals to update cluster parameters in trans-dimensional spaces.
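
For orientation, the standard reversible jump acceptance probability for a birth move can be written as below. The notation is assumed here for illustration and is not taken from the paper: $x$ is the current state, $x' = (x, \theta^*)$ is the proposed state with one extra cluster parameter $\theta^*$ drawn from a proposal density $q$, and $r_b$, $r_d$ are the probabilities of attempting the birth move and its reverse death move.

```latex
\alpha_{\mathrm{birth}}(x, x')
  = \min\left\{ 1,\;
      \frac{\pi(x' \mid y)}{\pi(x \mid y)}
      \times \frac{r_d(x')}{r_b(x)}
      \times \frac{1}{q(\theta^*)}
      \times \left| \frac{\partial x'}{\partial (x, \theta^*)} \right|
    \right\}
```

When the new parameter is taken directly as the proposal draw, the Jacobian term equals 1, and the matching death move uses the reciprocal of the same ratio. How the paper's ddCRP link updates enter $r_b$ and $r_d$ is not specified in this summary, so those terms are left generic.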

If this is right

  • The framework applies to ddCRP models with both discrete and continuous observation models.
  • Moment-matched proposals supply a data-driven alternative that targets regions of high posterior density.
  • Posterior resampling raises acceptance rates for fixed-dimensional moves while preserving computational efficiency.
  • The overall methodology supplies a general RJMCMC approach for non-conjugate ddCRP models.
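
A minimal sketch of what a moment-matched birth proposal might look like, assuming a Poisson likelihood with a normal prior on the log-rate (a non-conjugate pair chosen here for illustration). The function names, the prior hyperparameters `m0` and `s0`, and the grid-based moment calculation are all hypothetical and are not the authors' construction; the sketch only shows the idea of matching a Gaussian proposal to the first two moments of the conditional posterior.

```python
import numpy as np

def log_post_unnorm(theta, y, m0=0.0, s0=2.0):
    """Unnormalised log posterior of theta = log(rate) for Poisson counts y,
    under an assumed N(m0, s0^2) prior on theta."""
    theta = np.asarray(theta, dtype=float)
    y = np.asarray(y, dtype=float)
    loglik = y.sum() * theta - y.size * np.exp(theta)  # log(y!) constant dropped
    logprior = -0.5 * ((theta - m0) / s0) ** 2
    return loglik + logprior

def moment_matched_proposal(y, m0=0.0, s0=2.0, grid=None):
    """Return (mean, sd) of a Gaussian matched to the first two posterior
    moments of theta, computed numerically on a uniform grid."""
    if grid is None:
        # Assumed range; widen it if log-rates outside [-10, 10] are plausible.
        grid = np.linspace(-10.0, 10.0, 4001)
    lp = log_post_unnorm(grid, y, m0, s0)
    w = np.exp(lp - lp.max())   # subtract the max for numerical stability
    w /= w.sum()                # uniform grid, so weights act as probabilities
    mean = np.sum(w * grid)
    var = np.sum(w * (grid - mean) ** 2)
    return mean, np.sqrt(var)

def log_normal_pdf(x, mean, sd):
    """Log density of N(mean, sd^2), needed for the 1/q term in the ratio."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(sd) - 0.5 * ((x - mean) / sd) ** 2

# Usage: propose a log-rate for a newly created cluster holding counts y_new.
rng = np.random.default_rng(0)
y_new = rng.poisson(5.0, size=30)
mu, sd = moment_matched_proposal(y_new)
theta_star = rng.normal(mu, sd)
log_q = log_normal_pdf(theta_star, mu, sd)
```

A Gaussian matched this way is only as good as the Gaussian shape itself, which is the limitation the referee raises for skewed or multimodal conditional posteriors.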

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same proposal ideas could be tested on other covariate-dependent clustering priors that produce trans-dimensional moves.
  • If mixing remains reliable on larger datasets, the method would support routine use of distance covariates in non-conjugate settings such as spatial or survival data.
  • Performance comparisons against alternative trans-dimensional samplers on the same examples would clarify relative advantages.

Load-bearing premise

The moment-matching and other proposal strategies achieve sufficient acceptance rates and mixing in the trans-dimensional birth-death moves without requiring extensive tuning or getting trapped in local modes.

What would settle it

Running the sampler on the Old Faithful data or on simulated datasets and finding persistently low acceptance rates or poor mixing across different numbers of clusters would show the proposals do not work as claimed.
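
The check described above could be run with diagnostics along these lines. This is a hedged sketch: `acceptance_rate` and `effective_sample_size` are hypothetical helper names, and the effective sample size uses a simple truncation at the first non-positive autocorrelation rather than any estimator named in the paper.

```python
import numpy as np

def acceptance_rate(accepted):
    """Fraction of proposals accepted; `accepted` is a sequence of booleans."""
    return float(np.asarray(accepted, dtype=bool).mean())

def effective_sample_size(chain, max_lag=None):
    """ESS = n / (1 + 2 * sum of autocorrelations), with the sum truncated
    at the first non-positive sample autocorrelation."""
    x = np.asarray(chain, dtype=float)
    n = x.size
    x = x - x.mean()
    var = np.dot(x, x) / n
    if var == 0.0:
        # A constant chain (for example K never changing) has no defined ESS.
        return float("nan")
    if max_lag is None:
        max_lag = n - 1
    tau = 1.0
    for lag in range(1, max_lag + 1):
        rho = np.dot(x[:-lag], x[lag:]) / (n * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau
```

Applied separately to the birth and death accept indicators and to the trace of the number of clusters, persistently small values from either function would be the kind of evidence this section describes.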

Figures

Figures reproduced from arXiv: 2605.15847 by Joseph Marsh, Rowland G. Seymour, Theodore Kypraios.

Figure 1: Simulated Poisson data for the overlapping scenario (…)
Figure 2: Trace plots for the Poisson simulated data. Top: number of clusters …
Figure 3: Marginal posterior distribution 𝑃(𝐾 ∣ c, y) for all sampler configurations. The dashed vertical line marks the true number of clusters 𝐾∗ = 3.
Figure 4: Posterior inference on the number of clusters …
Figure 5: MAP cluster assignments for the Old Faithful dataset (…)
Figure 6: Posterior predictive check for the Old Faithful analysis. Red points are observed …
Original abstract

The distance dependent Chinese Restaurant Process (ddCRP) provides a flexible prior distribution for clustering observations, incorporating covariate information through pairwise distances and accommodating a rich variety of cluster structures. When cluster parameters are conjugate to the likelihood, Bayesian inference is straightforward. In the non-conjugate setting, however, inference becomes substantially more challenging due to the trans-dimensional parameter spaces that arise as cluster assignments change. We develop a reversible jump Markov chain Monte Carlo (RJMCMC) framework to address this challenge, targeting the dimension-changing nature of cluster parameter vectors when observation assignments are updated. We introduce and compare several proposal strategies for birth and death moves, including prior-based, independence, and data-driven moment-matching proposals that target regions of high posterior density. For fixed-dimensional moves, we propose a posterior resampling strategy that improves acceptance rates while maintaining computational efficiency. Through a simulation study and an application to Old Faithful eruption durations, we demonstrate moment-matched proposals offer a principled, data-driven alternative to prior-based proposals. The resulting methodology provides a general RJMCMC framework for ddCRP models with non-conjugate likelihoods, demonstrated here on both discrete and continuous observation models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops a reversible jump Markov chain Monte Carlo (RJMCMC) framework for Bayesian inference in distance-dependent Chinese Restaurant Process (ddCRP) models under non-conjugate likelihoods. It introduces and compares proposal strategies for birth-death moves (prior-based, independence, and moment-matching) along with a posterior resampling approach for fixed-dimensional updates, and demonstrates the method via a simulation study and an application to Old Faithful eruption durations.

Significance. If the moment-matching proposals deliver reliable acceptance rates and mixing, the work would supply a practical general-purpose RJMCMC tool for covariate-informed clustering when conjugate priors are unavailable. The explicit comparison of proposal families and the coverage of both discrete and continuous observation models constitute clear strengths; the manuscript also correctly grounds the construction in standard RJMCMC theory.

major comments (2)
  1. [Simulation study and real-data application] Quantitative acceptance rates, effective sample sizes, and mixing diagnostics for the birth-death moves (especially the moment-matching proposals) are not reported. Without these numbers it is impossible to verify that the proposals traverse the space of cluster assignments and parameter vectors at usable rates for the claimed general framework.
  2. [Abstract and methodology] The claim that moment-matching proposals constitute a 'principled, data-driven alternative' for arbitrary non-conjugate likelihoods rests on only two observation models. No evidence is supplied that the same construction (or a simple variant) continues to work when the conditional posterior for a new cluster parameter is skewed, multimodal, or high-dimensional, which directly affects the load-bearing assumption that the framework is general.
minor comments (2)
  1. [Abstract] The abstract would benefit from a single sentence reporting the range of acceptance rates observed for the moment-matching proposals.
  2. [Methodology] Notation for the proposal densities and the acceptance probability ratio should be made fully explicit in the RJMCMC algorithm description to facilitate reproducibility.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and positive overall assessment of the manuscript. We address each major comment point by point below, indicating the revisions that will be incorporated.

  1. Referee: [Simulation study and real-data application] Quantitative acceptance rates, effective sample sizes, and mixing diagnostics for the birth-death moves (especially the moment-matching proposals) are not reported. Without these numbers it is impossible to verify that the proposals traverse the space of cluster assignments and parameter vectors at usable rates for the claimed general framework.

    Authors: We agree that quantitative diagnostics are necessary to demonstrate the practical performance of the RJMCMC moves. In the revised manuscript we will report acceptance rates for birth and death moves under each proposal family, effective sample sizes for the cluster parameters, and mixing diagnostics (trace plots and autocorrelation times) from both the simulation study and the Old Faithful application. These additions will directly address the concern about usable traversal rates.

    Revision: yes

  2. Referee: [Abstract and methodology] The claim that moment-matching proposals constitute a 'principled, data-driven alternative' for arbitrary non-conjugate likelihoods rests on only two observation models. No evidence is supplied that the same construction (or a simple variant) continues to work when the conditional posterior for a new cluster parameter is skewed, multimodal, or high-dimensional, which directly affects the load-bearing assumption that the framework is general.

    Authors: The moment-matching construction is derived without reference to conjugacy and relies only on the availability of the first two moments of the conditional posterior for a new cluster; this derivation itself is general. The two observation models (discrete and continuous) serve as representative cases. We nevertheless accept that the Gaussian approximation may be inadequate for strongly skewed, multimodal, or high-dimensional posteriors. We will revise the abstract and methodology sections to qualify the generality statement and add a short discussion of these limitations together with possible extensions.

    Revision: partial

Circularity Check

0 steps flagged

No significant circularity in the RJMCMC framework for non-conjugate ddCRP


The paper develops a reversible jump MCMC sampler for ddCRP models under non-conjugate likelihoods by applying standard RJMCMC theory to handle dimension-changing cluster parameters. Proposal mechanisms (prior, independence, moment-matching) are constructed as standard Metropolis-Hastings proposals whose acceptance ratios follow directly from the target posterior; moment-matching is a heuristic approximation to the conditional posterior rather than a fitted parameter renamed as a prediction. Validation occurs via separate simulation studies and a real-data application (Old Faithful), which are external to the derivation. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the core chain. The methodology rests on externally verifiable MCMC correctness rather than reducing to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard construction of the ddCRP prior, the validity of reversible jump MCMC for trans-dimensional sampling, and the assumption that the chosen proposal distributions can adequately explore the posterior.

axioms (2)
  • domain assumption: The distance-dependent Chinese restaurant process defines a valid prior over random partitions when pairwise distances are supplied; unlike the standard Chinese restaurant process, it is not exchangeable in general.
    Invoked as the modeling foundation for incorporating covariate information into clustering.
  • standard math: Reversible jump MCMC can be constructed to satisfy detailed balance for the target posterior when birth and death moves are properly designed.
    Standard theoretical justification for the dimension-changing sampler.
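
To make the first axiom concrete, the standard customer-link construction of the ddCRP prior can be sketched as follows. This is an illustrative sketch, not code from the paper: the function names, the exponential decay function, and the default `alpha` are assumptions. Each observation links to another with weight given by a decay of their distance, or to itself with weight `alpha`, and clusters are the connected components of the resulting link graph.

```python
import numpy as np

def sample_ddcrp_links(dist, alpha=1.0, decay=lambda d: np.exp(-d), rng=None):
    """Draw one customer link per observation from the ddCRP prior.
    dist is an (n, n) array of pairwise distances; decay is an assumed choice."""
    rng = np.random.default_rng() if rng is None else rng
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    links = np.empty(n, dtype=int)
    for i in range(n):
        w = np.array(decay(dist[i]), dtype=float)  # weights for linking i -> j
        w[i] = alpha                               # self-link weight
        links[i] = rng.choice(n, p=w / w.sum())
    return links

def links_to_clusters(links):
    """Label the connected components of the undirected link graph (union-find)."""
    n = len(links)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in enumerate(links):
        ri, rj = find(i), find(int(j))
        if ri != rj:
            parent[ri] = rj
    roots = [find(i) for i in range(n)]
    relabel = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return np.array([relabel[r] for r in roots])
```

Changing a single link can split one component into two or merge two into one, which is where the number of cluster parameters changes and the trans-dimensional moves become necessary.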



