
arXiv: 2512.22098 · v4 · pith:A7XOYFSFnew · submitted 2025-12-26 · 📊 stat.ME · math.PR · math.ST · q-bio.PE · stat.CO · stat.TH

Exact inference via quasi-conjugacy in two-parameter Poisson-Dirichlet hidden Markov models

Pith reviewed 2026-05-21 16:17 UTC · model grok-4.3

classification 📊 stat.ME · math.PR · math.ST · q-bio.PE · stat.CO · stat.TH
keywords Poisson-Dirichlet diffusion · hidden Markov model · exact inference · quasi-conjugacy · coagulation operator · pure-death process · unlabelled partitions · nonparametric Bayesian

The pith

A duality to a pure-death process on partitions yields closed-form recursive inference for two-parameter Poisson-Dirichlet hidden Markov models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a nonparametric hidden Markov model in which the hidden states are time-evolving probability distributions generated by a two-parameter Poisson-Dirichlet diffusion, and the observations are unlabelled partitions obtained by exchangeable sampling. To handle the intractable likelihood that would otherwise require label enumeration or simulation, the authors exploit an explicit duality between the diffusion and a pure-death process on partitions. Coagulation operators translate the arrival of each new observation into a simple update rule. The resulting forward and backward recursions deliver exact posterior distributions for the latent distribution at any time point together with exact predictive distributions for future or missing partitions.

Core claim

By combining the duality of the two-parameter Poisson-Dirichlet diffusion with a pure-death process on partitions and the action of coagulation operators, the model admits closed-form recursive updates that compute the exact posterior of the latent state at arbitrary times and the exact predictive distribution of future or interpolated partitions, thereby enabling online and offline inference and forecasting with full uncertainty quantification without MCMC or sequential Monte Carlo.

What carries the argument

The duality between the two-parameter Poisson-Dirichlet diffusion and a pure-death process on partitions, together with coagulation operators that encode the effect of new data, produces the closed-form recursive updates.
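To make the pure-death side of the duality concrete, here is a minimal sketch of how a set of latent partitions spreads out when items die one at a time. It rests on an assumption not spelled out on this page: that the prediction step replaces a support set by every partition reachable through zero or more single-item removals, including the empty partition. The names `remove_one` and `lower_set` are hypothetical, and the death rates and mixture weights of the actual recursion are not modelled.

```python
from typing import Iterable, Set, Tuple

Partition = Tuple[int, ...]  # part sizes in non-increasing order


def remove_one(lam: Partition) -> Set[Partition]:
    """Partitions reachable from `lam` by deleting a single item from one part."""
    out: Set[Partition] = set()
    for i, part in enumerate(lam):
        shrunk = list(lam)
        shrunk[i] = part - 1
        # Drop emptied parts and restore the non-increasing order.
        out.add(tuple(sorted((p for p in shrunk if p > 0), reverse=True)))
    return out


def lower_set(support: Iterable[Partition]) -> Set[Partition]:
    """All partitions reachable from `support` by zero or more single-item deaths.

    Assumption: the empty partition () counts as reachable.
    """
    seen = {tuple(sorted(lam, reverse=True)) for lam in support}
    frontier = list(seen)
    while frontier:
        lam = frontier.pop()
        for child in remove_one(lam):
            if child not in seen:
                seen.add(child)
                frontier.append(child)
    return seen


if __name__ == "__main__":
    # Initial support {(2, 1)}, as in the Figure 4 caption.
    print(sorted(lower_set({(2, 1)}), reverse=True))
```

Only the support is tracked here; an exact filter would also have to carry a weight for each partition in the set.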

If this is right

  • Exact posterior distributions of the latent state become available at every time point.
  • Predictive distributions for future observations and for interpolated partitions can be obtained in closed form.
  • Both online filtering and offline smoothing run in linear time without simulation.
  • Uncertainty quantification is obtained directly from the exact marginals rather than from particle approximations.
  • The paper reports higher accuracy, lower variance, and substantial computational gains relative to particle filtering.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same duality-plus-coagulation construction may supply exact inference for other exchangeable partition-valued processes that admit a similar death-process representation.
  • The recursive structure could be embedded inside larger hierarchical models where the parameters of the Poisson-Dirichlet diffusion themselves carry hyperpriors.
  • Because the updates are deterministic and exact, they offer a natural benchmark for testing the accuracy of approximate methods on partition data.

Load-bearing premise

The duality and coagulation operators must continue to produce exact closed-form updates when applied to the observed partitions.
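As an illustration of what a coagulation does to a partition, the sketch below enumerates every outcome of coagulating two integer partitions. The rule used is an assumption inferred from the examples in the figure captions on this page: each part of ω may merge with at most one part of γ, and no merging happens within ω or within γ. The function name `coagulations` is hypothetical, and the weights attached to each outcome in the paper's update are not reproduced.

```python
from typing import List, Set, Tuple

Partition = Tuple[int, ...]  # part sizes in non-increasing order


def coagulations(omega: Partition, gamma: Partition) -> Set[Partition]:
    """Enumerate partitions obtained by merging parts of `omega` with parts of `gamma`.

    Assumed rule: a partial one-to-one matching between the parts of the two
    partitions is chosen, and each matched pair is replaced by its sum.
    """
    results: Set[Partition] = set()

    def rec(i: int, remaining: Partition, merged: List[int]) -> None:
        if i == len(omega):
            # Unmatched gamma parts survive unchanged.
            results.add(tuple(sorted(merged + list(remaining), reverse=True)))
            return
        # Option 1: leave omega[i] unmerged.
        rec(i + 1, remaining, merged + [omega[i]])
        # Option 2: merge omega[i] with one of the remaining gamma parts.
        for j in range(len(remaining)):
            if j > 0 and remaining[j] == remaining[j - 1]:
                continue  # equal sizes give the same outcome; skip duplicates
            rest = remaining[:j] + remaining[j + 1:]
            rec(i + 1, rest, merged + [omega[i] + remaining[j]])

    rec(0, tuple(sorted(gamma, reverse=True)), [])
    return results


if __name__ == "__main__":
    print(coagulations((1, 1), (2,)))
    print((4, 3, 2, 1, 1) in coagulations((2, 1, 1), (3, 2, 1, 1)))
```

Under this rule the caption examples are recovered: merging ω = (1) with γ = (2) can give µ = (3), and ω = (2, 1, 1) with γ = (3, 2, 1, 1) can give µ = (4, 3, 2, 1, 1).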

What would settle it

Run the exact recursive procedure and an independent long MCMC chain on the same small simulated dataset with known true latent partitions; the two posterior distributions over the latent state at a chosen time must differ by more than Monte Carlo error if the claim is false.
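One way to operationalise "differ by more than Monte Carlo error" is a batch-means z-score for a scalar summary of the latent state. The sketch below is only an assumption about how such a check could be set up: `batch_means_z` is a hypothetical helper, the scalar summary (for example a posterior mean of heterozygosity) is the reader's choice, and the uniform draws stand in for real MCMC output.

```python
import math
import random
import statistics
from typing import Sequence


def batch_means_z(exact_value: float, chain: Sequence[float], n_batches: int = 20) -> float:
    """Z-score of an MCMC estimate against an exact value, using batch means.

    The chain is cut into `n_batches` consecutive batches; the spread of the
    batch means gives a standard error that tolerates autocorrelation,
    provided each batch is long relative to the correlation time.
    """
    batch_len = len(chain) // n_batches
    if batch_len < 1:
        raise ValueError("chain is too short for the requested number of batches")
    means = [
        statistics.fmean(chain[b * batch_len:(b + 1) * batch_len])
        for b in range(n_batches)
    ]
    estimate = statistics.fmean(means)
    std_error = statistics.stdev(means) / math.sqrt(n_batches)
    return (estimate - exact_value) / std_error


if __name__ == "__main__":
    random.seed(0)
    # Stand-in for MCMC draws of a scalar summary; the true mean is 0.5.
    chain = [random.random() for _ in range(20_000)]
    print(batch_means_z(0.5, chain))  # small in absolute value: consistent
    print(batch_means_z(0.6, chain))  # large in absolute value: inconsistent
```

A large absolute z-score for a well-mixed chain would count as evidence against exactness; a small one is only consistent with it for that particular summary.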

Figures

Figures reproduced from arXiv: 2512.22098 by Dario Spanò, Marco Dalla Pria, Matteo Ruggiero.

Figure 1. Unlabelled partitions obtained over four intervals from the Infectious dataset.
Figure 2. Left: P_4 = {(1, 1, 1, 1), (2, 1, 1), (2, 2), (3, 1), (4)} represented via Young diagrams. Right: an indexed coagulation of ω = (2, 1, 1) and γ = (3, 2, 1, 1), yielding µ = (4, 3, 2, 1, 1).
Figure 3. Examples of coagulation sets coag(Ω, Γ) (orange) from Ω (yellow) and Γ (red). The notation a^b means b parts equal to a. Left: Ω = {(1^2)}, Γ = {(2)}. Center: Ω = {(1, 1)}, Γ = {(2, 1)}. Right: Ω = {(1), (1^2)}, Γ = {(2)}; coagulation occurs only between Ω and Γ, hence µ = (3) arises from merging ω = (1) with γ = (2).
Figure 4. Latent partition support in recursive filtering via prediction and update. Left: initial support Λ = {(2, 1)} at t_0. Center: prediction expands this to L(Λ) using the dual process. Right: update with π^1 = (1, 1) produces coag(L(Λ), π^1) as the new support. Each node represents a component PD^λ_{α,θ}.
Figure 5. Smoothing via forward-backward latent structure. The forward latent partition support Λ_{0:k} (left) is coagulated with the backward latent partition support Ω_{k:N} (right) to form the coagulation set coag(Λ_{0:k}, Ω_{k:N}) (middle). The latter supports the latent partitions in the posterior law of X(t_k) | Π^{0:N}.
Figure 6. Recursive inference in the hidden PD model. Arrows denote prediction (solid blue), Bayesian update (dashed red), interpolation (dashed green), and backward smoothing (dashed purple).
Figure 7. Left: Runtime (seconds, log scale) of Algorithm 6 for sample sizes |γ| = 10, 30, 50, and various pruning strategies (top-weighted components or total mass thresholds). Filtering draws 10^4 samples after each update using Algorithm 4. Parameters: α = 0.1, θ = 1.5, N = 9, Δ_k = 0.2 for all k, with PD_{α,θ} truncation threshold ε = 0.005. Right: Associated negatively oriented interval scores (Equation (12)) based…
Figure 8. Posterior inference for heterozygosity using filtering (top left), smoothing (top right), bootstrap particle filter (bottom left) and independent priors (bottom right). Black line: true H_2 trajectory; white diamonds: observed heterozygosity Ĥ_2(π^k). Violin plots show posterior densities; dashed lines show 95% credible intervals.
Figure 9. Interval score comparison across five methods. Filtering and smoothing based on MLEs perform comparably to oracle methods. Bootstrap particle filtering yields less accurate results besides being more demanding computationally. Independent priors yield lower accuracy due to lack of temporal borrowing.
Figure 10. Top: online filtering (left) and offline smoothing (right) of heterozygosity for the Infectious dataset. Bottom: corresponding posteriors from a bootstrap particle filter (left) and independent priors (right). Violin plots: posterior densities; triangles: 95% credible intervals; white diamonds: observed heterozygosities. All methods use global MLEs (α̂, θ̂) except the independent prior approach, which est…
Figure 11. Proportion of filtering mixture components needed to reach 90%, 95%, and 99% cumulative probability.
Figure 12. Filtering (left) and smoothing (right) posterior estimates for the three largest coordinates of the system. Violin plots with dashed intervals show posterior distributions and 95% credible intervals; solid black lines denote the true values; diamonds indicate observed relative frequencies.
Figure 13. Marginal filtering densities at the fifth observation time, comparing the bootstrap particle filter (pink) with the dual filter (blue) under two kernel bandwidths (left: 0.001; right: 0.01). The vertical dashed line marks the observed heterozygosity.
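Several captions above refer to the observed heterozygosity of a partition. A minimal sketch of one common plug-in definition follows, assuming heterozygosity of order 2 means one minus the sum of squared relative block frequencies. Whether the paper uses this estimator or a bias-corrected variant is not stated on this page, and `observed_heterozygosity` is a hypothetical name.

```python
from typing import Sequence


def observed_heterozygosity(partition: Sequence[int]) -> float:
    """Plug-in heterozygosity 1 - sum_j (n_j / n)^2 for block sizes n_j.

    Assumption: this is the plug-in version; an unbiased variant would
    rescale by n / (n - 1).
    """
    n = sum(partition)
    if n == 0:
        raise ValueError("heterozygosity is undefined for the empty partition")
    return 1.0 - sum((size / n) ** 2 for size in partition)


if __name__ == "__main__":
    # The five partitions of 4 listed in the Figure 2 caption.
    for lam in [(1, 1, 1, 1), (2, 1, 1), (2, 2), (3, 1), (4,)]:
        print(lam, observed_heterozygosity(lam))
```

The statistic depends only on the block sizes, which is why it can be computed from unlabelled partitions.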
Original abstract

We introduce a nonparametric model for inferring time-evolving, unobserved probability distributions from discrete-time data consisting of unlabelled partitions. The latent process is a two-parameter Poisson-Dirichlet diffusion, and observations arise via exchangeable sampling. Applications include social and genetic data where only aggregate clustering summaries are observed. To address the intractable likelihood, we develop a tractable inferential framework that avoids label enumeration and direct simulation of the latent state. We exploit a duality between the diffusion and a pure-death process on partitions, together with coagulation operators that encode the effect of new data. These yield closed-form, recursive updates for forward and backward inference. We compute exact posterior distributions of the latent state at arbitrary times and predictive distributions of future or interpolated partitions. This enables online and offline inference and forecasting with full uncertainty quantification, bypassing MCMC and sequential Monte Carlo. Compared to particle filtering, our method achieves higher accuracy, lower variance, and substantial computational gains. We illustrate the methodology with synthetic experiments and a social network application, recovering interpretable patterns in time-varying heterozygosity.
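For readers who want a concrete handle on "exchangeable sampling", the sketch below evaluates the Ewens–Pitman (Pitman) sampling formula, the standard probability of an unlabelled partition of n items drawn from a fixed two-parameter Poisson-Dirichlet distribution. This is the static, single-time formula only; how the paper combines it with the diffusion is not reproduced. The function names are hypothetical, and the values α = 0.1, θ = 1.5 are borrowed from the Figure 7 caption purely as an example.

```python
import math
from collections import Counter
from typing import Iterator, Optional, Sequence, Tuple


def rising(x: float, m: int) -> float:
    """Rising factorial x (x + 1) ... (x + m - 1); equals 1 when m <= 0."""
    out = 1.0
    for i in range(m):
        out *= x + i
    return out


def pitman_sampling_prob(partition: Sequence[int], alpha: float, theta: float) -> float:
    """Probability of an unlabelled partition under the Pitman sampling formula."""
    if not (0 <= alpha < 1 and theta > -alpha):
        raise ValueError("requires 0 <= alpha < 1 and theta > -alpha")
    n, k = sum(partition), len(partition)
    # Number of set partitions of {1, ..., n} with these block sizes.
    count = math.factorial(n)
    for size in partition:
        count //= math.factorial(size)
    for mult in Counter(partition).values():
        count //= math.factorial(mult)
    new_blocks = math.prod(theta + i * alpha for i in range(1, k))
    within = math.prod(rising(1 - alpha, size - 1) for size in partition)
    return count * new_blocks * within / rising(theta + 1, n - 1)


def integer_partitions(n: int, max_part: Optional[int] = None) -> Iterator[Tuple[int, ...]]:
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in integer_partitions(n - first, first):
            yield (first,) + rest


if __name__ == "__main__":
    alpha, theta = 0.1, 1.5
    total = sum(pitman_sampling_prob(p, alpha, theta) for p in integer_partitions(6))
    print(total)  # should be 1 up to floating-point rounding
```

Summing over all partitions of a fixed n is a quick sanity check that the probabilities are normalised.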

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces a hidden Markov model with latent two-parameter Poisson-Dirichlet diffusion for time-evolving probability distributions, observed via exchangeable sampling that produces unlabelled partitions. It develops an exact inferential framework by exploiting a duality between the diffusion and a pure-death process on partitions, together with coagulation operators that encode the effect of new observations. These yield closed-form recursive forward and backward updates, permitting exact posterior distributions of the latent state at arbitrary times and exact predictive distributions for future or interpolated partitions. The approach is claimed to enable online/offline inference and forecasting with full uncertainty quantification while bypassing MCMC and sequential Monte Carlo; empirical comparisons show higher accuracy and lower variance than particle filtering, with illustrations on synthetic data and a social-network application.

Significance. If the claimed exact closed-form recursions hold without hidden discretization or moment closure, the work would constitute a meaningful advance in nonparametric Bayesian dynamic modeling. It supplies a tractable route to full posterior inference for partition-valued time series, which is relevant to genetics and social-science applications where only aggregate clustering summaries are observed. The reported computational gains and improved accuracy relative to particle methods would be practically useful if the exactness is rigorously verified.

major comments (2)
  1. [§3] §3 (Duality construction): the duality between the two-parameter PD diffusion and the pure-death process must be shown to remain closed under the coagulation operators for general α, θ; the death rates and coagulation kernels are known to be more intricate than the one-parameter case, and any mismatch would force truncation or loss of the Markov property, breaking the exact recursive forward-backward updates asserted in the abstract.
  2. [Theorem 4.1] Theorem 4.1 / Eq. (12): the claim that the coagulation operators yield closed-form parameter updates for the posterior must be accompanied by the explicit functional form of the updated (α, θ) or partition-valued state after an arbitrary exchangeable sample; without this, it is impossible to confirm that no numerical integration or enumeration is required.
minor comments (2)
  1. [Figure 10] Figure 10 caption: the description of the recovered time-varying heterozygosity patterns could be expanded to indicate which features are directly attributable to the exact posterior versus the model assumptions.
  2. [§2.2] Notation in §2.2: the symbol for the coagulation kernel should be introduced with an explicit reference to the exchangeable partition probability function it acts upon.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. The comments highlight important points regarding the rigor of the duality construction and the explicitness of the update rules. We respond to each major comment below and indicate where revisions will be made to improve clarity and verifiability.

Point-by-point responses
  1. Referee: [§3] §3 (Duality construction): the duality between the two-parameter PD diffusion and the pure-death process must be shown to remain closed under the coagulation operators for general α, θ; the death rates and coagulation kernels are known to be more intricate than the one-parameter case, and any mismatch would force truncation or loss of the Markov property, breaking the exact recursive forward-backward updates asserted in the abstract.

    Authors: We appreciate the referee drawing attention to this aspect. Section 3 derives the duality explicitly for the two-parameter Poisson-Dirichlet diffusion by computing the death rates and coagulation kernels for general α and θ. We show that these operators preserve the required structure: applying coagulation to the current partition state yields a new partition whose subsequent evolution under the pure-death process matches the updated diffusion dynamics. This closure follows from the quasi-conjugacy relation, which ensures the Markov property is retained without discretization or truncation. The supplementary material contains the generator calculations confirming commutation for the general case. To make this verification more immediate, we will insert a short lemma in the revised Section 3 that isolates the closure argument under coagulation for arbitrary α, θ.

    Revision: partial

  2. Referee: [Theorem 4.1] Theorem 4.1 / Eq. (12): the claim that the coagulation operators yield closed-form parameter updates for the posterior must be accompanied by the explicit functional form of the updated (α, θ) or partition-valued state after an arbitrary exchangeable sample; without this, it is impossible to confirm that no numerical integration or enumeration is required.

    Authors: We agree that the explicit functional form strengthens the claim. Theorem 4.1 and Equation (12) present the recursive structure, while the proof supplies the concrete mapping: after an exchangeable sample, the updated parameters (α, θ) and the partition state are obtained by adding the new clusters induced by the observations and applying the coagulation operator to the current latent partition. These operations are algebraic on the partition representation and require neither integration nor label enumeration. In the revision we will extract the explicit update rules from the proof and state them directly in the theorem (or as a corollary) so that readers can verify the closed-form character without consulting the supplementary material.

    Revision: yes

Circularity Check

0 steps flagged

Derivation chain is self-contained via independent duality and operators

Full rationale

The paper develops exact forward-backward recursions for the two-parameter Poisson-Dirichlet HMM by invoking a duality to a pure-death process on partitions together with coagulation operators that encode exchangeable observations. These structures are introduced as mathematical properties of the latent diffusion and sampling model rather than being defined in terms of the target posterior or predictive quantities. No equation or step in the abstract or described framework reduces a claimed prediction to a fitted parameter, a self-referential definition, or a load-bearing self-citation whose validity is presupposed by the present work. The central claim therefore remains independent of its own outputs and does not trigger any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 1 invented entity

The central claim depends on the validity of the duality between the diffusion and pure-death process, and the quasi-conjugacy property allowing closed-form updates.

free parameters (1)
  • two parameters of the Poisson-Dirichlet diffusion
    The latent process is defined as a two-parameter Poisson-Dirichlet diffusion whose specific parameter values are part of the model specification.
axioms (2)
  • Domain assumption: observations arise via exchangeable sampling from the latent distributions.
    This is stated as the observation model in the abstract.
  • Domain assumption: the latent process is a two-parameter Poisson-Dirichlet diffusion.
    Central modeling choice for the hidden Markov model.
invented entities (1)
  • coagulation operators (no independent evidence)
    Purpose: to encode the effect of new data in the inference updates.
    Introduced as part of the tractable inferential framework.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

6 extracted references · 6 canonical work pages

  1. [1] Ascolani, F., Lijoi, A., and Ruggiero, M. (2023). Smoothing distributions for conditional Fleming–Viot and Dawson–Watanabe diffusions. Bernoulli, 29(2):1410–1434. Balocchi, C., Favaro, S., and Naulet, Z. (2022). Bayesian nonparametric inference for "species-sampling" problems. arXiv preprint arXiv:2203.06076. Baum, L. E. and Petrie, T. (1966). Statistical...

  2. [2] Bunge, J. and Fitzpatrick, M. (1993). Estimating the number of species: a review. Journal of the American Statistical Association, 88(421):364–373. Cappé, O., Moulines, E., and Rydén, T. (2005). Inference in Hidden Markov Models. Springer Series in Statistics. Caron, F., Neiswanger, W., Wood, F., Doucet, A., and Davy, M. (2017). Generalized Polya Urn fo...

  3. [3] Chaleyat-Maurel, M. and Genon-Catalot, V. (2006). Computable infinite-dimensional filters with applications to discretized diffusion processes. Stochastic Processes and their Applications, 116(10):1447–1467. Chaleyat-Maurel, M. and Genon-Catalot, V. (2009). Filtering the Wright-Fisher diffusion. ESAIM: Probability and Statistics, 13:197–217. Chatzis, S. P. ...

  4. [4] Springer. Costantini, C., De Blasi, P., Ethier, S. N., Ruggiero, M., and Spanò, D. (2017). Wright–Fisher construction of the two-parameter Poisson–Dirichlet diffusion. The Annals of Applied Probability, pages 1923–1950. Ethier, S. N. (2014). A property of Petrov's diffusion. Electron. Commun. Probab., 19:1–4. Feng, S. (2010). The Poisson-Dirichlet distributio...

  5. [5] Gael, J., Teh, Y., and Ghahramani, Z. (2008). The infinite factorial hidden Markov model. Advances in Neural Information Processing Systems,

  6. [6] Gillespie, D. T. (2007). Stochastic simulation of chemical kinetics. Annu. Rev. Phys. Chem., 58(1):35–55. Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378. Griffiths, R. C. (1984). Asymptotic line-of-descent distributions. Journal of Mathemati...