A Unification of Discrete, Gaussian, and Simplicial Diffusion

Alan N. Amin; Alex Ali; Andrew Gordon Wilson; Aniruddh Raghu; Joshua Rollins; Nuria Alina Chandra; Sebastian W. Ober; Yucen Lily Li

arxiv: 2512.15923 · v2 · submitted 2025-12-17 · 💻 cs.LG

Nuria Alina Chandra, Yucen Lily Li, Alan N. Amin, Alex Ali, Joshua Rollins, Sebastian W. Ober, Aniruddh Raghu, Andrew Gordon Wilson

Pith reviewed 2026-05-16 21:16 UTC · model grok-4.3

classification 💻 cs.LG

keywords: diffusion models · discrete diffusion · simplicial diffusion · Wright-Fisher model · population genetics · unification · DNA generation · multi-domain training

The pith

Discrete, Gaussian, and simplicial diffusion arise as different parameterizations and large-population limits of the Wright-Fisher population genetics model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the three main diffusion approaches for discrete sequences such as DNA or language tokens are not independent techniques but instead correspond to distinct choices of parameterization within the same Wright-Fisher stochastic process. Simplicial diffusion and Gaussian diffusion appear specifically as limiting cases when the effective population size becomes large. This shared foundation makes it possible to translate results from mathematical genetics into stable numerical methods for simplicial diffusion and to train a single model whose test-time behavior can be switched among the three domains. Experiments confirm that the resulting Wright-Fisher simplicial diffusion is more stable than earlier simplex-based methods and that multi-domain training yields performance competitive with models trained on any single domain alone.

Core claim

All three major methods of diffusion for discrete sequences—discrete diffusion, Gaussian diffusion in Euclidean space, and diffusion on the simplex—are different parameterizations of the Wright-Fisher population genetics model. Simplicial and Gaussian diffusion emerge as two large-population limits of this process. The resulting theory formally connects the likelihoods and hyperparameters across the three families and supplies stable stochastic processes for simplicial diffusion drawn from the genetics literature. A single trained model can then perform diffusion in any of the three domains at test time.
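
To make the large-population statement concrete, here is a minimal worked calculation for one generation of a textbook discrete-generation Wright-Fisher step. It is a sketch, not the paper's own derivation. The symbols $M$ (a row-stochastic mutation matrix over $B$ alleles), $\vec{p}$ (offspring allele distribution), and $\vec{n}'$ (offspring counts) are assumptions introduced here for illustration; $\zeta$ is the population size and $\vec{x}$ the vector of allele frequencies.

```latex
\begin{aligned}
\vec{p} &= M^{\top}\vec{x}, \qquad
\vec{n}' \mid \vec{x} \sim \mathrm{Multinomial}(\zeta,\ \vec{p}), \qquad
\vec{x}' = \vec{n}'/\zeta, \\
\mathbb{E}[\vec{x}' \mid \vec{x}] &= \vec{p}, \\
\operatorname{Cov}[\vec{x}' \mid \vec{x}] &= \tfrac{1}{\zeta}\bigl(\operatorname{diag}(\vec{p}) - \vec{p}\,\vec{p}^{\top}\bigr).
\end{aligned}
```

With $\zeta = 1$ the vector $\vec{x}'$ is one-hot, so the step is a Markov chain on single tokens. As $\zeta$ grows, the resampling noise in one generation shrinks like $\zeta^{-1/2}$ around the deterministic mean $\vec{p}$. Which rescalings of time and mutation produce the simplicial and Gaussian limits is worked out in the paper and is not reproduced here.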

What carries the argument

The Wright-Fisher model, a finite-population stochastic process from population genetics that tracks changes in allele frequencies under drift; it supplies the common dynamics whose specific discretizations and scaling limits recover the three diffusion schemes.
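
As a rough illustration of the process described above, the following sketch simulates a textbook Wright-Fisher model with mutation using only the Python standard library. It is not the paper's implementation: the function names, the uniform mutation matrix, and the discrete-generation formulation are all assumptions made for this example.

```python
import random
from collections import Counter

def uniform_mutation(num_alleles, eps):
    """Hypothetical mutation matrix: keep the allele with prob. 1 - eps,
    otherwise redraw it uniformly. Each row sums to 1."""
    return [[(1.0 - eps) * (i == j) + eps / num_alleles
             for j in range(num_alleles)]
            for i in range(num_alleles)]

def wright_fisher_step(counts, mutation, rng):
    """One generation: mutate parent frequencies, then resample zeta offspring."""
    zeta = sum(counts)
    num_alleles = len(counts)
    # Offspring allele distribution = parent frequencies pushed through mutation.
    probs = [sum(counts[i] / zeta * mutation[i][j] for i in range(num_alleles))
             for j in range(num_alleles)]
    # Multinomial resampling: every offspring draws its allele independently.
    draws = Counter(rng.choices(range(num_alleles), weights=probs, k=zeta))
    return [draws[j] for j in range(num_alleles)]

def simulate(start_allele, zeta, mutation, generations, rng):
    """Start from a population fixed on one allele; return final frequencies."""
    counts = [0] * len(mutation)
    counts[start_allele] = zeta
    for _ in range(generations):
        counts = wright_fisher_step(counts, mutation, rng)
    return [c / zeta for c in counts]

rng = random.Random(0)
M = uniform_mutation(4, eps=0.05)
print(simulate(0, 1, M, 20, rng))     # population of one: a one-hot vector
print(simulate(0, 1000, M, 20, rng))  # frequencies on a grid of step 1/1000
```

With a population of one, the state is always a single allele, which is the regime the page associates with discrete diffusion. With a large population, the state is a frequency vector on the simplex.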

If this is right

Likelihoods and hyperparameters of discrete, Gaussian, and simplicial diffusion become formally interchangeable through their shared Wright-Fisher parameterization.
Stable numerical schemes for simplicial diffusion follow directly from existing mathematical genetics results.
A single model can be trained once and then deployed for diffusion in any of the three domains at test time.
Wright-Fisher simplicial diffusion achieves higher stability and better performance than prior simplicial methods on conditional DNA generation tasks.
Models trained jointly across domains remain competitive with models trained on any one domain separately.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Varying the effective population size parameter inside the Wright-Fisher framework could yield new families of diffusion schedules that interpolate continuously between the three regimes.
The large-population limits may clarify when Gaussian approximations remain accurate for discrete data and when they break down.
Tools developed for analyzing convergence rates in population-genetics models could be repurposed to study sampling efficiency and mixing times in diffusion generative models.
The unification suggests testing whether other discrete generative processes outside diffusion, such as certain autoregressive or flow models, also admit Wright-Fisher interpretations.

Load-bearing premise

The specific discretizations and stochastic processes used in existing discrete, Gaussian, and simplicial diffusion models match instances or limits of Wright-Fisher dynamics exactly, without extra approximations that would break equivalence of likelihoods and hyperparameters.

What would settle it

A side-by-side computation of exact transition probabilities or marginal likelihoods between a Wright-Fisher-derived simplicial process and a standard simplex diffusion implementation that shows systematic, non-negligible differences persisting even after population-size scaling is accounted for.
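
A side-by-side check of this kind needs some distance between transition kernels. One simple choice, assumed here for illustration and not taken from the paper, is the worst-case total-variation distance between matching rows of two finite transition matrices. The function name and the two 3-state kernels below are hypothetical.

```python
def max_row_tv(kernel_a, kernel_b):
    """Largest total-variation distance between matching rows of two
    row-stochastic transition matrices given as lists of lists."""
    worst = 0.0
    for row_a, row_b in zip(kernel_a, kernel_b):
        # Total variation between two discrete distributions.
        tv = 0.5 * sum(abs(a - b) for a, b in zip(row_a, row_b))
        worst = max(worst, tv)
    return worst

# Hypothetical 3-state kernels; neither is taken from the paper.
reference = [[0.90, 0.05, 0.05],
             [0.05, 0.90, 0.05],
             [0.05, 0.05, 0.90]]
candidate = [[0.88, 0.07, 0.05],
             [0.05, 0.90, 0.05],
             [0.05, 0.05, 0.90]]

print(max_row_tv(reference, reference))  # identical kernels give 0.0
print(max_row_tv(reference, candidate))  # small but nonzero mismatch in row 0
```

This only applies to finite state spaces. Comparing simplex-valued processes would need binning or a different divergence, and a systematic gap would have to persist after the population-size scaling is accounted for.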

Figures

Figures reproduced from arXiv: 2512.15923 by Alan N. Amin, Alex Ali, Andrew Gordon Wilson, Aniruddh Raghu, Joshua Rollins, Nuria Alina Chandra, Sebastian W. Ober, Yucen Lily Li.

**Figure 1.** Discrete, Gaussian, and simplicial diffusion for discrete data are unified by Wright-Fisher diffusion. (a) Wright-Fisher diffusion with population size $\zeta = 6$, showing mutation and reproduction processes across generations. (b) The three diffusion methods emerge as different limits of Wright-Fisher: discrete diffusion corresponds to $\zeta = 1$, while Gaussian and simplicial diffusion arise as $\zeta \to \infty$ with zero and…

**Figure 2.** Discrete diffusion with a large population converges to Gaussian diffusion. With $\zeta = 1000$, we show example trajectories $(\vec{x}_t)_t$ that converge to approximate Gaussians near $\vec{\pi}$. Proof idea: As $\zeta \to \infty$, by the law of large numbers, $\vec{x}_t$ approaches $\vec{x}_0^T e^{\tau_t L}$, which itself goes to the stationary distribution of $L$. We can therefore decompose $\vec{x}_t - \vec{\pi} = \underbrace{\vec{x}_0^T e^{\tau^{\zeta}_t L} - \vec{\pi}}_{\text{signal}} + \vec{x}_t - \vec{x}_0^T e^{\tau^{\zeta}\ldots}$ …

**Figure 3.** The hollow parameterization leads to realistic reverse path samples. $\zeta = 300$. Loss comparison: Thm. 4.1 suggests that there is virtually no difference to training a discrete diffusion model with $\zeta = 10^{100}$ and training Gaussian diffusion with Alg. 2 on a computer, suggesting their ELBOs are comparable. Yet the limiting Gaussian ELBO is infinite!

**Figure 4.** emb of amino acids from BLOSUM $L$. $\mathrm{emb}(x_0)$ from Thm. 4.1 for $L$ from Amin et al. [2]. Hyperparameter comparison: Thm. 4.1 gives us a formula for emb determined by the slowest-decaying directions in $L$. App. E.4 also shows that every emb can be induced from some $L$. Remarkably, this connection accommodates embeddings in different dimensions $\mathbb{R}^r$: $r$ is determined by the dimension of the dominant eigenspace of $L$. …

**Figure 5.** Improved simplicial diffusion performs accurate conditional DNA generation. We generate DNA samples of length 500 conditioned on accessibility with a classifier. (a) For an example target, we plot predicted accessibility profiles at the centre 150 positions of 5 example samples from each model. We smooth profiles with a bandwidth of 2. (b) For 1000 targets and 10 samples from each model, we plot the error …

**Figure 6.** The sufficient statistic parameterization represents $\vec{x}_t$ from all diffusion models in the same space.

**Figure 7.** The sufficient statistic parametrization enables a single model to perform competitive discrete, Gaussian, and simplicial diffusion. We compare individual models for each modality with a single unified model using the SSP. (a) We train on proteins and measure sample quality by predicted protein fold-ability (pLDDT). Each model was trained for the same amount of time. (b) We train on language and measure sa…

**Figure 8.** The argmax of Gaussian diffusion appears different from discrete diffusion in simulation, despite having the same marginals. We compare example paths of $(\operatorname{argmax}(w_t))_t$ (left, red; we show Gaussian diffusion $w_t$ in grey), $(\tilde{z}_t)_t$ for uniform discrete diffusion (centre, blue), and their empirical marginals over 10’000 simulations (right); we simulate using a grid size of 0.0001. Note the two processes ha…

**Figure 9.** Leveraging mathematical genetics literature, we build fast and stable simplicial diffusion. (a) We plot the time it takes to sample a sequence of $D = 500$ using an SDE, versus our exact sampling for various values of $t$ on an A100 80GB GPU. We threshold switching to the Griffiths approximation at $\tau_t = 0.1$. (b) For $\tau = 0.1$ and $B = 3$ we sample $3 \times 10^7$ points from the exact sampling method, Griffith’s approxima…

**Figure 10.** The sufficient statistic parametrization enables a single model to perform competitive discrete, Gaussian, and simplicial optimization of antibodies. Using our protein models from …

**Figure 11.** The SSP enables a single model to fit image data across 3 modalities. We perform the analysis of …

**Figure 12.** The SSP results in no noticeable drop in generation quality for image models. We plot samples from models trained on MNIST.

Original abstract

To model discrete sequences such as DNA, proteins, and language using diffusion, practitioners must choose between three major methods: diffusion in discrete space, Gaussian diffusion in Euclidean space, or diffusion on the simplex. Despite their shared goal, these models have disparate algorithms, theoretical structures, and tradeoffs: discrete diffusion has the most natural domain, Gaussian diffusion has more mature algorithms, and diffusion on the simplex in principle combines the strengths of the other two but in practice suffers from numerically unstable stochastic processes. Ideally we could see each of these models as instances of the same underlying framework, and enable practitioners to switch between models for downstream applications. However, previous theories have only considered connections in special cases. Here we build a theory unifying all three methods of discrete diffusion as different parameterizations of the same underlying process: the Wright-Fisher population genetics model. In particular, we find simplicial and Gaussian diffusion as two large-population limits. Our theory formally connects the likelihoods and hyperparameters of these models and leverages decades of mathematical genetics literature to unlock stable simplicial diffusion. Finally, we relieve the practitioner of balancing model trade-offs by demonstrating it is possible to train a single model that can perform diffusion in any of these three domains at test time. Our experiments show that Wright-Fisher simplicial diffusion is more stable and outperforms previous simplicial diffusion models on conditional DNA generation. We also show that we can train models on multiple domains at once that are competitive with models trained on any individual domain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Wright-Fisher unification of the three diffusion types is the real contribution, with a workable multi-domain model as the practical payoff, though the discrete kernel match needs explicit verification.

The main thing here is that the authors treat discrete, Gaussian, and simplicial diffusion as different parameterizations of the same Wright-Fisher process, with the Gaussian and simplicial versions arising as large-population limits. They also show that one model can be trained to run in any of the three domains at test time. That framing pulls in the existing genetics literature to stabilize the simplicial case, which then beats earlier simplicial models on conditional DNA generation while the joint models stay competitive with single-domain ones. Those results are concrete and useful for anyone who wants to avoid picking a formulation in advance. The soft spot is whether the finite discrete kernels line up exactly with Wright-Fisher multinomial sampling. If the paper's discretization or absorbing-state handling introduces even a modest mismatch, the claimed equivalence of likelihoods and hyperparameters does not hold without extra approximations, and the theoretical interchangeability weakens. The derivations need to be laid out plainly so that point can be checked. This is worth the time of anyone working on sequence diffusion in biology or NLP. A reader who cares about theoretical connections or flexible training setups will find it worthwhile. It has enough substance to go to serious peer review.

Referee Report

2 major / 1 minor

Summary. The manuscript claims to unify discrete, Gaussian, and simplicial diffusion models for discrete sequences by framing them as different parameterizations of the Wright-Fisher population genetics model. Simplicial and Gaussian diffusion are derived as two large-population limits of this process. The theory formally connects the likelihoods and hyperparameters across the three approaches, leverages mathematical genetics results to stabilize simplicial diffusion, and demonstrates that a single model can be trained to perform diffusion in any of the three domains at test time. Experiments report that the Wright-Fisher simplicial variant is more stable and outperforms prior simplicial models on conditional DNA generation, while multi-domain models remain competitive with single-domain baselines.

Significance. If the claimed exact equivalences hold, the unification would let practitioners interchange diffusion paradigms without retraining and borrow numerical-stability techniques from the mathematical-genetics literature. The multi-domain training result is practically attractive for applications involving DNA, proteins, or language. The work’s main contribution is the theoretical linkage rather than new algorithms, so its significance rests on the tightness of the Wright-Fisher correspondence and the reproducibility of the reported performance gains.

major comments (2)

[Unification theory] Abstract and unification theory: the claim that standard discrete diffusion transition kernels match Wright-Fisher multinomial sampling exactly (required for identical marginal likelihoods and interchangeable hyperparameters) must be verified by direct comparison of the forward noising kernels, rate matrices, and absorbing-state handling. Any discretization mismatch would break the single-model multi-domain training justification.
[Large-population limits] Large-population limits derivations: while the Gaussian and simplicial limits are standard diffusion approximations, the manuscript must show that the specific discretizations and stochastic processes used in existing models correspond exactly to instances of Wright-Fisher dynamics without extra approximations that would invalidate the claimed equivalence of likelihoods and hyperparameters.

minor comments (1)

[Experiments] The abstract states that Wright-Fisher simplicial diffusion outperforms prior simplicial models on DNA generation, but the experimental section should include explicit protocol details, baseline implementations, and statistical significance tests to allow independent verification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments highlight important points about the rigor of the claimed equivalences, which we address by strengthening explicit comparisons and derivations in the revised manuscript. We believe these changes preserve the core contribution while improving clarity and verifiability.

Referee: [Unification theory] Abstract and unification theory: the claim that standard discrete diffusion transition kernels match Wright-Fisher multinomial sampling exactly (required for identical marginal likelihoods and interchangeable hyperparameters) must be verified by direct comparison of the forward noising kernels, rate matrices, and absorbing-state handling. Any discretization mismatch would break the single-model multi-domain training justification.

Authors: We appreciate the referee's emphasis on explicit verification. The original manuscript derives the equivalence by showing that the discrete diffusion forward process is exactly the multinomial sampling step of the Wright-Fisher model under the chosen parameterization (with mutation rates governing the absorbing-state behavior). In the revision we have added a dedicated subsection (Section 3.2) that tabulates the forward noising kernels side-by-side, compares the infinitesimal rate matrices, and confirms that the absorbing-state handling is identical via the standard population-genetics mutation operator. These direct comparisons establish that the marginal likelihoods coincide exactly and that hyperparameters transfer without adjustment, thereby justifying the multi-domain training result. No discretization mismatch exists under the model definitions used. revision: yes
Referee: [Large-population limits] Large-population limits derivations: while the Gaussian and simplicial limits are standard diffusion approximations, the manuscript must show that the specific discretizations and stochastic processes used in existing models correspond exactly to instances of Wright-Fisher dynamics without extra approximations that would invalidate the claimed equivalence of likelihoods and hyperparameters.

Authors: We agree that the large-population limits must be shown to align precisely with the discretizations employed in prior Gaussian and simplicial diffusion models. The revised manuscript augments the derivations in Section 4 with explicit statements that the time-discretized Ornstein-Uhlenbeck process recovered in the Gaussian limit and the Dirichlet-multinomial process recovered in the simplicial limit are obtained directly from the Wright-Fisher generator without additional approximations beyond the classical large-N diffusion limit (citing the standard convergence theorems from mathematical genetics). We further verify that the noise schedules and step sizes used in the literature correspond one-to-one to the Wright-Fisher time parameterization, preserving the exact equivalence of likelihoods and hyperparameters. These additions eliminate any ambiguity about extraneous approximations. revision: yes

Circularity Check

0 steps flagged

No circularity: unification rests on external Wright-Fisher model from mathematical genetics

The paper presents discrete, Gaussian, and simplicial diffusion as different parameterizations of the pre-existing Wright-Fisher population genetics process, with the latter two arising as large-population limits. This connection is explicitly grounded in decades of external mathematical genetics literature rather than any internal fitting, self-definition, or self-citation chain. The abstract states the theory 'formally connects the likelihoods and hyperparameters' by leveraging that literature to stabilize simplicial diffusion, and demonstrates multi-domain training without reducing any claimed equivalence to a tautology or fitted input. No load-bearing step reduces by construction to the paper's own inputs; the derivation is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

With only the abstract available, the ledger is necessarily incomplete. The central claim rests on the domain assumption that discrete diffusion processes can be exactly reparameterized as Wright-Fisher dynamics and that the Gaussian and simplicial forms emerge cleanly as large-population limits. No new invented entities are introduced. No specific free parameters are named in the abstract.

axioms (1)

domain assumption: Discrete, Gaussian, and simplicial diffusion processes correspond to instances or large-population limits of the Wright-Fisher population genetics model.
This mapping is the load-bearing step that allows the claimed unification of likelihoods and hyperparameters.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · 1 internal anchor

[1]

Alamdari, N

S. Alamdari, N. Thakkar, R. van den Berg, A. X. Lu, N. Fusi, A. P. Amini, and K. K. Yang. Protein generation with evolutionary diffusion: sequence is all you need.bioRxiv, Sept. 2023

work page 2023
[2]

A. N. Amin, N. Gruver, and A. G. Wilson. Why masking diffusion works: Condition on the jump schedule for improved discrete diffusion. InFrontiers in Probabilistic Inference: Learning meets Sampling, Apr. 2025

work page 2025
[3]

B. D. O. Anderson. Reverse-time diffusion equation models.Stoch. Process. Their Appl., 12(3): 313–326, May 1982

work page 1982
[4]

Austin, D

J. Austin, D. D. Johnson, J. Ho, D. Tarlow, and R. Van Den Berg. Structured denoising diffusion models in discrete state-spaces.Adv. Neural Inf. Process. Syst., 34:17981–17993, 2021

work page 2021
[5]

Avdeyev, C

P. Avdeyev, C. Shi, Y . Tan, K. Dudnyk, and J. Zhou. Dirichlet diffusion score model for biological sequence generation.arXiv [cs.LG], May 2023

work page 2023
[6]

Baron, A

E. Baron, A. N. Amin, R. Weitzman, D. S. Marks, and A. G. Wilson. A diffusion model to shrink proteins while maintaining their function. InThe Exploration in AI Today Workshop at ICML 2025, June 2025

work page 2025
[7]

R. F. Bass.Stochastic Processes. Cambridge University Press, Oct. 2011

work page 2011
[8]

Benton, Y

J. Benton, Y . Shi, V . De Bortoli, G. Deligiannidis, and A. Doucet. From denoising diffusions to denoising markov models.J. R. Stat. Soc. Series B Stat. Methodol., 86(2):286–301, Apr. 2024

work page 2024
[9]

Calderon, R

D. Calderon, R. Blecher-Gonen, X. Huang, S. Secchia, J. Kentro, R. M. Daza, B. Martin, A. Dulja, C. Schaub, C. Trapnell, E. Larschan, K. M. O’Connor-Giles, E. E. M. Furlong, and J. Shendure. The continuum of <i>drosophila</i> embryonic development at single- cell resolution.Science, 377(6606):eabn5800, 2022. doi: 10.1126/science.abn5800. URL https://www.s...

work page doi:10.1126/science.abn5800 2022
[10]

Campbell, J

A. Campbell, J. Benton, V . De Bortoli, T. Rainforth, G. Deligiannidis, and A. Doucet. A continuous time framework for discrete denoising models. InAdvances in Neural Information Processing Systems, Oct. 2022

work page 2022
[11]

N. A. Chandra, Y . Hu, J. D. Buenrostro, S. Mostafavi, and A. Sasse. Refining sequence-to- activity models by increasing model resolution.bioRxiv, 2025. doi: 10.1101/2025.01.24.634804

work page doi:10.1101/2025.01.24.634804 2025
[12]

Davis, S

O. Davis, S. Kessler, M. Petrache, I. I. Ceylan, M. Bronstein, and A. J. Bose. Fisher flow matching for generative modeling over discrete data.arXiv [cs.LG], May 2024. 11

work page 2024
[13]

Dieleman, L

S. Dieleman, L. Sartran, A. Roshannai, N. Savinov, Y . Ganin, P. H. Richemond, A. Doucet, R. Strudel, C. Dyer, C. Durkan, C. Hawthorne, R. Leblond, W. Grathwohl, and J. Adler. Continuous diffusion for categorical data.arXiv.org, 2022

work page 2022
[14]

Eijkelboom, G

F. Eijkelboom, G. Bartosh, C. Andersson Naesseth, M. Welling, and J.-W. van de Meent. Variational flow matching for graph generation.Advances in Neural Information Processing Systems, 37:11735–11764, 2024

work page 2024
[15]

S. N. Ethier and T. G. Kurtz.Markov Processes: Characterisation and Convergence. Probability & Mathematical Statistics S. John Wiley & Sons, Nashville, TN, May 1986

work page 1986
[16]

Floto, T

G. Floto, T. Jonsson, M. Nica, S. Sanner, and E. Z. Zhu. Diffusion on the probability simplex. arXiv [cs.LG], Sept. 2023

work page 2023
[17]

F. Gotze. On the rate of convergence in the multivariate CLT.Ann. Probab., 19(2):724–739, 1991

work page 1991
[18]

R. C. Griffiths. Asymptotic line-of-descent distributions.J. Math. Biol., 21(1):67–75, Dec. 1984

work page 1984
[19]

Gruver, S

N. Gruver, S. D. Stanton, N. C. Frey, T. G. J. Rudner, I. Hotzel, J. Lafrance-Vanasse, A. Rajpal, K. Cho, and A. G. Wilson. Protein design with guided discrete diffusion. InThirty-seventh Conference on Neural Information Processing Systems, Nov. 2023

work page 2023
[20]

X. Han, S. Kumar, and Y . Tsvetkov. SSD-LM: Semi-autoregressive simplex-based diffusion language model for text generation and modular control.arXiv [cs.CL], Oct. 2022

work page 2022
[21]

B. L. Hie, V . R. Shanker, D. Xu, T. U. J. Bruun, P. A. Weidenbacher, S. Tang, W. Wu, J. E. Pak, and P. S. Kim. Efficient evolution of human antibodies from general protein language models. Nat. Biotechnol., 42(2):275–283, Apr. 2023

work page 2023
[22]

J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, editors,Advances in Neural Information Processing Systems, volume 33, pages 6840–6851. Curran Associates, Inc., 2020

work page 2020
[23]

F. M. Hoppe. Polya-like urns and the ewens’ sampling formula.J. Math. Biol., 20(1):91–94, Aug. 1984

work page 1984
[24]

P. A. Jenkins and D. Spanò. Exact simulation of the Wright–Fisher diffusion.Ann. Appl. Probab., 27(3):1478–1509, June 2017

work page 2017
[25]

Johansson and Others.mpmath: a Python library for arbitrary-precision floating-point arithmetic (version 0.14), Feb

F. Johansson and Others.mpmath: a Python library for arbitrary-precision floating-point arithmetic (version 0.14), Feb. 2010

work page 2010
[26]

D. D. Johnson, J. Austin, R. van den Berg, and D. Tarlow. Beyond in-place corruption: Insertion and deletion in denoising probabilistic models. InICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models, 2021

work page 2021
[27]

M. Kimura. Solution of a process of random genetic drift with a continuous model.Proc. Natl. Acad. Sci. U. S. A., 41(3):144–150, Mar. 1955

work page 1955
[28]

B. Li, Z. Gao, and L. Xu. Unifying continuous and discrete text diffusion with non-simultaneous diffusion processes.arXiv [cs.CL], May 2025

work page 2025
[29]

Z. Li, Y . Ni, G. Xia, W. Beardall, A. Das, G.-B. Stan, and Y . Zhao. Absorb & escape: Overcoming single model limitations in generating heterogeneous genomic sequences.Advances in Neural Information Processing Systems, 37:21949–21978, 2024

work page 2024
[30]

Z. Lin, H. Akin, R. Rao, B. Hie, Z. Zhu, W. Lu, N. Smetanin, R. Verkuil, O. Kabeli, Y . Shmueli, A. dos Santos Costa, M. Fazel-Zarandi, T. Sercu, S. Candido, and A. Rives. Evolutionary-scale 12 prediction of atomic-level protein structure with a language model.Science, 379(6637):1123– 1130, 2023. doi: 10.1126/science.ade2574. URL https://www.science.org/d...

work page doi:10.1126/science.ade2574 2023
[31]

Lou and S

A. Lou and S. Ermon. Reflected diffusion models.ICML, abs/2304.04740:22675–22701, Apr. 2023

work page arXiv 2023
[32]

A. Lou, C. Meng, and S. Ermon. Discrete diffusion modeling by estimating the ratios of the data distribution. In41 st International Conference on Machine Learning, Oct. 2023

work page 2023
[33]

S. Luo, Y . Su, X. Peng, S. Wang, J. Peng, and J. Ma. Antigen-specific antibody design and optimization with diffusion-based generative models for protein structures. InAdvances in Neural Information Processing Systems 35. Cold Spring Harbor Laboratory, July 2022

work page 2022
[34]

R. K. Mahabadi, H. Ivison, J. Tae, J. Henderson, I. Beltagy, M. E. Peters, and A. Cohan. TESS: Text-to-text self-conditioned simplex diffusion.arXiv [cs.CL], May 2023

work page 2023
[35]

J. W. Miller. Asymptotic normality, concentration, and coverage of generalized posteriors. arXiv [math.ST], July 2019

work page 2019
[36]

J. Ou, S. Nie, K. Xue, F. Zhu, J. Sun, Z. Li, and C. Li. Your absorbing discrete diffusion secretly models the conditional distributions of clean data.arXiv [cs.LG], June 2024

work page 2024
[37]

Raghu, S

A. Raghu, S. W. Ober, M. Kazman, and H. Elliott. Guided sequence-structure generative modeling for iterative antibody optimization. InICLR 2025 Workshop on Generative and Experimental Perspectives for Biomolecular Design, 2025

work page 2025
[38]

P. H. Richemond, S. Dieleman, and A. Doucet. Categorical SDEs with simplex diffusion.arXiv [cs.LG], Oct. 2022

work page 2022
[39]

H. Robbins. A remark on stirling’s formula.Am. Math. Mon., 62(1):26, Jan. 1955

work page 1955
[40]

S. S. Sahoo, M. Arriola, Y . Schiff, A. Gokaslan, E. Marroquin, J. T. Chiu, A. Rush, and V . Kuleshov. Simple and effective masked diffusion language models.arXiv [cs.CL], June 2024

work page 2024
[41]

S. S. Sahoo, J. Deschenaux, A. Gokaslan, G. Wang, J. Chiu, and V . Kuleshov. The diffusion duality.arXiv [cs.LG], June 2025

work page 2025
[42]

Sarkar, Z

A. Sarkar, Z. Tang, C. Zhao, and P. K. Koo. Designing DNA with tunable regulatory activity using discrete diffusion.bioRxiv, page 2024.05.23.595630, May 2024

work page 2024
[43]

Shabalin, V

A. Shabalin, V . Meshchaninov, and D. Vetrov. Smoothie: Smoothing diffusion on token embeddings for text generation.arXiv [cs.CL], May 2025

work page 2025
[44]

J. Shi, K. Han, Z. Wang, A. Doucet, and M. K. Titsias. Simplified and generalized masked diffusion for discrete data.arXiv [cs.LG], June 2024

work page 2024
[45]

Stark, B

H. Stark, B. Jing, C. Wang, G. Corso, B. Berger, R. Barzilay, and T. Jaakkola. Dirichlet flow matching with applications to DNA sequence design.arXiv [q-bio.BM], Feb. 2024

work page 2024
[46]

C. Stone. Limit theorems for random walks, birth and death processes, and diffusion processes. Illinois J. Math., 7(4):638–660, Dec. 1963

work page 1963
[47]

B. E. Suzek, H. Huang, P. McGarvey, R. Mazumder, and C. H. Wu. UniRef: comprehensive and non-redundant UniProt reference clusters.Bioinformatics, 23(10):1282–1288, May 2007

work page 2007
[48]

S. Tang, Y . Zhang, A. Tong, and P. Chatterjee. Gumbel-softmax flow matching with straight- through guidance for controllable biological sequence generation.arXiv [cs.LG], Mar. 2025

work page 2025
[49]

S. Tavaré. Line-of-descent and genealogical processes, and their applications in population genetics models.Theor. Popul. Biol., 26(2):119–164, Oct. 1984

work page 1984
[50]

A. W. van der Vaart.Asymptotic Statistics. 1998. 13

work page 1998
[51]

X. Wang, Z. Zheng, F. Ye, D. Xue, S. Huang, and Q. Gu. Diffusion language models are versatile protein learners.ICML, abs/2402.18567, Feb. 2024

work page arXiv 2024
[52]

X. Wang, Z. Zheng, F. Ye, D. Xue, S. Huang, and Q. Gu. DPLM-2: A multimodal diffusion protein language model.arXiv [cs.LG], Oct. 2024

work page 2024
[53]

Winkler, L

L. Winkler, L. Richter, and M. Opper. Bridging discrete and continuous state spaces: Exploring the ehrenfest process in time-continuous diffusion models.arXiv [stat.ML], May 2024

work page 2024
[54]

R. Wu, F. Ding, R. Wang, R. Shen, X. Zhang, S. Luo, C. Su, Z. Wu, Q. Xie, B. Berger, J. Ma, and J. Peng. High-resolutionde novostructure prediction from primary sequence.bioRxiv, page 2022.07.21.500999, July 2022

work page 2022
[55]

K. K. Yang, N. Fusi, and A. X. Lu. Convolutions are competitive with transformers for protein sequence pretraining.Cell Systems, 15(3):286–294, 2024

work page 2024
[56]

Diffusion Models are Evolutionary Algorithms

Y . Zhang, B. Hartl, H. Hazan, and M. Levin. Diffusion models are evolutionary algorithms. arXiv preprint arXiv:2410.02543, 2024

[57]

K. Zheng, Y. Chen, H. Mao, M.-Y. Liu, J. Zhu, and Q. Zhang. Masked diffusion models are secretly time-agnostic masked models and exploit inaccurate categorical sampling. arXiv [cs.LG], Sept. 2024

