Equivariant score-based generative models provably learn distributions with symmetries efficiently

Benjamin J. Zhang; Markos A. Katsoulakis; Ziyu Chen

arXiv: 2410.01244 · v1 · submitted 2024-10-02 · stat.ML · cs.LG · math.PR


Pith reviewed 2026-05-23 20:31 UTC · model grok-4.3

classification: stat.ML · cs.LG · math.PR

keywords: equivariant generative models · score matching · group symmetry · data augmentation · Wasserstein distance · inductive bias · Hamilton-Jacobi-Bellman


The pith

Equivariant vector fields enable score-based generative models to learn group-invariant distributions without data augmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Score-based generative models typically require large datasets or augmentations to learn symmetric distributions effectively. This paper shows that incorporating equivariant structure into the score function parametrization allows learning the symmetrized distribution's score directly. The proof relies on improved Wasserstein-1 bounds for invariant data and equivalence of objectives shown via Hamilton-Jacobi-Bellman theory. This equivalence means equivariant models achieve the same optimality as augmented training but without the need for extra data samples. Non-equivariant models suffer from additional model-form error in their generalization bounds.

Core claim

The central claim is that, for a group-invariant data distribution, the score-matching objective optimized over equivariant vector fields is equivalent to the objective on the group-augmented distribution, so the symmetrized score can be learned efficiently without explicit augmentation. This is established by analyzing the optimality conditions of the objectives and by using HJB theory to describe the inductive bias. In addition, an improved d1 generalization bound is derived for the invariant case, and non-equivariant fields are shown to be capable of yielding worse bounds.
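One mechanism behind this equivalence can be checked numerically. The sketch below is not the paper's proof or its exact objective: it assumes a single-noise-level denoising score-matching loss, the rotation group C4 acting on the plane, and two hypothetical vector fields (`equivariant_field`, `shifted_field`). Because Gaussian noise is rotation-invariant, rotating the data and the noise by the same group element leaves the loss of an equivariant field unchanged, so averaging over the orbit adds nothing. For a non-equivariant field the two losses differ.

```python
import numpy as np

# C4: rotations of the plane by multiples of 90 degrees (assumed example group).
R = np.array([[0.0, -1.0], [1.0, 0.0]])
GROUP = [np.linalg.matrix_power(R, k) for k in range(4)]


def equivariant_field(x):
    # Hypothetical field s(x) = -x / (1 + |x|^2); a radial rescaling commutes with rotations.
    return -x / (1.0 + np.sum(x**2, axis=-1, keepdims=True))


def shifted_field(x):
    # Hypothetical non-equivariant field: the fixed offset singles out a direction.
    return -(x - np.array([1.0, 0.0]))


def dsm_loss(field, x, z, sigma):
    # Denoising score matching at one noise level: mean of |s(x + sigma z) + z / sigma|^2.
    residual = field(x + sigma * z) + z / sigma
    return float(np.mean(np.sum(residual**2, axis=-1)))


def augmented_dsm_loss(field, x, z, sigma):
    # Same loss averaged over the orbit; data and noise are rotated by the same g.
    return float(np.mean([dsm_loss(field, x @ g.T, z @ g.T, sigma) for g in GROUP]))


rng = np.random.default_rng(0)
x = rng.normal(size=(512, 2)) + np.array([2.0, 0.0])  # deliberately not a symmetric sample
z = rng.normal(size=(512, 2))
sigma = 0.5

gap_equivariant = abs(
    dsm_loss(equivariant_field, x, z, sigma)
    - augmented_dsm_loss(equivariant_field, x, z, sigma)
)
gap_shifted = abs(
    dsm_loss(shifted_field, x, z, sigma) - augmented_dsm_loss(shifted_field, x, z, sigma)
)
print(gap_equivariant, gap_shifted)
```

The first gap vanishes up to floating-point error, while the second is of order one. In this toy setting, augmentation changes the training signal only for the non-equivariant field.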

What carries the argument

Equivariant vector fields in the score parametrization, whose optimality and equivalence to symmetrized score matching are shown via Hamilton-Jacobi-Bellman theory.
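The link between HJB-type equations and equivariance can be sketched in the simplest case. The block below assumes a pure Brownian forward process $dX_t = dW_t$, a group $G$ of orthogonal matrices, and uniqueness of solutions in a suitable class. These are illustrative assumptions and not necessarily the setting or the argument used in the paper. Under them, the negative log-density solves a viscous Hamilton-Jacobi equation whose operator commutes with orthogonal maps, so an invariant initial density stays invariant and its score is equivariant at every time.

```latex
% Assumed simplified setting: forward process dX_t = dW_t, so p_t solves the heat equation.
\begin{aligned}
&\partial_t p = \tfrac12 \Delta p, \qquad u := -\log p
\;\Longrightarrow\;
\partial_t u = \tfrac12 \Delta u - \tfrac12 |\nabla u|^2, \qquad u(0,\cdot) = -\log p_0, \\
&p_0(gx) = p_0(x)\ \ \forall g \in G \subset O(d)
\;\Longrightarrow\;
u(t, gx) = u(t, x)
\;\Longrightarrow\;
\nabla \log p_t(gx) = g\, \nabla \log p_t(x).
\end{aligned}
```

The middle implication holds because $x \mapsto u(t, gx)$ solves the same equation with the same initial data. The last follows by differentiating $u(t, gx) = u(t, x)$ in $x$ and using $g^{-1} = g^{\top}$.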

Load-bearing premise

The underlying data distribution must be exactly invariant under the known group symmetry to allow construction of an exactly equivariant vector field.
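For a finite group, one standard way to obtain an exactly equivariant vector field is to average an arbitrary field over the group. The sketch below assumes the rotation group C4 and a hypothetical, deliberately non-equivariant `raw_field`. It illustrates the construction in general and is not the parametrization used in the paper.

```python
import numpy as np

# C4: rotations of the plane by multiples of 90 degrees (assumed example group).
R = np.array([[0.0, -1.0], [1.0, 0.0]])
GROUP = [np.linalg.matrix_power(R, k) for k in range(4)]


def raw_field(x):
    # Hypothetical field with no built-in symmetry; rows of x are points in the plane.
    return np.stack([np.sin(x[..., 0]) + x[..., 1] ** 2, x[..., 0] - 1.0], axis=-1)


def symmetrize(field):
    # Group averaging: s_G(x) = (1/|G|) * sum_g g^{-1} field(g x), with g^{-1} = g^T.
    def averaged(x):
        return np.mean([field(x @ g.T) @ g for g in GROUP], axis=0)

    return averaged


sym_field = symmetrize(raw_field)

rng = np.random.default_rng(0)
pts = rng.normal(size=(16, 2))

# Equivariance means s(g x) = g s(x) for every group element g.
raw_is_equivariant = all(
    np.allclose(raw_field(pts @ g.T), raw_field(pts) @ g.T) for g in GROUP
)
sym_is_equivariant = all(
    np.allclose(sym_field(pts @ g.T), sym_field(pts) @ g.T) for g in GROUP
)
print(raw_is_equivariant, sym_is_equivariant)
```

Averaging costs |G| evaluations of the underlying field per input, which is one reason architectures with built-in equivariance are often preferred for large or continuous groups.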

What would settle it

Observing that a non-equivariant score model achieves equal or superior Wasserstein-1 generalization performance compared to an equivariant one on exactly group-invariant data would falsify the claim of worse bounds for non-equivariant models.

Figures

Figures reproduced from arXiv: 2410.01244 by Benjamin J. Zhang, Markos A. Katsoulakis, Ziyu Chen.

**Figure 2.** Score-based generative modeling for a simple 2D mixture of Gaussians. Training dataset is of size …

Original abstract

Symmetry is ubiquitous in many real-world phenomena and tasks, such as physics, images, and molecular simulations. Empirical studies have demonstrated that incorporating symmetries into generative models can provide better generalization and sampling efficiency when the underlying data distribution has group symmetry. In this work, we provide the first theoretical analysis and guarantees of score-based generative models (SGMs) for learning distributions that are invariant with respect to some group symmetry and offer the first quantitative comparison between data augmentation and adding equivariant inductive bias. First, building on recent works on the Wasserstein-1 ($\mathbf{d}_1$) guarantees of SGMs and empirical estimations of probability divergences under group symmetry, we provide an improved $\mathbf{d}_1$ generalization bound when the data distribution is group-invariant. Second, we describe the inductive bias of equivariant SGMs using Hamilton-Jacobi-Bellman theory, and rigorously demonstrate that one can learn the score of a symmetrized distribution using equivariant vector fields without data augmentations through the analysis of the optimality and equivalence of score-matching objectives. This also provides practical guidance that one does not have to augment the dataset as long as the vector field or the neural network parametrization is equivariant. Moreover, we quantify the impact of not incorporating equivariant structure into the score parametrization, by showing that non-equivariant vector fields can yield worse generalization bounds. This can be viewed as a type of model-form error that describes the missing structure of non-equivariant vector fields. Numerical simulations corroborate our analysis and highlight that data augmentations cannot replace the role of equivariant vector fields.
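The fact that makes equivariant parametrizations natural is that the score of a group-invariant density is itself equivariant. The sketch below checks this for a 2D Gaussian mixture with isotropic components. The centers, variance, and the group C4 are illustrative assumptions, not the configuration used in the paper's experiments.

```python
import numpy as np

# C4: rotations of the plane by multiples of 90 degrees (assumed example group).
R = np.array([[0.0, -1.0], [1.0, 0.0]])
GROUP = [np.linalg.matrix_power(R, k) for k in range(4)]

# Centers forming one C4 orbit give an invariant mixture; the second set has no symmetry.
CENTERS_INVARIANT = np.stack([g @ np.array([2.0, 0.0]) for g in GROUP])
CENTERS_GENERIC = np.array([[2.0, 0.0], [0.5, 1.0]])


def mixture_score(x, centers, var):
    # Score of an equal-weight isotropic Gaussian mixture:
    # grad log p(x) = sum_k w_k(x) (mu_k - x) / var, with softmax responsibilities w_k.
    diff = centers[None, :, :] - x[:, None, :]
    logits = -np.sum(diff**2, axis=-1) / (2.0 * var)
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return np.einsum("nk,nkd->nd", weights, diff) / var


def equivariance_error(centers, x, var=0.25):
    # Largest violation of score(g x) = g score(x) over the group and the test points.
    return max(
        float(np.max(np.abs(mixture_score(x @ g.T, centers, var)
                            - mixture_score(x, centers, var) @ g.T)))
        for g in GROUP
    )


rng = np.random.default_rng(0)
pts = rng.normal(size=(64, 2))
err_invariant = equivariance_error(CENTERS_INVARIANT, pts)
err_generic = equivariance_error(CENTERS_GENERIC, pts)
print(err_invariant, err_generic)
```

The invariant mixture gives an error at floating-point level, while the generic mixture does not. A non-equivariant network therefore has to learn this structure from data when the target is invariant.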

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

For exactly invariant data with a known group, equivariant SGMs tighten the d1 bound and can skip augmentation via the HJB-based equivalence of objectives.


The core result here is that when the target measure is exactly invariant under a known group, an equivariant score network admits a better Wasserstein-1 generalization bound than a non-equivariant one, and the HJB analysis shows the equivariant parametrization can recover the score of the orbit-averaged distribution without any data augmentation. That is the quantitative comparison the abstract advertises, and it is new relative to the base d1 guarantees cited from prior work on SGMs. The paper also supplies a model-form error term that quantifies the penalty paid by using a non-equivariant vector field on invariant data.

Those pieces are cleanly stated and build directly on the referenced Wasserstein-1 and HJB literature without obvious circularity. The numerical simulations are presented only as corroboration, so they do not carry the main claims.

The soft spot is the exact-invariance premise: both the improved bound and the “no augmentation needed” statement are proved only when the data measure is precisely group-invariant and an exactly equivariant field can be constructed. No modulus of continuity or degradation result is given for approximate invariance or for groups that must be learned or approximated. That is a real limitation for applications where symmetry is noisy or only partially known.

The work is aimed at theorists and practitioners already using score-based models on physics or molecular data who want a justification for choosing equivariant architectures over augmentation. It is coherent on its own terms and deserves a serious referee; the claims are specific enough that reviewers can check the extensions of the cited d1 and HJB results. I would send it out rather than desk-reject.

Referee Report

2 major / 1 minor

Summary. The paper claims to deliver the first theoretical analysis of score-based generative models (SGMs) for group-invariant distributions. Building on recent Wasserstein-1 (d1) guarantees and empirical divergence results under symmetry, it derives an improved d1 generalization bound when the data distribution is exactly group-invariant. Using Hamilton-Jacobi-Bellman (HJB) theory, it shows that equivariant vector fields can learn the score of the symmetrized distribution without data augmentations by establishing optimality and equivalence of the corresponding score-matching objectives. It further quantifies worse generalization bounds arising from non-equivariant parametrizations (as a form of model-form error) and supports the claims with numerical simulations.

Significance. If the central claims hold under the stated assumptions, the work supplies the first quantitative comparison between data augmentation and equivariant inductive bias in SGMs, together with practical guidance that equivariant parametrization can replace augmentation when the group is known. The explicit use of recent d1 bounds and HJB equivalence for score-matching objectives is a methodological strength that makes the optimality argument falsifiable in principle. The identification of non-equivariant model-form error as a source of degraded bounds is a useful conceptual contribution.

major comments (2)

[Abstract; improved d1 bound section] Abstract and the section deriving the improved d1 bound: the improved generalization bound and the claim that equivariant fields suffice without augmentation are both stated only for exactly group-invariant data distributions and exactly equivariant vector fields. No quantitative continuity or degradation result is supplied for distributions at small total-variation distance from the orbit-averaged measure, which is load-bearing for the practical guidance that “one does not have to augment the dataset.”
[HJB analysis section] HJB analysis section: the optimality and equivalence of score-matching objectives that allow an equivariant field to target the symmetrized distribution without augmentation rest on exact invariance of the data measure and exact equivariance of the vector field under a known group action. The manuscript should either restrict the practical claim to this exact setting or supply an error bound that quantifies the effect of approximate invariance or approximate equivariance.

minor comments (1)

[Numerical simulations] The numerical simulations section would benefit from an explicit statement of the groups used, the precise metrics reported, and whether the data were generated exactly invariant or only approximately so.
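The distinction between exactly and approximately invariant data can be made concrete. The sketch below assumes the rotation group C4 and a hypothetical base cluster. It contrasts applying one random group element per sample, which is invariant only in distribution, with full-orbit augmentation, whose empirical measure is exactly invariant. Neither is claimed to be the procedure used in the paper's simulations.

```python
import numpy as np

# C4: rotations of the plane by multiples of 90 degrees (assumed example group).
R = np.array([[0.0, -1.0], [1.0, 0.0]])
GROUP = [np.linalg.matrix_power(R, k) for k in range(4)]

rng = np.random.default_rng(1)
base = rng.normal(scale=0.3, size=(200, 2)) + np.array([2.0, 0.0])  # hypothetical cluster

# Option 1: one random group element per sample (invariant in law, not as a finite set).
idx = rng.integers(0, len(GROUP), size=len(base))
randomized = np.einsum("nij,nj->ni", np.stack(GROUP)[idx], base)

# Option 2: full-orbit augmentation (the empirical measure is exactly invariant).
full_orbit = np.concatenate([base @ g.T for g in GROUP], axis=0)


def sort_rows(a):
    # Canonical row order so that two point sets can be compared as sets.
    return a[np.lexsort((a[:, 1], a[:, 0]))]


def is_closed_under_group(samples):
    # True if rotating the sample by every g returns the same set of points.
    reference = sort_rows(samples)
    return all(np.allclose(sort_rows(samples @ g.T), reference) for g in GROUP)


print(is_closed_under_group(randomized), is_closed_under_group(full_orbit))
```

Reporting which of these constructions was used, together with the group, would address the referee's request directly.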

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We respond to each major comment below, providing our perspective on the points raised regarding the scope of our theoretical results.


Referee: [Abstract; improved d1 bound section] Abstract and the section deriving the improved d1 bound: the improved generalization bound and the claim that equivariant fields suffice without augmentation are both stated only for exactly group-invariant data distributions and exactly equivariant vector fields. No quantitative continuity or degradation result is supplied for distributions at small total-variation distance from the orbit-averaged measure, which is load-bearing for the practical guidance that “one does not have to augment the dataset.”

Authors: The referee correctly observes that the improved d1 bound and the sufficiency claim for equivariant fields without augmentation are established only under exact group invariance of the data distribution and exact equivariance of the vector field. No quantitative continuity result is provided for distributions at small total-variation distance from the orbit-averaged measure. This is an accurate assessment of the manuscript's scope. Our analysis supplies the first such guarantees in the exact setting, which forms the foundation for the comparison between augmentation and equivariant bias. The practical guidance is framed in the context where the data distribution is group-invariant. We will make a partial revision by inserting a clarifying sentence in the abstract and discussion to explicitly restate the exact-invariance assumption and note that extensions to approximate invariance remain open. revision: partial
Referee: [HJB analysis section] HJB analysis section: the optimality and equivalence of score-matching objectives that allow an equivariant field to target the symmetrized distribution without augmentation rest on exact invariance of the data measure and exact equivariance of the vector field under a known group action. The manuscript should either restrict the practical claim to this exact setting or supply an error bound that quantifies the effect of approximate invariance or approximate equivariance.

Authors: The HJB analysis establishing optimality and equivalence of the score-matching objectives does rely on exact invariance of the data measure and exact equivariance of the vector field. No error bounds quantifying the effect of approximate invariance or equivariance are derived. Supplying such bounds would require a substantial technical extension beyond the present contribution. We will therefore follow the referee's alternative suggestion and restrict the practical claim more explicitly to the exact setting. A revision will be made to add appropriate caveats in the HJB section and conclusion, clarifying that the guidance applies under the stated exact assumptions. revision: yes

Circularity Check

0 steps flagged

No circularity: claims rest on external d1 results and HJB analysis under stated invariance assumptions


The improved d1 bound is explicitly built on 'recent works on the Wasserstein-1 (d1) guarantees of SGMs and empirical estimations of probability divergences under group symmetry' (abstract). The HJB equivalence for equivariant fields learning the symmetrized score is derived from optimality analysis of score-matching objectives under the paper's premise of exact group-invariance and known group (no self-citation load-bearing or self-definitional reduction visible). No predictions reduce to fitted inputs by construction, no uniqueness theorems imported from the same authors, and no ansatz smuggled via prior self-work. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, invented entities, or ad-hoc axioms are stated. The work relies on standard mathematical background (Wasserstein-1 metric, HJB equation) treated as given from prior literature.

axioms (2)

domain assumption: Wasserstein-1 guarantees for SGMs from recent cited works hold and can be extended under group invariance. Basis: the abstract states the improved bound is built on these guarantees.
standard math: Hamilton-Jacobi-Bellman theory applies directly to the score-matching objective for equivariant vector fields. Basis: the abstract invokes HJB to demonstrate optimality and equivalence.

pith-pipeline@v0.9.0 · 5827 in / 1412 out tokens · 23578 ms · 2026-05-23T20:31:36.267911+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SymDrift: One-Shot Generative Modeling under Symmetries
cs.LG 2026-05 unverdicted novelty 6.0

SymDrift makes drifting models produce symmetry-invariant samples in one step via symmetrized coordinate drifts or G-invariant embeddings, outperforming prior one-shot baselines on molecular benchmarks and cutting com...

Reference graph

Works this paper leans on

35 extracted references · 35 canonical work pages · cited by 1 Pith paper · 2 internal anchors

[1] B. D. Anderson. Reverse-time diffusion equation models. Stochastic Processes and their Applications, 12(3):313–326, 1982.
[2] J. Berner, L. Richter, and K. Ullrich. An optimal control perspective on diffusion-based generative modeling. arXiv preprint arXiv:2211.01364, 2022.
[3] J. Birrell, M. Katsoulakis, L. Rey-Bellet, and W. Zhu. Structure-preserving gans. In International Conference on Machine Learning, pages 1982–2020. PMLR, 2022.
[4] J. Birrell, M. A. Katsoulakis, L. Rey-Bellet, B. Zhang, and W. Zhu. Nonlinear denoising score matching for enhanced learning of structured distributions. arXiv preprint arXiv:2405.15625, 2024.
[5] H. Chen, H. Lee, and J. Lu. Improved analysis of score-based generative modeling: User-friendly bounds under minimal smoothness assumptions. In International Conference on Machine Learning, pages 4735–4763. PMLR, 2023.
[6] S. Chen, S. Chewi, J. Li, Y. Li, A. Salim, and A. Zhang. Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions. In The Eleventh International Conference on Learning Representations.
[7] Z. Chen, M. Katsoulakis, L. Rey-Bellet, and W. Zhu. Sample complexity of probability divergences under group symmetry. In International Conference on Machine Learning, pages 4713–4734. PMLR, 2023.
[8] Z. Chen, M. A. Katsoulakis, L. Rey-Bellet, and W. Zhu. Statistical guarantees of group-invariant gans. arXiv preprint arXiv:2305.13517, 2023.
[9] T. Cohen and M. Welling. Group equivariant convolutional networks. In International conference on machine learning, pages 2990–2999. PMLR, 2016.
[10] G. Conforti, A. Durmus, and M. G. Silveri. Score diffusion models without early stopping: finite fisher information is all you need. arXiv preprint arXiv:2308.12240, 2023.
[11] V. De Bortoli. Convergence of denoising diffusion models under the manifold hypothesis. Transactions on Machine Learning Research, 2022.
[12] L. C. Evans. Partial differential equations, volume 19. American Mathematical Society, 2022.
[13] W. Fleming and H. Soner. Controlled Markov Processes and Viscosity Solutions. Applications of mathematics. Springer, 2006.
[14] V. Garcia Satorras, E. Hoogeboom, F. Fuchs, I. Posner, and M. Welling. E(n) equivariant normalizing flows. Advances in Neural Information Processing Systems, 34:4181–4192, 2021.
[15] E. Hairer, C. Lubich, and G. Wanner. Geometric numerical integration, volume 31 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, second edition, 2006. Structure-preserving algorithms for ordinary differential equations.
[16] J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020.
[17] E. Hoogeboom, V. G. Satorras, C. Vignac, and M. Welling. Equivariant diffusion for molecule generation in 3d. In International conference on machine learning, pages 8867–8887. PMLR, 2022.
[18] L. Klein, A. Krämer, and F. Noé. Equivariant flow matching. Advances in Neural Information Processing Systems, 36, 2024.
[19] J. Köhler, L. Klein, and F. Noé. Equivariant flows: exact likelihood generative learning for symmetric densities. In International conference on machine learning, pages 5361–5370. PMLR, 2020.
[20] H. Lee, J. Lu, and Y. Tan. Convergence for score-based generative modeling with polynomial complexity. Advances in Neural Information Processing Systems, 35:22870–22882, 2022.
[21] B. Leimkuhler and S. Reich. Simulating hamiltonian dynamics. Number 14. Cambridge university press, 2004.
[22] H. Lu, S. Szabados, and Y. Yu. Diffusion models with group equivariance. In ICML 2024 Workshop on Structured Probabilistic Inference & Generative Modeling, 2024.
[23] N. Mimikos-Stamatopoulos, B. J. Zhang, and M. A. Katsoulakis. Score-based generative models are provably robust: an uncertainty quantification perspective. arXiv preprint arXiv:2405.15754, 2024.
[24] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida. Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957, 2018.
[25] K. Oko, S. Akiyama, and T. Suzuki. Diffusion models are minimax optimal distribution estimators. In International Conference on Machine Learning, pages 26517–26582. PMLR, 2023.
[26] R. Singhal, M. Goldstein, and R. Ranganath. What’s the score? automated denoising score matching for nonlinear diffusions. In International Conference on Machine Learning. PMLR, 2024.
[27] J. Song, C. Meng, and S. Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations.
[28] Y. Song and S. Ermon. Generative modeling by estimating gradients of the data distribution. Advances in neural information processing systems, 32, 2019.
[29] Y. Song, S. Garg, J. Shi, and S. Ermon. Sliced score matching: A scalable approach to density and score estimation. In Uncertainty in Artificial Intelligence, pages 574–584. PMLR, 2020.
[30] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020.
[31] B. Tahmasebi and S. Jegelka. Sample complexity bounds for estimating probability divergences under invariances. In Forty-first International Conference on Machine Learning, 2024.
[32] H. V. Tran. Hamilton-Jacobi equations. Graduate studies in mathematics. American Mathematical Society, Providence, Rhode Island, 2021.
[33] P. Vincent. A connection between score matching and denoising autoencoders. Neural computation, 23(7):1661–1674, 2011.
[34] B. J. Zhang and M. A. Katsoulakis. A mean-field games laboratory for generative modeling. arXiv preprint arXiv:2304.13534, 2023.
[35] B. J. Zhang, S. Liu, W. Li, M. A. Katsoulakis, and S. J. Osher. Wasserstein proximal operators describe score-based generative models and resolve memorization. arXiv preprint arXiv:2402.06162, 2024.
