Diverse Sampling in Diffusion Models with Marginal Preserving Particle Guidance
Pith reviewed 2026-05-08 12:25 UTC · model grok-4.3
The pith
EDDY uses divergence-free perturbations to increase diversity in diffusion samples while exactly preserving marginal distributions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
EDDY instantiates the principle of marginal-preserving diversification through divergence-free dynamics induced by anti-symmetric pairwise matrix fields, allowing the joint particle behavior to change while each particle's marginal distribution is preserved throughout the generative process.
What carries the argument
Kernel-based anti-symmetric pairwise matrix fields that generate divergence-free drift perturbations exploiting Fokker-Planck symmetries.
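The paper's kernel-based pairwise instantiation is not reproduced here, but the algebraic identity that underlies "divergence-free dynamics from anti-symmetric matrix fields" is easy to verify numerically: if A(x) is an anti-symmetric matrix field, the vector field u_i = Σ_j ∂_j A_ij has zero divergence identically, because the symmetric operator ∂_i∂_j contracts the anti-symmetric index pair A_ij to zero. A minimal 2D sketch (illustrative potential, not the paper's construction):

```python
import numpy as np

# In 2D, A = [[0, a], [-a, 0]] gives u = (∂_y a, -∂_x a), the stream-function
# form; its divergence is ∂_x∂_y a - ∂_y∂_x a = 0 identically.

def u(p):
    """u = (∂_y a, -∂_x a) for the illustrative potential a(x, y) = sin(x) cos(y)."""
    x, y = p
    return np.array([-np.sin(x) * np.sin(y), -np.cos(x) * np.cos(y)])

def divergence(f, p, eps=1e-5):
    """Central finite-difference divergence of f at point p."""
    d = 0.0
    for k in range(p.size):
        e = np.zeros_like(p)
        e[k] = eps
        d += (f(p + e)[k] - f(p - e)[k]) / (2 * eps)
    return d

rng = np.random.default_rng(0)
pts = rng.uniform(-3, 3, size=(100, 2))
print(max(abs(divergence(u, p)) for p in pts))  # ≈ 0, up to finite-difference error
```

The same cancellation argument carries over to the joint particle space, which is how pairwise anti-symmetric constructions can stay divergence-free regardless of the kernel chosen.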
If this is right
- Enhances diversity in samples from synthetic distributions and text-to-image tasks.
- Maintains strong fidelity to the target distribution without extra training.
- Provides efficient approximations suitable for high-dimensional perceptual embeddings.
- Outperforms standard baselines in balancing variety and quality.
Where Pith is reading between the lines
- This approach could be adapted to other types of generative models that rely on stochastic differential equations.
- Further research might explore combining EDDY with other guidance techniques for additional control over outputs.
- Optimizations to the kernel computations could enable real-time diverse sampling in interactive applications.
Load-bearing premise
The anti-symmetric pairwise matrix fields can be computed or approximated without introducing artifacts or bias that would alter the preserved marginal distributions.
What would settle it
Controlled experiments on standard benchmarks: if EDDY reduces diversity, or measurably shifts the empirical marginal away from the target distribution, the effectiveness of the method would be called into question.
Figures
Original abstract
We present EDDY (Exact-marginal Diversification via Divergence-free dYnamics), a guidance mechanism for diffusion and flow matching models that promotes diversity among samples generated while maintaining quality. EDDY exploits symmetries of the Fokker-Planck equation, using drift perturbations that change particle trajectories while preserving the evolving marginal distribution. We instantiate this principle through kernel-based anti-symmetric pairwise matrix fields, constructed from the repulsive directions. The resulting divergence-free dynamics promote diversity at the joint particle level while preserving each particle's marginal distribution without any additional training. As computing the guidance can be computationally expensive in cases such as text-to-image generation with perceptual embeddings, we propose practical approximations as an effective and efficient solution. Experiments on synthetic distributions and text-to-image generation show that EDDY improves diversity while maintaining strong distributional fidelity compared to common baselines.
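The marginal-preservation mechanism described in the abstract can be illustrated with the simplest member of this family: a constant anti-symmetric matrix S applied to the score. Since S is constant and anti-symmetric, ∇·(S∇ρ) = Σ S_ij ∂_i∂_j ρ = 0, so the perturbation u = S∇log ρ satisfies the weighted divergence-free condition ∇·(uρ) = 0 and leaves the marginal invariant. The toy Langevin sketch below (all names and constants are illustrative; the paper's kernel-based pairwise fields generalize this) adds a strong such perturbation and checks that the stationary marginal is unchanged:

```python
import numpy as np

# Overdamped Langevin targeting rho(x) ∝ exp(-|x|^2 / 2), i.e. N(0, I_2).
# Perturbed drift: score + lam * S @ score, with S constant anti-symmetric.
# Because div(S ∇rho) = S_ij ∂_i∂_j rho = 0, N(0, I) remains the stationary
# marginal no matter how large the swirl strength lam is.
rng = np.random.default_rng(1)
S = np.array([[0.0, 1.0], [-1.0, 0.0]])       # anti-symmetric matrix
lam, dt, steps = 2.0, 0.01, 3000
X = rng.normal(size=(4000, 2)) * 3.0          # particles started far from target

for _ in range(steps):
    score = -X                                # ∇ log N(0, I)
    drift = score + lam * score @ S.T         # add divergence-free perturbation
    X = X + dt * drift + np.sqrt(2 * dt) * rng.normal(size=X.shape)

print(np.round(X.mean(axis=0), 2))            # ≈ [0, 0]
print(np.round(np.cov(X.T), 2))               # ≈ identity, despite the swirl
```

Trajectories under the swirl differ markedly from the unperturbed ones, yet the empirical mean and covariance match the target, which is exactly the joint-versus-marginal separation the abstract claims.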
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces EDDY, a guidance mechanism for diffusion and flow-matching models that adds divergence-free drift perturbations derived from anti-symmetric pairwise kernel fields. These perturbations are designed to increase sample diversity at the joint level while exactly preserving the evolving marginal distribution of each particle, without requiring additional training. The method is instantiated via repulsive directions in perceptual embeddings and includes practical approximations for high-dimensional cases such as text-to-image generation. Experiments on synthetic distributions and real text-to-image tasks are reported to demonstrate improved diversity metrics while maintaining distributional fidelity relative to standard baselines.
Significance. If the central theoretical guarantee holds under the proposed approximations, EDDY would constitute a principled, training-free approach to diversity enhancement that directly exploits Fokker-Planck symmetries. This could be valuable for generative modeling pipelines where retraining is costly, particularly in perceptual domains. The explicit construction of anti-symmetric matrix fields and the emphasis on exact marginal preservation distinguish it from heuristic guidance methods.
major comments (2)
- Section 3.2, Eq. (8): the claim that the approximated anti-symmetric kernel field remains exactly divergence-free (or that the residual divergence integrates to zero over the reverse trajectory) is not accompanied by a quantitative error bound or empirical verification on the final marginal. Given that exact evaluation is stated to be prohibitive in high dimensions, a concrete demonstration that the practical approximation (e.g., truncated kernel or low-rank projection) preserves the zero-divergence property to within a controllable tolerance is required to support the central fidelity claim.
- Section 4.2, Table 2: the reported diversity gains on text-to-image tasks are presented without error bars across multiple random seeds or runs, and without a direct comparison of the empirical marginal distribution (e.g., via MMD or Wasserstein distance to the unguided baseline) that would confirm the approximation does not introduce detectable drift.
minor comments (2)
- Section 3.1: notation for the kernel matrix K and the anti-symmetric operator A should be introduced with an explicit definition before first use to avoid ambiguity.
- Section 4.1: the synthetic-distribution experiments would benefit from a plot of the empirical marginal density at the final time step overlaid with the target density to visually confirm preservation.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our work. We address each of the major comments in detail below and outline the revisions we will make to strengthen the manuscript.
Point-by-point responses
- Referee [Section 3.2, Eq. (8)]: the claim that the approximated anti-symmetric kernel field remains exactly divergence-free (or that the residual divergence integrates to zero over the reverse trajectory) is not accompanied by a quantitative error bound or empirical verification on the final marginal. Given that exact evaluation is stated to be prohibitive in high dimensions, a concrete demonstration that the practical approximation (e.g., truncated kernel or low-rank projection) preserves the zero-divergence property to within a controllable tolerance is required to support the central fidelity claim.
Authors: We appreciate the referee highlighting this important point regarding the approximations. By construction, the exact anti-symmetric pairwise kernel field is divergence-free: the divergence of the vector field induced by an anti-symmetric matrix field vanishes identically. For the practical approximations employed in high-dimensional cases, such as truncated kernels or low-rank projections, small residuals may arise. We agree that a quantitative error bound or empirical verification would better support the claims. In the revised manuscript, we will add a dedicated subsection analyzing the approximation error, including theoretical bounds on the residual divergence where feasible and empirical measurements on synthetic datasets where exact computation is possible. This will quantify the tolerance and reinforce the marginal-preservation property under the approximations used.
revision: yes
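The empirical measurement the rebuttal promises could take the following form: estimate the residual divergence of a drift field via Hutchinson's trace estimator, tr(J_u) ≈ E[vᵀ J_u v], with Jacobian-vector products taken by finite differences. The function and test fields below are illustrative stand-ins, not taken from the paper:

```python
import numpy as np

def divergence_estimate(u, x, n_probes=64, eps=1e-5, seed=0):
    """Hutchinson estimate of div u = tr(J_u) at x, using Rademacher probes
    and central finite-difference Jacobian-vector products."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=x.shape)        # Rademacher probe
        jvp = (u(x + eps * v) - u(x - eps * v)) / (2 * eps)
        total += v @ jvp                                 # vᵀ J_u v
    return total / n_probes

exact = lambda x: np.array([x[1], -x[0]])  # anti-symmetric Jacobian: divergence 0
leaky = lambda x: x                        # Jacobian = I: divergence = dim = 2

x0 = np.array([0.7, -1.3])
print(divergence_estimate(exact, x0))      # ≈ 0
print(divergence_estimate(leaky, x0))      # ≈ 2
```

Run over points sampled along the reverse trajectory, this gives the "controllable tolerance" the referee asks for: an approximation whose estimated divergence drifts away from zero is leaking probability mass.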
- Referee [Section 4.2, Table 2]: the reported diversity gains on text-to-image tasks are presented without error bars across multiple random seeds or runs, and without a direct comparison of the empirical marginal distribution (e.g., via MMD or Wasserstein distance to the unguided baseline) that would confirm the approximation does not introduce detectable drift.
Authors: We acknowledge that the presentation of results in Table 2 can be improved by including variability measures. The reported experiments were performed across multiple random seeds, and we will include error bars (standard deviations) in the updated table. Additionally, to directly address the concern about marginal preservation, we will report Maximum Mean Discrepancy (MMD) or Wasserstein distances between the distributions generated with and without EDDY guidance. These additions will provide empirical evidence that the approximations do not introduce significant drift in the marginal distributions.
revision: yes
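The MMD comparison promised above is straightforward to run on embedding vectors. A minimal sketch of the unbiased squared-MMD estimator with a Gaussian kernel (bandwidth, sample sizes, and the synthetic stand-in data are illustrative assumptions):

```python
import numpy as np

def mmd2_unbiased(X, Y, bw=1.0):
    """Unbiased estimate of squared MMD between sample sets X and Y
    under a Gaussian kernel with bandwidth bw."""
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2 * bw**2))
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    n, m = len(X), len(Y)
    np.fill_diagonal(Kxx, 0.0)     # drop i = j terms for unbiasedness
    np.fill_diagonal(Kyy, 0.0)
    return Kxx.sum() / (n * (n - 1)) + Kyy.sum() / (m * (m - 1)) - 2 * Kxy.mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))              # stand-in for unguided samples
Y_same = rng.normal(size=(500, 2))         # guided, marginal preserved
Y_shift = rng.normal(size=(500, 2)) + 1.0  # guided, marginal drifted
print(mmd2_unbiased(X, Y_same))            # ≈ 0
print(mmd2_unbiased(X, Y_shift))           # clearly > 0
```

A near-zero value between guided and unguided sample sets would be the direct evidence of no detectable marginal drift that the referee requests.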
Circularity Check
No circularity: derivation follows directly from Fokker-Planck symmetries via explicit anti-symmetric construction
Full rationale
The paper constructs the EDDY guidance drift from the Fokker-Planck equation by requiring the perturbation to satisfy a weighted divergence-free condition (∇·(u ρ)=0) so that the marginal evolution is unchanged. This condition is enforced by design through kernel-based anti-symmetric pairwise matrix fields built from repulsive directions; the construction is algebraic and does not presuppose the target diversity or marginal statistics. Practical approximations are introduced for computational reasons, but the core claim that exact anti-symmetric fields preserve the marginal is independent of any fitted parameter or self-referential definition. No load-bearing self-citation, uniqueness theorem, or ansatz smuggling appears in the provided derivation chain; the method is a direct, verifiable application of existing PDE symmetries to the reverse SDE.
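In symbols, with generic drift b_t and diffusion scale g_t (notation assumed here, not taken from the paper), the invariance argument is a one-line Fokker-Planck computation:

```latex
\partial_t \rho_t
  = -\nabla\!\cdot\!\big(b_t\,\rho_t\big) + \tfrac{g_t^2}{2}\,\Delta\rho_t
\;\xrightarrow{\,b_t \to b_t + u_t\,}\;
  -\nabla\!\cdot\!\big(b_t\,\rho_t\big)
  - \underbrace{\nabla\!\cdot\!\big(u_t\,\rho_t\big)}_{=\,0}
  + \tfrac{g_t^2}{2}\,\Delta\rho_t .
```

Whenever the perturbation satisfies the weighted condition ∇·(u_t ρ_t) = 0, ρ_t solves the same PDE before and after the perturbation, so the marginal law is untouched while individual trajectories (and hence joint particle statistics) change.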
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: symmetries of the Fokker-Planck equation allow construction of drift perturbations that preserve the evolving marginal distribution.