A Tutorial on Diffusion Theory: From Differential Equations to Diffusion Models

Jiayi Fu; Yuxia Wang

arxiv: 2605.22586 · v1 · pith:MF62BIOCnew · submitted 2026-05-21 · 💻 cs.LG · cs.CL

A Tutorial on Diffusion Theory: From Differential Equations to Diffusion Models

Jiayi Fu , Yuxia Wang This is my paper

Pith reviewed 2026-05-22 07:09 UTC · model grok-4.3

classification 💻 cs.LG cs.CL

keywords diffusion modelsscore matchingstochastic differential equationsreverse SDEprobability flow ODEDDPMDDIM

0 comments

The pith

The standard noise-prediction objective is equivalent to score matching up to an additive constant independent of the model parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This tutorial develops diffusion models by starting from a conditional Gaussian forward process that adds noise to data points. It demonstrates that this process can be represented as both an ODE and an SDE, and when averaged over the data distribution, these yield marginal dynamics that transport the data distribution to a standard Gaussian. The paper then derives the reverse SDE and reverse probability-flow ODE, both controlled by the marginal score function. A central result is the equivalence between the usual noise-prediction training objective and score matching, differing only by a parameter-independent constant. This matters for readers because it explains why various diffusion sampling methods work and how they relate to score-based generative modeling.

Core claim

The paper shows that marginalizing the conditional Gaussian forward process produces forward ODE and SDE formulations transporting p_data to N(0,I). The reverse-time dynamics consist of a reverse SDE and a probability-flow ODE, both governed by the marginal score grad log p_t(x). This setup yields a score estimation training objective, with the result that the standard noise-prediction objective equals score matching plus a constant independent of model parameters. DDPM and DDIM are shown to share this objective, with their samplers corresponding to discrete versions of the reverse SDE and reverse ODE respectively.

What carries the argument

The marginal score function grad log p_t(x) that drives both the reverse SDE and the reverse probability-flow ODE.

If this is right

The reverse dynamics can be simulated using numerical integrators to generate samples from the data distribution.
DDPM sampling corresponds to a discrete approximation of the reverse SDE.
DDIM sampling corresponds to a discrete approximation of the reverse probability-flow ODE.
Guided generation is achieved by modifying the score with classifier guidance or classifier-free guidance.
Higher-order solvers such as DPM-Solver can be applied to the reverse ODE for faster sampling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The shown equivalence implies that any advance in efficient score estimation can be transferred to improve diffusion model training without changing the objective.
Treating the diffusion process as a deterministic ODE may enable new sampling algorithms that avoid the variance of stochastic paths.
The differential equation perspective could be used to analyze convergence rates or design custom noising schedules beyond the standard ones.

Load-bearing premise

The forward noising process is a Gaussian conditional distribution that admits equivalent ODE and SDE representations with well-defined marginals over the data distribution.

What would settle it

A calculation that shows the noise-prediction loss and the score-matching loss differ by a term that depends on the parameters of the model being trained would disprove the equivalence.

Figures

Figures reproduced from arXiv: 2605.22586 by Jiayi Fu, Yuxia Wang.

**Figure 2.** Figure 2: Marginalized forward process: the initial state [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗

**Figure 3.** Figure 3: Reverse process: the reverse dynamics start from Gaussian noise and progressively denoise [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗

read the original abstract

This tutorial develops diffusion models from the viewpoint of differential equations. We begin with the conditional Gaussian forward process and show that this path admits both an ordinary differential equation (ODE) representation and a stochastic differential equation (SDE) representation. Averaging the conditional process over the data distribution then yields marginalized forward ODE and SDE formulations that transport the data distribution $p_0=p_{\mathrm{data}}$ to a Gaussian prior $p_1=\mathcal{N}(0,I)$. We next derive the corresponding reverse-time dynamics, namely the reverse SDE and the reverse probability-flow ODE, both of which are governed by the marginal score $\grad\log p_t(x)$. This leads to a training objective for score estimation and shows that the standard noise-prediction objective is equivalent to score matching up to an additive constant independent of the model parameters. We then discuss sampling methods for the learned reverse dynamics, including DPM-Solver, as well as guided sampling through classifier guidance and classifier-free guidance. Finally, we compare DDPM and DDIM with the reverse SDE/ODE framework and show that they share the same training objective, while DDPM sampling corresponds to discrete reverse-SDE sampling and DDIM sampling corresponds to reverse-ODE sampling.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This tutorial re-derives the standard SDE/ODE connections for diffusion models with clear steps but adds no new results or proofs.

read the letter

The main point is that this is a tutorial that starts from the conditional Gaussian forward process, shows it admits both ODE and SDE forms, marginalizes to get the data-to-noise transport, and then derives the reverse SDE and probability-flow ODE driven by the score. It correctly notes that the usual noise-prediction loss equals score matching plus a constant term independent of the model parameters, so the training objectives line up. Later parts cover DPM-Solver sampling, classifier and classifier-free guidance, and map DDPM to discrete reverse-SDE steps while DDIM maps to reverse-ODE steps. These links are presented in one connected narrative, which can save readers from chasing the same equivalences across multiple earlier papers. The derivations follow directly from the forward-process definition without circularity or invented quantities. The central claims line up with the existing literature the authors cite, and the stress-test note on the objective equivalence holds up because the extra term drops out of the gradient. No load-bearing contradictions appear in the outline. The soft spots are the expected ones for a tutorial: everything here is re-derivation rather than new mathematics or experiments, so the novelty is low and the scope stays explanatory. Without the full manuscript I cannot inspect every intermediate step or edge-case handling in the ODE/SDE transitions, but the abstract gives no sign of gaps or errors. The Gaussian conditional assumption is the standard one and is stated at the outset. This paper is aimed at readers who already know basic diffusion models and want the differential-equation framing spelled out in one place, or at students who prefer to see the score and noise-prediction views side by side. It is not aimed at experts hunting for advances. I would bring it to a reading group focused on generative-model foundations if the group wants a compact reference for the equivalences. I would not cite it in my own work. It deserves peer review for a venue that publishes tutorials or educational pieces, because the presentation is coherent and the math checks out on the points that are visible.

Referee Report

0 major / 3 minor

Summary. The paper is a tutorial that starts from the conditional Gaussian forward process and derives both its ODE and SDE representations. Marginalizing over the data distribution produces forward ODE/SDE dynamics that transport p_data to a standard Gaussian. The corresponding reverse-time SDE and probability-flow ODE are then obtained, both driven by the marginal score. This leads to a score-estimation training objective whose equivalence to the standard noise-prediction loss (up to a parameter-independent additive constant) is shown. The tutorial continues with sampling algorithms (including DPM-Solver), classifier and classifier-free guidance, and a comparison showing that DDPM corresponds to discrete reverse-SDE sampling while DDIM corresponds to reverse-ODE sampling, all sharing the same training objective.

Significance. If the derivations are accurate and clearly presented, the tutorial supplies a unified differential-equation perspective that connects the continuous SDE/ODE framework to the discrete DDPM/DDIM algorithms. The explicit demonstration that the noise-prediction objective differs from score matching only by an additive constant independent of model parameters is a standard but pedagogically useful result. The manuscript also supplies reproducible derivations and a consistent notation that could serve as a reference for newcomers to the field.

minor comments (3)

[§3] §3 (Reverse dynamics): the transition from the reverse SDE to the probability-flow ODE is stated without an explicit intermediate step showing how the diffusion term is removed; adding one line of algebra would improve readability.
[§4] §4 (Training objective): the claim that the constant term is independent of model parameters is correct but would benefit from a short parenthetical reminder that it equals E[||true score||²] evaluated under the marginal.
[§6] §6 (Comparison with DDPM/DDIM): the statement that both methods share the same training objective is accurate, yet the discrete-time indexing conventions (t = 0 … T versus continuous t ∈ [0,1]) are not aligned in a single equation; a small table mapping the two would eliminate potential confusion.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their accurate and positive summary of the manuscript, for highlighting its potential utility as a reference for newcomers, and for recommending minor revision. We appreciate the recognition that the explicit equivalence between the noise-prediction objective and score matching (differing only by a parameter-independent constant) is pedagogically useful, and that the connections between continuous SDE/ODE dynamics and discrete DDPM/DDIM sampling are clearly presented.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The tutorial derives the equivalence of the noise-prediction objective to score matching directly from the Gaussian conditional forward process and its marginalization, with the additive constant term shown to be independent of model parameters via explicit expansion of the loss. All central steps follow from the initial definitions of the forward ODE/SDE and the marginal score without any parameter fitting inside the paper, self-referential definitions, or load-bearing self-citations that reduce the result to its own inputs. The derivation is self-contained against the stated assumptions on the forward process.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The tutorial rests on standard properties of Gaussian processes and Itô calculus rather than introducing new fitted quantities or entities. No free parameters are introduced to support a novel claim.

axioms (2)

domain assumption The conditional forward process is Gaussian and admits both an ODE and an SDE representation.
Stated in the opening paragraph of the abstract as the starting point for all subsequent derivations.
domain assumption Averaging the conditional process over the data distribution yields well-defined marginal forward ODE and SDE that transport p_data to N(0,I).
Invoked immediately after the conditional-process statement to obtain the marginal dynamics.

pith-pipeline@v0.9.0 · 5744 in / 1288 out tokens · 36979 ms · 2026-05-22T07:09:09.805526+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

shows that the standard noise-prediction objective is equivalent to score matching up to an additive constant independent of the model parameters
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean LogicNat recovery unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

conditional Gaussian forward kernel pt(x|x0) := N(x; αt x0, σ²t I)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 4 internal anchors

[1]

Brian D. O. Anderson. Reverse-time diffusion equation models.Stochastic Processes and their Applications, 12(3):313–326, 1982

work page 1982
[2]

Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural ordinary differential equations. InAdvances in Neural Information Processing Systems, 2018

work page 2018
[3]

Diffusion models beat GANs on image synthesis

Prafulla Dhariwal and Alexander Nichol. Diffusion models beat GANs on image synthesis. In Advances in Neural Information Processing Systems, 2021

work page 2021
[4]

Denoising diffusion probabilistic models

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, 2020

work page 2020
[5]

Classifier-Free Diffusion Guidance

Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance.arXiv preprint arXiv:2207.12598, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[6]

An introduction to flow matching and diffusion models

Peter Holderrieth and Ezra Erives. An introduction to flow matching and diffusion models. arXiv, 2025

work page 2025
[7]

Estimation of non-normalized statistical models by score matching.Journal of Machine Learning Research, 6:695–709, 2005

Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching.Journal of Machine Learning Research, 6:695–709, 2005

work page 2005
[8]

Elucidating the design space of diffusion-based generative models

Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. InAdvances in Neural Information Processing Systems, 2022

work page 2022
[9]

Kingma, Tim Salimans, Ben Poole, and Jonathan Ho

Diederik P. Kingma, Tim Salimans, Ben Poole, and Jonathan Ho. Variational diffusion models. InAdvances in Neural Information Processing Systems, 2021

work page 2021
[10]

Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. InInternational Conference on Learning Representations, 2023

work page 2023
[11]

DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps

Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. InAdvances in Neural Information Processing Systems, 2022

work page 2022
[12]

DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models

Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM- Solver++: Fast solver for guided sampling of diffusion probabilistic models.arXiv preprint arXiv:2211.01095, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022
[13]

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jue Wang, and Stefano Ermon. SDEdit: Guided image synthesis and editing with stochastic differential equations.arXiv preprint arXiv:2108.01073, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021
[14]

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models.arXiv preprint arXiv:2112.10741, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021
[15]

Improved denoising diffusion probabilistic models

Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. InInternational Conference on Machine Learning, 2021. 52

work page 2021
[16]

Springer, 6th edition, 2003

Bernt Øksendal.Stochastic Differential Equations: An Introduction with Applications. Springer, 6th edition, 2003

work page 2003
[17]

Springer, 2nd edition, 1996

Hannes Risken.The Fokker–Planck Equation: Methods of Solution and Applications. Springer, 2nd edition, 1996

work page 1996
[18]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

work page 2022
[19]

Photorealistic text-to-image diffusion models with deep language understanding

Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, Shiki Sagawa, Maithra Raghu, et al. Photorealistic text-to-image diffusion models with deep language understanding. InAdvances in Neural Information Processing Systems, 2022

work page 2022
[20]

Progressive distillation for fast sampling of diffusion models

Tim Salimans and Jonathan Ho. Progressive distillation for fast sampling of diffusion models. InInternational Conference on Learning Representations, 2022

work page 2022
[21]

Deep unsuper- vised learning using nonequilibrium thermodynamics

Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsuper- vised learning using nonequilibrium thermodynamics. InProceedings of the 32nd International Conference on Machine Learning, 2015

work page 2015
[22]

Denoising diffusion implicit models

Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations, 2021

work page 2021
[23]

Generative modeling by estimating gradients of the data distribution

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. InAdvances in Neural Information Processing Systems, 2019

work page 2019
[24]

Improved techniques for training score-based generative models

Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. InAdvances in Neural Information Processing Systems, 2020

work page 2020
[25]

Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole

Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2021

work page 2021
[26]

A connection between score matching and denoising autoencoders.Neural Computation, 23(7):1661–1674, 2011

Pascal Vincent. A connection between score matching and denoising autoencoders.Neural Computation, 23(7):1661–1674, 2011. 53

work page 2011

[1] [1]

Brian D. O. Anderson. Reverse-time diffusion equation models.Stochastic Processes and their Applications, 12(3):313–326, 1982

work page 1982

[2] [2]

Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, and David Duvenaud. Neural ordinary differential equations. InAdvances in Neural Information Processing Systems, 2018

work page 2018

[3] [3]

Diffusion models beat GANs on image synthesis

Prafulla Dhariwal and Alexander Nichol. Diffusion models beat GANs on image synthesis. In Advances in Neural Information Processing Systems, 2021

work page 2021

[4] [4]

Denoising diffusion probabilistic models

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, 2020

work page 2020

[5] [5]

Classifier-Free Diffusion Guidance

Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance.arXiv preprint arXiv:2207.12598, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022

[6] [6]

An introduction to flow matching and diffusion models

Peter Holderrieth and Ezra Erives. An introduction to flow matching and diffusion models. arXiv, 2025

work page 2025

[7] [7]

Estimation of non-normalized statistical models by score matching.Journal of Machine Learning Research, 6:695–709, 2005

Aapo Hyvärinen. Estimation of non-normalized statistical models by score matching.Journal of Machine Learning Research, 6:695–709, 2005

work page 2005

[8] [8]

Elucidating the design space of diffusion-based generative models

Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. InAdvances in Neural Information Processing Systems, 2022

work page 2022

[9] [9]

Kingma, Tim Salimans, Ben Poole, and Jonathan Ho

Diederik P. Kingma, Tim Salimans, Ben Poole, and Jonathan Ho. Variational diffusion models. InAdvances in Neural Information Processing Systems, 2021

work page 2021

[10] [10]

Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. InInternational Conference on Learning Representations, 2023

work page 2023

[11] [11]

DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps

Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. InAdvances in Neural Information Processing Systems, 2022

work page 2022

[12] [12]

DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models

Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM- Solver++: Fast solver for guided sampling of diffusion probabilistic models.arXiv preprint arXiv:2211.01095, 2022

work page internal anchor Pith review Pith/arXiv arXiv 2022

[13] [13]

SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jue Wang, and Stefano Ermon. SDEdit: Guided image synthesis and editing with stochastic differential equations.arXiv preprint arXiv:2108.01073, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021

[14] [14]

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

Alex Nichol, Prafulla Dhariwal, Aditya Ramesh, Pranav Shyam, Pamela Mishkin, Bob McGrew, Ilya Sutskever, and Mark Chen. GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models.arXiv preprint arXiv:2112.10741, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021

[15] [15]

Improved denoising diffusion probabilistic models

Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. InInternational Conference on Machine Learning, 2021. 52

work page 2021

[16] [16]

Springer, 6th edition, 2003

Bernt Øksendal.Stochastic Differential Equations: An Introduction with Applications. Springer, 6th edition, 2003

work page 2003

[17] [17]

Springer, 2nd edition, 1996

Hannes Risken.The Fokker–Planck Equation: Methods of Solution and Applications. Springer, 2nd edition, 1996

work page 1996

[18] [18]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

work page 2022

[19] [19]

Photorealistic text-to-image diffusion models with deep language understanding

Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, Shiki Sagawa, Maithra Raghu, et al. Photorealistic text-to-image diffusion models with deep language understanding. InAdvances in Neural Information Processing Systems, 2022

work page 2022

[20] [20]

Progressive distillation for fast sampling of diffusion models

Tim Salimans and Jonathan Ho. Progressive distillation for fast sampling of diffusion models. InInternational Conference on Learning Representations, 2022

work page 2022

[21] [21]

Deep unsuper- vised learning using nonequilibrium thermodynamics

Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsuper- vised learning using nonequilibrium thermodynamics. InProceedings of the 32nd International Conference on Machine Learning, 2015

work page 2015

[22] [22]

Denoising diffusion implicit models

Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations, 2021

work page 2021

[23] [23]

Generative modeling by estimating gradients of the data distribution

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. InAdvances in Neural Information Processing Systems, 2019

work page 2019

[24] [24]

Improved techniques for training score-based generative models

Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. InAdvances in Neural Information Processing Systems, 2020

work page 2020

[25] [25]

Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole

Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2021

work page 2021

[26] [26]

A connection between score matching and denoising autoencoders.Neural Computation, 23(7):1661–1674, 2011

Pascal Vincent. A connection between score matching and denoising autoencoders.Neural Computation, 23(7):1661–1674, 2011. 53

work page 2011