Classifier-Free Diffusion Guidance
Pith reviewed 2026-05-10 14:55 UTC · model grok-4.3
The pith
Conditional diffusion models can guide their own sampling by combining conditional and unconditional score estimates without needing a separate classifier.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We show that guidance can be indeed performed by a pure generative model without such a classifier: in what we call classifier-free guidance, we jointly train a conditional and an unconditional diffusion model, and we combine the resulting conditional and unconditional score estimates to attain a trade-off between sample quality and diversity similar to that obtained using classifier guidance.
What carries the argument
Classifier-free guidance: the scaled difference between the conditional score estimate and the unconditional score estimate, used to steer the reverse diffusion process at sampling time.
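A minimal sketch of that combination, assuming the model exposes its conditional and unconditional outputs as noise predictions of the same shape. The function name `cfg_combine` and the argument names are hypothetical, and the parameterization (conditional estimate plus `w` times the difference) is one common way to write the rule rather than a transcription of the authors' code.

```python
import numpy as np

def cfg_combine(eps_cond, eps_uncond, w):
    """Combine conditional and unconditional estimates with guidance scale w.

    Computes eps_cond + w * (eps_cond - eps_uncond), which is algebraically
    the same as (1 + w) * eps_cond - w * eps_uncond.
    With w = 0 this reduces to the plain conditional estimate.
    """
    eps_cond = np.asarray(eps_cond, dtype=float)
    eps_uncond = np.asarray(eps_uncond, dtype=float)
    return eps_cond + w * (eps_cond - eps_uncond)

# Illustrative values only; real inputs would be network outputs.
guided = cfg_combine([1.0, 2.0], [0.5, 1.0], w=1.0)
print(guided)
```

Because the scale enters only at this step, it can be changed at sampling time without retraining.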
If this is right
- Only one model needs to be trained to enable both conditional generation and guidance.
- The guidance scale remains adjustable after training, just as with classifier guidance.
- No auxiliary classifier or its training data is required.
- The same sampling procedure yields controllable fidelity-diversity trade-offs.
Where Pith is reading between the lines
- Guidance emerges from having access to both conditional and unconditional distributions inside the same generative model rather than from external class supervision.
- The method lowers the setup cost for high-fidelity conditional sampling in settings where reliable classifiers are expensive or unavailable.
- It suggests that unconditional score estimates already encode information useful for directing conditional trajectories.
Load-bearing premise
That linearly combining the conditional and unconditional score estimates produces guidance behavior comparable to classifier gradients without introducing new artifacts or mode collapse.
What would settle it
Train the joint model on a standard dataset such as ImageNet, then compare the quality-diversity curves obtained by varying the guidance scale against those from a separately trained classifier; mismatch at high scales would falsify equivalence.
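The effect of varying the guidance scale can be previewed on a toy case before running the full experiment. The sketch below assumes, purely for tractability, a one-dimensional conditional density N(mu_c, 1) and an unconditional density N(0, var_u); neither is taken from the paper. Under that assumption the guided density, proportional to p(z|c)^(1+w) / p(z)^w, is again Gaussian, so its mean and variance have closed forms. The function name `guided_gaussian` is hypothetical.

```python
def guided_gaussian(mu_c, var_u, w):
    """Mean and variance of the guided density in a 1-D Gaussian toy model.

    Assumes p(z|c) = N(mu_c, 1) and p(z) = N(0, var_u) (toy assumptions).
    The guided log-density (1 + w) * log p(z|c) - w * log p(z) is quadratic
    in z, with precision (1 + w) - w / var_u.
    """
    precision = (1.0 + w) - w / var_u
    if precision <= 0.0:
        raise ValueError("guided density is not normalizable for these inputs")
    mean = (1.0 + w) * mu_c / precision
    return mean, 1.0 / precision

# Sweep a few illustrative guidance scales.
for w in (0.0, 1.0, 3.0):
    mean, var = guided_gaussian(mu_c=1.0, var_u=4.0, w=w)
    print(f"w={w}: mean={mean:.4f}, variance={var:.4f}")
```

When the unconditional density is broader than the conditional one (var_u > 1), the variance shrinks as w grows and the mean moves away from the unconditional mean, which is the toy analogue of trading diversity for fidelity.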
Original abstract
Classifier guidance is a recently introduced method to trade off mode coverage and sample fidelity in conditional diffusion models post training, in the same spirit as low temperature sampling or truncation in other types of generative models. Classifier guidance combines the score estimate of a diffusion model with the gradient of an image classifier and thereby requires training an image classifier separate from the diffusion model. It also raises the question of whether guidance can be performed without a classifier. We show that guidance can be indeed performed by a pure generative model without such a classifier: in what we call classifier-free guidance, we jointly train a conditional and an unconditional diffusion model, and we combine the resulting conditional and unconditional score estimates to attain a trade-off between sample quality and diversity similar to that obtained using classifier guidance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that guidance in conditional diffusion models can be performed without a separate classifier by jointly training a single model on conditional and unconditional objectives (via random condition dropout) and linearly combining the resulting conditional and unconditional score estimates at sampling time, achieving a quality-diversity trade-off comparable to classifier guidance.
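The random condition dropout mentioned in the summary can be sketched as a small preprocessing step on each training batch. This is an illustration under stated assumptions, not the authors' implementation: the sentinel `NULL_LABEL`, the function name `drop_conditions`, and the value of `p_uncond` used below are all hypothetical.

```python
import numpy as np

NULL_LABEL = -1  # hypothetical sentinel meaning "no conditioning"

def drop_conditions(labels, p_uncond, rng):
    """Replace each label with NULL_LABEL independently with probability p_uncond.

    Examples whose label is dropped train the unconditional objective;
    the rest train the conditional objective, all within one model.
    Returns the (possibly modified) labels and the boolean drop mask.
    """
    labels = np.asarray(labels)
    drop = rng.random(labels.shape[0]) < p_uncond
    return np.where(drop, NULL_LABEL, labels), drop

rng = np.random.default_rng(0)
batch_labels = np.array([3, 7, 1, 4, 9, 2])
train_labels, dropped = drop_conditions(batch_labels, p_uncond=0.2, rng=rng)
print(train_labels, dropped)
```

At sampling time the same network would be queried twice per step, once with the real label and once with the null label, to obtain the two estimates that are then combined.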
Significance. If the empirical results hold, the work is significant for simplifying the training pipeline of conditional diffusion models, eliminating the need for an auxiliary classifier, and providing a practical inference-time control mechanism. The joint training procedure and the score-difference identity enable this, with ImageNet experiments demonstrating similar FID/IS trade-offs across guidance scales; this has become a foundational technique in the field.
minor comments (3)
- §3.1, Eq. (5): the guidance formula is clearly derived, but the text should explicitly note that the linear combination is an approximation whose fidelity depends on the diffusion timestep and noise schedule, to better ground the 'similar trade-off' claim.
- Table 1: the caption does not specify the exact number of generated samples used for FID computation or whether the same random seed protocol was used across guidance scales, which affects reproducibility of the reported trade-offs.
- Figure 4: the y-axis scale for diversity metrics is not labeled consistently with the main text, making it difficult to directly compare the classifier-free and classifier-guided curves.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work and the recommendation for minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity detected
full rationale
The paper introduces a new joint training procedure (condition dropout to learn both conditional and unconditional diffusion scores) and an inference-time linear combination rule. The guidance formula is derived directly from the Bayes identity relating score differences to classifier gradients, which is a mathematical fact independent of the present work. No step reduces a claimed prediction to a fitted input by construction, no load-bearing uniqueness theorem is imported via self-citation, and the central claim is tested empirically on ImageNet rather than asserted by redefinition or renaming. The derivation chain is self-contained against external benchmarks.
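The Bayes identity referred to above can be written out as follows. The notation here (z for the noisy sample, c for the condition, w for the guidance scale) is chosen for this sketch and may differ from the paper's; the identity itself holds for the exact scores, and applying it to learned estimates is an approximation.

```latex
% Bayes' rule: p(c | z) = p(z | c) p(c) / p(z); p(c) does not depend on z.
\nabla_z \log p(c \mid z) = \nabla_z \log p(z \mid c) - \nabla_z \log p(z)

% Adding w times the classifier gradient to the conditional score
% therefore needs no classifier:
\nabla_z \log p(z \mid c) + w \, \nabla_z \log p(c \mid z)
  = (1 + w) \, \nabla_z \log p(z \mid c) - w \, \nabla_z \log p(z)
```

The right-hand side of the second line involves only the conditional and unconditional scores, which is what lets a jointly trained generative model stand in for the classifier gradient.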
Axiom & Free-Parameter Ledger
free parameters (1)
- guidance scale
axioms (1)
- domain assumption: A diffusion model can be trained jointly on conditional and unconditional objectives without one degrading the other.
Lean theorems connected to this paper
-
IndisputableMonolith.Cost.FunctionalEquation · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
we jointly train a conditional and an unconditional diffusion model, and we combine the resulting conditional and unconditional score estimates to attain a trade-off between sample quality and diversity similar to that obtained using classifier guidance
-
IndisputableMonolith.Foundation.DAlembert.Inevitability · bilinear_family_forced · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
classifier guidance combines the score estimate of a diffusion model with the gradient of an image classifier
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 60 Pith papers
-
Autoregressive Learning in Joint KL: Sharp Oracle Bounds and Lower Bounds
Joint KL yields horizon-free approximation but an information-theoretic lower bound of order Omega(H) for estimation error in autoregressive learning, with matching computationally efficient upper bounds.
-
Inference-Time Refinement Closes the Synthetic-Real Gap in Tabular Diffusion
Inference-time refinement of pre-trained tabular diffusion models via Bidirectional Chamfer Refinement achieves median 8.6% better downstream performance than real data across 15 benchmarks while preserving fidelity a...
-
A Priori Sampling of Transition States with Guided Diffusion
ASTRA reframes transition-state search as guided diffusion inference that samples the isodensity surface between metastable basins and converges to first-order saddles via score differences and physical forces.
-
Large Language Diffusion Models
LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.
-
Guided Diffusion Sampling for Precipitation Forecast Interventions
Gradient-guided diffusion sampling reduces extreme precipitation forecasts in data-driven weather models while producing more physically plausible changes than adversarial perturbations.
-
Diagnosing and Correcting Concept Omission in Multimodal Diffusion Transformers
Text embeddings in MM-DiTs contain a detectable omission signal for missing concepts, and amplifying it via OSI reduces concept omission in generated images on FLUX.1-Dev and SD3.5-Medium.
-
R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow
R-DMesh generates high-fidelity 4D meshes aligned to video by disentangling base mesh, motion, and a learned rectification jump offset inside a VAE, then using Triflow Attention and rectified-flow diffusion.
-
HIR-ALIGN: Enhancing Hyperspectral Image Restoration via Diffusion-Based Data Generation
HIR-ALIGN augments limited target data for hyperspectral restoration by creating proxy clean images, synthesizing aligned HSIs with blur-robust diffusion and warp-based transfer, then finetuning models to lower target...
-
Stylized Text-to-Motion Generation via Hypernetwork-Driven Low-Rank Adaptation
A hypernetwork maps style motion embeddings to LoRA updates that stylize text-driven motion diffusion models with improved generalization to unseen styles via contrastive structuring of the style space.
-
Margin-calibrated Classifier Guidance for Property-driven Synthesis Planning
Margin-calibrated classifier guidance via Sequence Completion Ranking raises multi-step retrosynthesis solve rates from 16.8% to 95.3% on USPTO-190 and unlocks previously unsolvable targets.
-
Amortized Guidance for Image Inpainting with Pretrained Diffusion Models
AID amortizes guidance for diffusion inpainting by training a reusable module via an auxiliary Gaussian formulation and continuous-time actor-critic algorithm, improving quality-speed trade-off with under 1% overhead.
-
DirectTryOn: One-Step Virtual Try-On via Straightened Conditional Transport
DirectTryOn achieves state-of-the-art one-step virtual try-on performance by applying pure conditional transport, garment preservation loss, and self-consistency loss to straighten trajectories in pretrained generativ...
-
Generative Motion In-betweening by Diffusion over Continuous Implicit Representations
A latent diffusion model over continuous implicit neural representations samples INR parameters from sparse keyframes to reconstruct plausible, smooth, and diverse motions while preserving keyframe accuracy.
-
Constraint-Aware Flow Matching: Decision Aligned End-to-End Training for Constrained Sampling
Constraint-Aware Flow Matching integrates constraint projections into the flow matching training objective to align model dynamics with constrained sampling and reduce distributional shift.
-
MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving
MindVLA-U1 introduces a unified streaming VLA with shared backbone, framewise memory, and language-guided action diffusion that surpasses human drivers on WOD-E2E planning metrics.
-
Is Monotonic Sampling Necessary in Diffusion Models?
Non-monotonic sampling schedules never improve upon monotonic baselines in diffusion models, with performance gaps ranging from substantial to negligible depending on the denoiser.
-
One-Step Generative Modeling via Wasserstein Gradient Flows
W-Flow achieves state-of-the-art one-step ImageNet 256x256 generation at 1.29 FID by training a static neural network to follow a Wasserstein gradient flow that minimizes Sinkhorn divergence, delivering roughly 100x f...
-
ScaleMoGen: Autoregressive Next-Scale Prediction for Human Motion Generation
ScaleMoGen introduces a scale-wise autoregressive framework that quantizes motions into hierarchical discrete tokens and predicts next-scale maps to achieve SOTA FID 0.030 on HumanML3D and text-guided editing.
-
Single-Shot HDR Recovery via a Video Diffusion Prior
Single-shot HDR is achieved by conditioning a video diffusion model on an LDR input to generate an exposure bracket and fusing the bracket with per-pixel weights from a lightweight UNet.
-
Coordinated Diffusion: Generating Multi-Agent Behavior Without Multi-Agent Demonstrations
CoDi decomposes the multi-agent diffusion score into pre-trained single-agent policies plus a gradient-free cost guidance term to generate coordinated behavior from single-agent data alone.
-
Efficient Adjoint Matching for Fine-tuning Diffusion Models
EAM speeds up adjoint matching for diffusion model reward fine-tuning by switching to linear base drift, allowing deterministic few-step solvers and closed-form adjoints with up to 4x faster convergence on text-to-ima...
-
Composing diffusion priors with explicit physical context via generative Gibbs sampling
GG-PA composes diffusion priors with physical context via a derived Gibbs sampler that is asymptotically exact as diffusion time approaches zero and exact at finite times for quadratic interactions.
-
Muninn: Your Trajectory Diffusion Model But Faster
Muninn accelerates diffusion trajectory planners up to 4.6x by spending an uncertainty budget to decide when to cache denoiser outputs, preserving performance and certifying bounded deviation from full computation.
-
Inverse Design for Conditional Distribution Matching
Defines Conditional Distribution Matching (CDM) as finding inputs whose induced conditional distributions match a target distribution and proposes the MLGD-F inference-time algorithm using pretrained diffusion models ...
-
Remix the Timbre: Diffusion-Based Style Transfer Across Polyphonic Stems
MixtureTT performs direct per-stem timbre transfer on polyphonic mixtures via a shared diffusion transformer, outperforming single-stem baselines on SATB choral data while eliminating cascaded separation errors.
-
TMPO: Trajectory Matching Policy Optimization for Diverse and Efficient Diffusion Alignment
TMPO uses Softmax Trajectory Balance to match policy probabilities over multiple trajectories to a Boltzmann reward distribution, improving diversity by 9.1% in diffusion alignment tasks.
-
TMPO: Trajectory Matching Policy Optimization for Diverse and Efficient Diffusion Alignment
TMPO replaces scalar reward maximization with trajectory-level matching to a Boltzmann distribution via Softmax-TB, improving generative diversity by 9.1% while keeping competitive reward performance.
-
TARO: Temporal Adversarial Rectification Optimization Using Diffusion Models as Purifiers
TARO builds a temporally guided score prior from high-noise and low-noise diffusion views to purify adversarial examples more robustly than uniform timestep methods.
-
Guidance Is Not a Hyperparameter: Learning Dynamic Control in Diffusion Language Models
Adaptive guidance trajectories learned via PPO outperform fixed-scale CFG on controllability-quality balance in three controlled NLP generation tasks with discrete diffusion models.
-
OphEdit: Training-Free Text-Guided Editing of Ophthalmic Surgical Videos
OphEdit enables text-guided editing of eye surgery videos without training by injecting preserved attention value tensors into the diffusion denoising process to maintain anatomical structure.
-
Test-Time Compositional Generalization in Diffusion Models via Concept Discovery
Diffusion models can extract reusable density-mode concepts from their time-indexed scores to enable compositional generation at test time on held-out benchmarks from ColorMNIST and CelebA.
-
DCR: Counterfactual Attractor Guidance for Rare Compositional Generation
DCR uses a counterfactual attractor and projection-based repulsion to suppress default completion bias in diffusion models, improving fidelity for rare compositional prompts while preserving quality.
-
A Flow Matching Algorithm for Many-Shot Adaptation to Unseen Distributions
FP-FM adapts flow matching models to unseen distributions via least-squares projection onto basis functions spanning training velocity fields, yielding improved precision and recall without inference-time training.
-
Autoregressive Visual Generation Needs a Prologue
Prologue introduces dedicated prologue tokens to decouple generation and reconstruction in AR visual models, significantly improving generation FID scores on ImageNet while maintaining reconstruction quality.
-
MaMi-HOI: Harmonizing Global Kinematics and Local Geometry for Human-Object Interaction Generation
MaMi-HOI counters geometric forgetting in diffusion models via a Geometry-Aware Proximity Adapter for precise contacts and a Kinematic Harmony Adapter for natural whole-body postures in human-object interactions.
-
Concurrence of Symmetry Breaking and Nonlocality Phase Transitions in Diffusion Models
Symmetry breaking and nonlocality phase transitions occur nearly simultaneously during diffusion model generation in modern transformers.
-
DMGD: Train-Free Dataset Distillation with Semantic-Distribution Matching in Diffusion Models
DMGD achieves better performance than fine-tuned SOTA methods in dataset distillation on ImageNet subsets by using semantic matching through conditional likelihood optimization and OT-based distribution matching in a ...
-
FluxFlow: Conservative Flow-Matching for Astronomical Image Super-Resolution
FluxFlow is a conservative pixel-space flow-matching framework for astronomical super-resolution that incorporates real atmospheric uncertainty and a training-free Wiener correction, outperforming baselines on a new 1...
-
Tempered Guided Diffusion
Tempered Guided Diffusion uses annealed SMC to produce consistent particle approximations to the posterior for training-free conditional diffusion sampling, outperforming independent guided trajectories in experiments.
-
AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics
AniMatrix generates anime videos by structuring artistic production rules into a controllable taxonomy and training the model to prioritize those rules over physical realism, achieving top scores from professional ani...
-
AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics
AniMatrix generates anime videos using a production knowledge taxonomy, dual-channel conditioning, style-motion curriculum, and deformation-aware preference optimization, outperforming baselines in animator evaluation...
-
AniMatrix: An Anime Video Generation Model that Thinks in Art, Not Physics
AniMatrix generates anime videos using a structured taxonomy of artistic production variables, dual-channel conditioning, a style-motion curriculum, and deformation-aware optimization to prioritize art over physics.
-
DirectEdit: Step-Level Accurate Inversion for Flow-Based Image Editing
DirectEdit achieves step-level accurate inversion for flow-based image editing by directly aligning forward paths, using attention feature injection and mask-guided noise blending to balance fidelity and editability w...
-
Structured Diffusion Bridges: Inductive Bias for Denoising Diffusion Bridges
Structured diffusion bridges with alignment constraints achieve near fully-paired quality in modality translation while working effectively in unpaired and semi-paired regimes.
-
VAnim: Rendering-Aware Sparse State Modeling for Structure-Preserving Vector Animation
VAnim creates open-domain text-to-SVG animations via sparse state updates on a persistent DOM tree, identification-first planning, and rendering-aware RL with a new 134k-example benchmark.
-
Focus on the Core: Empowering Diffusion Large Language Models by Self-Contrast
FoCore uses self-contrast on early-converging high-density tokens to boost diffusion LLM quality on reasoning benchmarks while cutting decoding steps by over 2x.
-
ScribbleEdit: Synthetic Data for Image Editing with Scribbles and Text
ScribbleEdit is a synthetic dataset combining scribbles and text for training image editing models that produce spatially aligned and semantically consistent results.
-
Posterior Augmented Flow Matching
PAFM augments flow matching with an importance-sampled mixture over an approximate posterior of target completions, yielding an unbiased lower-variance estimator that improves FID by up to 3.4 on ImageNet and CC12M.
-
Watch Your Step: Information Injection in Diffusion Models via Shadow Timestep Embedding
Timestep embeddings in diffusion models function as a separable side channel that can carry dedicated information for adversarial injection or detection.
-
FieryGS: In-the-Wild Fire Synthesis with Physics-Integrated Gaussian Splatting
FieryGS integrates LLM-based material reasoning, volumetric combustion simulation, and a unified renderer with 3D Gaussian Splatting to generate physically plausible and user-controllable fire in in-the-wild scenes.
-
AMGenC: Generating Charge Balanced Amorphous Materials
AMGenC generates guaranteed charge-balanced amorphous materials using element noise initialization combined with per-step soft and final discrete projections in a generative model.
-
ResetEdit: Precise Text-guided Editing of Generated Image via Resettable Starting Latent
ResetEdit embeds a recoverable discrepancy signal during image generation in diffusion models to reconstruct an approximate original latent for high-fidelity text-guided editing.
-
Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling
Talker-T2AV achieves better lip-sync accuracy, video quality, and audio quality than dual-branch baselines by separating high-level shared autoregressive modeling from modality-specific low-level diffusion refinement ...
-
Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization
Oracle Noise optimizes diffusion model noise on a Riemannian hypersphere guided by key prompt words to preserve the Gaussian prior, eliminate norm inflation, and achieve faster semantic alignment than Euclidean methods.
-
$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models
Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a di...
-
CODA: Coordination via On-Policy Diffusion for Multi-Agent Offline Reinforcement Learning
CODA augments offline multi-agent RL with on-policy diffusion trajectories that evolve with the joint policy to enable coordination.
-
DCMorph: Face Morphing via Dual-Stream Cross-Attention Diffusion
DCMorph generates face morphs via decoupled cross-attention in identity-conditioned diffusion and DDIM spherical interpolation, achieving higher attack success rates on four face recognition systems than prior methods...
-
TacticGen: Grounding Adaptable and Scalable Generation of Football Tactics
TacticGen generates realistic, adaptable football tactics via a multi-agent diffusion transformer trained on 3.3M events and 100M frames, supporting rule-, language-, or model-based guidance at inference time.
-
DanceCrafter: Fine-Grained Text-Driven Controllable Dance Generation via Choreographic Syntax
DanceCrafter generates high-fidelity, text-controlled dance sequences using a new Choreographic Syntax framework and a large fine-grained motion dataset.
-
ScenarioControl: Vision-Language Controllable Vectorized Latent Scenario Generation
ScenarioControl introduces the first vision-language controllable generator for realistic vectorized 3D driving scenarios with temporal consistency across actor views.