Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851
Canonical reference. 80% of citing Pith papers cite this work as background.
citing papers explorer
- Conditioning Gaussian Processes on Almost Anything
  Equivalence between Gaussian processes and linear diffusion models enables general conditioning on arbitrary pointwise likelihoods via ODE dynamics and Monte Carlo guidance approximation.
- Action-Inspired Generative Models
  AGMs use a lightweight learned potential V_phi with stop-gradient to selectively weight informative bridge samples in generative model training, yielding better fidelity and coverage.
- Composing diffusion priors with explicit physical context via generative Gibbs sampling
  GG-PA composes diffusion priors with physical context via a derived Gibbs sampler that is asymptotically exact as diffusion time approaches zero and exact at finite times for quadratic interactions.
- Hierarchical Variational Policies for Reward-Guided Diffusion
  A hierarchical variational formulation amortizes test-time guidance in diffusion models to achieve strong quality-speed tradeoffs with significantly reduced inference compute.
- Accelerating Video Inverse Problem Solvers with Autoregressive Diffusion Models
  AVIS applies autoregressive diffusion models to video inverse problems by streaming restoration with measurement-consistent initialization, reducing latency from 114s to 4s and raising throughput to 1.18 FPS (or 5.91 FPS in the Flash variant).
- Discrete Stochastic Localization for Non-autoregressive Generation
  DSL provides a continuous embedding framework where one denoiser supports a family of SNR paths for discrete sequences, improving MAUVE scores on OpenWebText and allowing random-order and hybrid sampling from a fine-tuned MDLM checkpoint.
- PGID: Progressive Guided Inversion and Denoising for Robust Watermark Detection
  PGID restores watermark detection in diffusion models by using progressive inversion-denoising cycles to correct latents displaced by removal or forgery attacks.
- Toward Better Geometric Representations for Molecule Generative Models
  LENSEs improves representation-conditioned molecule generation by jointly training a multi-level representation head, perceptual loss, and REPA alignment on pretrained encoders, yielding 97.28% validity and 98.51% stability on GEOM-DRUG.
- Value-Guidance MeanFlow for Offline Multi-Agent Reinforcement Learning
  VGM²P achieves SOTA-comparable performance in offline MARL via value-guided conditional behavior cloning with MeanFlow, enabling efficient single-step action generation insensitive to regularization coefficients.
- A Unified View of Score-Based and Drifting Models
  Drifting with Gaussian kernels exactly matches score-matching on smoothed distributions via Tweedie's formula, while Laplace kernels approximate this closely in high dimensions.
- What Does Flow Matching Bring To TD Learning?
  Flow matching critics outperform monolithic ones in RL by 2x performance and 5x sample efficiency via test-time error recovery through integration and multi-point velocity supervision that preserves feature plasticity.
- Adversarial Concept Distillation for One-Step Diffusion Personalization
  OPAD enables reliable high-quality personalization of one-step diffusion models via multi-step teacher distillation combined with adversarial alignment losses.
- Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
  Scaling an autoregressive Transformer to 20B parameters for text-to-image generation using image token sequences achieves new SOTA zero-shot FID of 7.23 and fine-tuned FID of 3.22 on MS-COCO.
- Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction
  A multilevel perceptual CRF model using Swin Transformer, HPF fusion, HA adapters, and dynamic scaling attention achieves state-of-the-art monocular depth estimation on NYU Depth v2, KITTI, and MatterPort3D with reduced error and fast inference.
- Mitigating Barren Plateaus in Quantum Denoising Diffusion Probabilistic Model
  Quantum diffusion models develop a distinct barren plateau beyond small qubit counts; an architectural enhancement and conditional formulation restore trainability for Hamiltonian-parameterized ground-state generation.
- Energy Scaling Laws for Diffusion Models: Quantifying Compute in Image Generation
  An adapted scaling law predicts GPU energy consumption for diffusion model inference with R² > 0.9 within architectures and strong cross-architecture generalization.
- BADiff: Bandwidth Adaptive Diffusion Model
  BADiff introduces joint training of diffusion models with quality conditioning derived from bandwidth to enable adaptive early-stop sampling that preserves appropriate perceptual quality.