Denoising Diffusion Probabilistic Models
Mixed citation behavior. Most common role is background (55%).
abstract
We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our models naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding. On the unconditional CIFAR10 dataset, we obtain an Inception score of 9.46 and a state-of-the-art FID score of 3.17. On 256x256 LSUN, we obtain sample quality similar to ProgressiveGAN. Our implementation is available at https://github.com/hojonathanho/diffusion
representative citing papers
Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.
DDIMs construct non-Markovian diffusion processes that share DDPM training objectives but allow much faster reverse sampling, demonstrated empirically at 10-50x wall-clock speedup.
DiffWave is a non-autoregressive diffusion model that generates high-fidelity audio waveforms from noise in constant steps, matching WaveNet vocoder quality while being orders of magnitude faster and outperforming prior models in unconditional generation.
citing papers explorer
- Generating quantum ensembles via reverse-time quantum diffusions
The paper establishes a reverse-time quantum diffusion framework that generates complex quantum ensembles from simple distributions by deriving and learning a feedback Hamiltonian from forward trajectory data.
- Generative models on phase space
Generative diffusion and flow models are constructed to remain exactly on the Lorentz-invariant massless N-particle phase space manifold during sampling for particle physics applications.
- Mean-Field Path-Integral Diffusion: From Samples to Interacting Agents
MF-PID turns independent diffusion samples into mean-field interacting agents, proving that quadratic interactions yield exact linear mean interpolation and delivering 19-24% energy savings in demand-response control.
- Diffusion-warm sampling of the XY model enables fast thermalization at scale
A temperature-conditioned diffusion model trained on small XY lattices produces accurate larger-lattice samples and cuts MCMC thermalization time by roughly 10x.
- Lie Group Diffusion Models for Hardware-Aware Quantum Circuit Synthesis
Lie group diffusion models combine a discrete circuit skeleton selector with continuous diffusion on SU(2) ≃ S³ to synthesize hardware-aware quantum circuits, outperforming baselines on three-qubit Hamiltonian simulation targets.
- Sampling the Schwinger Model with Gauge-Equivariant Diffusion
A gauge-equivariant diffusion model samples Schwinger model configurations, yielding unbiased observables matching MCMC and qualitatively less topological freezing than HMC.
- Constrained Diffusion Models with Primal-Dual Inference
Develops primal-dual inference (PDI) that jointly infers optimal primal distributions and dual multipliers during diffusion sampling using a dual-conditioned score network.
- Improving Robotic Generalist Policies via Flow Reversal Steering
Flow Reversal Steering steers flow matching generalist policies by reversing suboptimal actions to nearby better modes, enabling improved zero-shot control, quick distillation, and RL bootstrapping in robotic manipulation.
- Ambient Diffusion Policy: Imitation Learning from Suboptimal Data in Robotics
Ambient Diffusion Policy enables better imitation learning from suboptimal robot data by leveraging spectral properties to restrict data usage to specific diffusion times.
- Spectrally Regularized Latent Flow Matching for Turbulence Generation
Spectrally regularized compression in latent flow matching raises retained deep-dissipation spectral power from 20% to 79% in generated turbulence on a 256^2 DNS dataset at Re_f ≈ 2250.
- The Power of Test-Time Training for Approximate Sampling
Establishes a quadratic lower bound on query complexity for sampling from large classes of distributions given approximate density oracles, answers an open question on optimality of random walks, and shows circumvention for bounded classes as an abstraction of TTT.
- Learning Where to Simulate: Generative Active Sampling for Online PDE Surrogate Training
OGAS uses a parallel diffusion model to bias PDE configuration sampling toward high surrogate difficulty, reducing 99th-percentile errors and error variance versus uniform sampling across tested 2D PDEs.
- Continuous Language Diffusion as a Decoder-Interface Problem
Continuous language diffusion works by entering high-margin decoder basins where frozen T5 embeddings recover 93-96% of native decisions and linear readouts reach 97.9% agreement, implying models should be evaluated as representation-decoder systems.
- Where the Score Lives: A Wavelet View of Diffusion
Derives optimal score functions for diffusion models as wavelet expansions in terms of data moments, enabling architecture-agnostic analysis of which distribution attributes matter for denoising.
- FRUC: Feedforward Dynamic Scene Reconstruction from Uncalibrated Collaborative Driving Views
FRUC enables one-shot calibration-free dynamic scene reconstruction from collaborative driving views via a geometric Transformer, ego-centric occlusion priors, and zero-initialized residual denoising, claiming SOTA quality and speed on V2XReal and UrbanIng-V2X.
- DGLD: Domain-Gated Latent Diffusion for the Discovery of Novel Energetic Materials
DGLD applies domain-gated latent diffusion with label-quality gating and multi-task guidance to discover 12 novel energetic material leads validated by DFT, outperforming SMILES-LSTM, SELFIES-GA, and REINVENT baselines in novelty and on-target performance.
- HumanFlow -- Diffusion-Driven MAV Navigation Among Humans via Tightly-Coupled Motion Tracking, Forecasting, and Control
HumanFlow is a latent diffusion model for unified human motion tracking and forecasting in 3D scenes, tightly coupled via flow-matching MPC to an approximate policy for MAV social navigation that outperforms prior methods in simulation under partial observability.
- Rao-Blackwellized Score Matching on Manifolds
Derives that the Rao-Blackwellized DSM target on manifolds equals the intrinsic Riemannian score plus an explicit order-σ² correction decomposing into an intrinsic Tweedie term and an extrinsic curvature term involving Weingarten and Ricci operators.
- FlowErase-RL: Rethinking Concept Erasure as Reward Optimization in Flow Matching Models
FlowErase-RL applies GRPO to reformulate concept erasure in flow matching models as reward optimization using a dynamic dual-path mechanism for target suppression and non-target preservation.
- Nested-GPT for variable-multiplicity parton showers: A case study in the resummation of non-global logarithms
Nested-GPT is an autoregressive Transformer surrogate that generates variable-multiplicity parton showers while enforcing ordered Markovian branching and matches reference Monte Carlo results for leading-log non-global logarithm resummation in the large-Nc limit.
- Constrained Code Generation with Discrete Diffusion
Constrained Diffusion for Code (CDC) integrates constraint satisfaction into the reverse denoising process of discrete diffusion models via constraint-aware operators that use optimization and program analysis to steer generation toward feasible programs.
- Seeking the Unfamiliar but Memorable: Conceptual Creativity as Meta-Learning
Creativity is defined as meta-learning where a frozen diffusion creator optimizes candidates for rapid improvement by an adapting appraiser such as an autoencoder or CLIP adapter.
- DSSP: Diffusion State Space Policy with Full-History Encoding
DSSP is a history-conditioned diffusion state space policy that uses SSMs to encode full observation streams with an auxiliary dynamics objective and hierarchical fusion, achieving SOTA results with reduced model size in robot manipulation.
- TRACE: Transport Alignment Conformal Prediction via Diffusion and Flow Matching Models
TRACE creates valid conformal prediction sets for complex generative models by scoring outputs via averaged denoising or velocity errors along stochastic transport paths instead of likelihoods.
- Deep Dreams Are Made of This: Visualizing Monosemantic Features in Diffusion Models
LVO applies optimization-based feature visualization to latent diffusion models after disentangling their representations with sparse autoencoders, yielding recognizable concept images on a fine-tuned Stable Diffusion model that are clearer than those from entangled baselines.
- Tempered Guided Diffusion
Tempered Guided Diffusion uses annealed SMC to produce consistent particle approximations to the posterior for training-free conditional diffusion sampling, outperforming independent guided trajectories in experiments.
- Action Agent: Agentic Video Generation Meets Flow-Constrained Diffusion
Action Agent pairs LLM-driven video generation with a flow-constrained diffusion transformer to produce velocity commands, raising video success to 86% and delivering 64.7% real-world navigation on a Unitree G1 humanoid.
- Generative diffusion models for spatiotemporal influenza forecasting
Influpaint uses generative diffusion models on image-encoded influenza data to produce realistic and diverse epidemic trajectories that match leading ensemble methods in accuracy.
- Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization
Oracle Noise optimizes diffusion model noise on a Riemannian hypersphere guided by key prompt words to preserve the Gaussian prior, eliminate norm inflation, and achieve faster semantic alignment than Euclidean methods.
- $Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models
Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a directional derivative penalty.
- Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading
Privatar uses horizontal frequency partitioning and distribution-aware minimal perturbation to enable private offloading of VR avatar reconstruction, supporting 2.37x more users with modest overhead.
- Conflated Inverse Modeling to Generate Diverse and Temperature-Change Inducing Urban Vegetation Patterns
A diffusion generative inverse model conditioned on temperature targets produces diverse, physically plausible urban vegetation patterns that achieve specified regional temperature shifts.
- Causal Diffusion Models for Counterfactual Outcome Distributions in Longitudinal Data
Causal Diffusion Model is the first diffusion-based method to produce full probabilistic counterfactual outcome distributions for sequential interventions in longitudinal data, showing 15-30% better distributional accuracy than prior methods on a tumor-growth simulator.
- ExpertEdit: Learning Skill-Aware Motion Editing from Expert Videos
ExpertEdit edits novice motions to expert skill levels by learning a motion prior from unpaired videos and infilling masked skill-critical spans.
- MoZoo: Unleashing Video Diffusion power in animal fur and muscle simulation
MoZoo generates high-fidelity animal videos with fur and muscle dynamics from coarse meshes by extending video diffusion with role-aware RoPE and asymmetric decoupled attention, trained on a new synthetic-to-real dataset.
- Anchored Cyclic Generation: A Novel Paradigm for Long-Sequence Symbolic Music Generation
Anchored Cyclic Generation uses anchor features from known music to mitigate error accumulation in autoregressive models, with the Hi-ACG framework delivering better long-sequence symbolic music and music completion performance.
- Unlocking Prompt Infilling Capability for Diffusion Language Models
Full-sequence masking in SFT unlocks prompt infilling for masked diffusion language models, producing templates that match or surpass hand-designed ones and transfer across models.
- GVCC: Zero-Shot Video Compression via Codebook-Driven Stochastic Rectified Flow
GVCC achieves the lowest LPIPS on UVG at bitrates down to 0.003 bpp by encoding stochastic innovations in a marginal-preserving stochastic process derived from a pretrained rectified-flow video model, with 65% LPIPS reduction over DCVC-RT.
- Unifying Contrastive and Generative Objectives for Visual Understanding and Text-to-Image Generation
DREAM introduces Masking Warmup and Semantically Aligned Decoding to let a single encoder handle both contrastive alignment and masked generation, yielding gains over CLIP and FLUID on understanding and generation benchmarks.
- The Diffusion-Attention Connection
Attention, diffusion maps, and magnetic Laplacians are different regimes of a single Markov geometry from pre-softmax query-scores, linked by a QK bidivergence and Schrödinger bridges into equilibrium, nonequilibrium, and driven dynamics.
- Latent Generative Solvers for Generalizable Long-Term Physics Simulation
LGS pretrained on 2.5M trajectories across 16 systems matches deterministic baselines at one step and halves 20-step error while using far less compute and adapting to held-out higher-resolution flows.
- Contour Refinement using Discrete Diffusion in Low Data Regime
A CNN-based discrete diffusion method refines sparse contours from segmentation masks using simplified denoising steps and minimal post-processing, outperforming baselines on small medical and environmental datasets while running 3.5 times faster.
- DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
DisCa replaces heuristic feature caching with a lightweight learnable neural predictor compatible with distillation, achieving 11.8× acceleration on video diffusion transformers with preserved generation quality.
- Mitigating Long-Tail Bias via Prompt-Controlled Diffusion Augmentation
A prompt-controlled diffusion framework generates class-ratio-targeted synthetic layouts and domain-consistent images that, when mixed with real data, improve segmentation accuracy on long-tailed remote-sensing datasets especially under domain shift.
- Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models
Early and late denoising steps in masked diffusion LMs are robust to smaller-model replacement, enabling 17% FLOPs reduction with modest generative quality loss.
- Differentiable Surrogate for Detector Simulation and Design with Diffusion Models
A LoRA-adapted conditional diffusion surrogate for electromagnetic calorimeter showers matches key observables within 2% RMSE and reproduces directional trends in design-utility gradients.
- HandsOnWorld: Unconstrained Egocentric Video Generation with Camera-Disentangled Hand Control
HandsOnWorld creates a hand-controlled egocentric video generator from unconstrained monocular video via a new EgoVid-Pro dataset from monocular reconstruction and a Plücker Hand Map that disentangles camera and hand motion.
- Chronos: A Physics-Informed Full-History Framework for Non-Markovian Long-Horizon Manipulation
Chronos elevates full observation history to the policy's latent state via selective SSM tokens and a Schrödinger-inspired acceleration bridge, achieving large gains on memory-dependent robot tasks with fewer parameters.
- Few-Step Boltzmann Generators via Scalable Likelihood Flow Maps
SCALLOP replaces Hutchinson's trace estimator with a scalable, vectorized likelihood distillation objective for F2D2 flow maps, cutting training variance and time while improving performance on molecular Boltzmann generators and image data.
- Support-Constrained RL Enables Real-World Policy Improvement without Real-World Experience
SCORE constrains sim RL to the support of a real-data policy via flow steering, raising average success on eight dexterous tasks from 37.8% to 89.9%.