The paper establishes a reverse-time quantum diffusion framework that generates complex quantum ensembles from simple distributions by deriving and learning a feedback Hamiltonian from forward trajectory data.
Denoising Diffusion Probabilistic Models
Mixed citation behavior. Most common role is background (55%).
abstract
We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our models naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding. On the unconditional CIFAR10 dataset, we obtain an Inception score of 9.46 and a state-of-the-art FID score of 3.17. On 256x256 LSUN, we obtain sample quality similar to ProgressiveGAN. Our implementation is available at https://github.com/hojonathanho/diffusion
representative citing papers
Generative diffusion and flow models are constructed to remain exactly on the Lorentz-invariant massless N-particle phase space manifold during sampling for particle physics applications.
MF-PID turns independent diffusion samples into mean-field interacting agents, proving that quadratic interactions yield exact linear mean interpolation and delivering 19-24% energy savings in demand-response control.
Promptbreeder evolves both task prompts and the mutation prompts that improve them using LLMs, outperforming Chain-of-Thought and Plan-and-Solve on arithmetic and commonsense reasoning benchmarks.
DDIMs construct non-Markovian diffusion processes that share the DDPM training objective but allow much faster reverse sampling, with an empirically demonstrated 10-50x wall-clock speedup.
DiffWave is a non-autoregressive diffusion model that generates high-fidelity audio waveforms from noise in constant steps, matching WaveNet vocoder quality while being orders of magnitude faster and outperforming prior models in unconditional generation.
A temperature-conditioned diffusion model trained on small XY lattices produces accurate larger-lattice samples and cuts MCMC thermalization time by roughly 10x.
Lie group diffusion models combine a discrete circuit skeleton selector with continuous diffusion on SU(2) ≃ S³ to synthesize hardware-aware quantum circuits, outperforming baselines on three-qubit Hamiltonian simulation targets.
A gauge-equivariant diffusion model samples Schwinger model configurations, yielding unbiased observables matching MCMC and qualitatively less topological freezing than HMC.
Develops primal-dual inference (PDI) that jointly infers optimal primal distributions and dual multipliers during diffusion sampling using a dual-conditioned score network.
Flow Reversal Steering steers flow matching generalist policies by reversing suboptimal actions to nearby better modes, enabling improved zero-shot control, quick distillation, and RL bootstrapping in robotic manipulation.
Ambient Diffusion Policy enables better imitation learning from suboptimal robot data by leveraging spectral properties to restrict data usage to specific diffusion times.
Spectrally regularized compression in latent flow matching raises retained deep-dissipation spectral power from 20% to 79% in generated turbulence on a 256^2 DNS dataset at Re_f ≈ 2250.
Establishes a quadratic lower bound on query complexity for sampling from large classes of distributions given approximate density oracles, answers an open question on optimality of random walks, and shows circumvention for bounded classes as an abstraction of TTT.
OGAS uses a parallel diffusion model to bias PDE configuration sampling toward high surrogate difficulty, reducing 99th-percentile errors and error variance versus uniform sampling across tested 2D PDEs.
Continuous language diffusion works by entering high-margin decoder basins where frozen T5 embeddings recover 93-96% of native decisions and linear readouts reach 97.9% agreement, implying models should be evaluated as representation-decoder systems.
Derives optimal score functions for diffusion models as wavelet expansions in terms of data moments, enabling architecture-agnostic analysis of which distribution attributes matter for denoising.
FRUC enables one-shot calibration-free dynamic scene reconstruction from collaborative driving views via a geometric Transformer, ego-centric occlusion priors, and zero-initialized residual denoising, claiming SOTA quality and speed on V2XReal and UrbanIng-V2X.
DGLD applies domain-gated latent diffusion with label-quality gating and multi-task guidance to discover 12 novel energetic material leads validated by DFT, outperforming SMILES-LSTM, SELFIES-GA, and REINVENT baselines in novelty and on-target performance.
HumanFlow is a latent diffusion model for unified human motion tracking and forecasting in 3D scenes, tightly coupled via flow-matching MPC to an approximate policy for MAV social navigation that outperforms prior methods in simulation under partial observability.
Derives that the Rao-Blackwellized DSM target on manifolds equals the intrinsic Riemannian score plus an explicit order-σ² correction decomposing into an intrinsic Tweedie term and an extrinsic curvature term involving Weingarten and Ricci operators.
FlowErase-RL applies GRPO to reformulate concept erasure in flow matching models as reward optimization using a dynamic dual-path mechanism for target suppression and non-target preservation.
Nested-GPT is an autoregressive Transformer surrogate that generates variable-multiplicity parton showers while enforcing ordered Markovian branching and matches reference Monte Carlo results for leading-log non-global logarithm resummation in the large-Nc limit.
Constrained Diffusion for Code (CDC) integrates constraint satisfaction into the reverse denoising process of discrete diffusion models via constraint-aware operators that use optimization and program analysis to steer generation toward feasible programs.
citing papers explorer
- LTX-Video: Realtime Video Latent Diffusion
  LTX-Video integrates Video-VAE and transformer for 1:192 latent compression and real-time video diffusion by moving patchifying to the VAE and letting the decoder finish denoising in pixel space.
- Gravitational-Wave Parameter Estimation in non-Gaussian noise using Score-Based Likelihood Characterization
  Score-based diffusion models learn the empirical distribution of real LIGO noise to enable unbiased gravitational-wave parameter estimation under only an additivity assumption.
- 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations
  3D Diffuser Actor unifies diffusion policies with 3D scene features to set new state-of-the-art results on RLBench and CALVIN robot benchmarks.
- KFC-W: Generating 3D-Consistent Videos from Unposed Internet Photos
  KFC-W is a self-supervised 3D-aware video model trained on videos and multiview internet photos that produces geometrically consistent interpolations between unposed input images without any 3D annotations.
- MSG Score: Automated Video Verification for Reliable Multi-Scene Generation
  Proposes MSG score as core of CGS framework plus IID distillation for automated, fast verification of long-form text-to-video outputs.
- TextBoost: Boosting Text Encoder for Personalized Text-to-Image Generation
  TextBoost is a one-shot personalization technique that selectively fine-tunes the text encoder of diffusion models using causality-preserving adaptation and lightweight adapters to reduce parameters and storage.