Equivalence between Gaussian processes and linear diffusion models enables general conditioning on arbitrary pointwise likelihoods via ODE dynamics and Monte Carlo guidance approximation.
Canonical reference
Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851
Canonical reference. 80% of citing Pith papers cite this work as background.
representative citing papers
AGMs use a lightweight learned potential V_phi with stop-gradient to selectively weight informative bridge samples in generative model training, yielding better fidelity and coverage.
GG-PA composes diffusion priors with physical context via a derived Gibbs sampler that is asymptotically exact as diffusion time approaches zero and exact at finite times for quadratic interactions.
A multi-scale extension of the Fisher information metric, derived from coarse-graining contraction rules, exactly captures the structure of mutual information in neural population codes and can be estimated via diffusion models.
HEDGE generates hypergraphs via a linear-Gaussian forward diffusion on incidence matrices with a hypergraph-specific heat operator, then learns a permutation-equivariant reverse drift to sample from the Gaussian base.
OT-MPC computes an optimal coupling between candidate control sequences and low-cost proposals via entropy-regularized optimal transport and the Sinkhorn algorithm to improve sampling-based MPC performance.
A hierarchical variational formulation amortizes test-time guidance in diffusion models to achieve strong quality-speed tradeoffs with significantly reduced inference compute.
AVIS applies autoregressive diffusion models to video inverse problems by streaming restoration with measurement-consistent initialization, reducing latency from 114s to 4s and raising throughput to 1.18 FPS (or 5.91 FPS in the Flash variant).
DSL provides a continuous embedding framework where one denoiser supports a family of SNR paths for discrete sequences, improving MAUVE scores on OpenWebText and allowing random-order and hybrid sampling from a fine-tuned MDLM checkpoint.
PGID restores watermark detection in diffusion models by using progressive inversion-denoising cycles to correct latents displaced by removal or forgery attacks.
LENSEs improves representation-conditioned molecule generation by jointly training a multi-level representation head, perceptual loss, and REPA alignment on pretrained encoders, yielding 97.28% validity and 98.51% stability on GEOM-DRUG.
CoreFlow is a low-rank matrix generative model that trains normalizing flows on shared subspaces to improve efficiency and quality for high-dimensional limited-sample data, including incomplete matrices.
VGM²P achieves SOTA-comparable performance in offline MARL via value-guided conditional behavior cloning with MeanFlow, enabling efficient single-step action generation insensitive to regularization coefficients.
Drifting with Gaussian kernels exactly matches score-matching on smoothed distributions via Tweedie's formula, while Laplace kernels approximate this closely in high dimensions.
Flow matching critics outperform monolithic critics in RL, with 2x higher performance and 5x better sample efficiency, via test-time error recovery through integration and multi-point velocity supervision that preserves feature plasticity.
OPAD enables reliable high-quality personalization of one-step diffusion models via multi-step teacher distillation combined with adversarial alignment losses.
Scaling an autoregressive Transformer to 20B parameters for text-to-image generation using image token sequences achieves new SOTA zero-shot FID of 7.23 and fine-tuned FID of 3.22 on MS-COCO.
Aligning the DDIM forward diffusion process with flow-matching manifold evolution enables high-quality generation without time conditioning, and class-conditional synthesis is possible with an unconditional denoiser by using separate time spaces per class.
A multilevel perceptual CRF model using Swin Transformer, HPF fusion, HA adapters, and dynamic scaling attention achieves state-of-the-art monocular depth estimation on NYU Depth v2, KITTI, and MatterPort3D with reduced error and fast inference.
Quantum diffusion models develop a distinct barren plateau beyond small qubit counts; an architectural enhancement and conditional formulation restore trainability for Hamiltonian-parameterized ground-state generation.
An adapted scaling law predicts GPU energy consumption for diffusion model inference with R² > 0.9 within architectures and strong cross-architecture generalization.
BADiff introduces joint training of diffusion models with quality conditioning derived from bandwidth to enable adaptive early-stop sampling that preserves appropriate perceptual quality.
citing papers explorer
- Toward Better Geometric Representations for Molecule Generative Models
LENSEs improves representation-conditioned molecule generation by jointly training a multi-level representation head, perceptual loss, and REPA alignment on pretrained encoders, yielding 97.28% validity and 98.51% stability on GEOM-DRUG.