Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.
hub Canonical reference
Dpm-solver: A fast ode solver for diffusion probabilistic model sampling in around 10 step s
Canonical reference. 80% of citing Pith papers cite this work as background.
hub tools
citation-role summary
citation-polarity summary
representative citing papers
Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.
CF-VLA uses a coarse initialization over endpoint velocity followed by single-step refinement to achieve strong performance with low inference steps on CALVIN, LIBERO, and real-robot tasks.
Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a directional derivative penalty.
Early and late denoising steps in masked diffusion LMs are robust to smaller-model replacement, enabling 17% FLOPs reduction with modest generative quality loss.
Diff-ANO uses conditional consistency models and adjoint neural operator surrogates to enable fast, high-quality USCT reconstructions under sparse and partial views by replacing slow PDE solvers and enabling few-step sampling.
D-Garment is a template-specific latent diffusion model that generates dynamic 3D garment deformations conditioned on body shape, motion, and physical material properties using training data from a physics-based simulator.
Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.
Diffusion models solve noisy (non)linear inverse problems via approximated posterior sampling that blends diffusion steps with manifold gradients without strict consistency projection.
Diffusion-QL uses conditional diffusion models as expressive policies in offline RL by coupling behavior cloning with Q-value maximization, achieving SOTA on most D4RL tasks.
REPA-P aligns intermediate representations in diffusion models with physical states using first-principles PDE residuals to accelerate convergence and boost out-of-distribution robustness on PDE tasks.
MetaSR adaptively orchestrates metadata in a DiT-based generative SR model to deliver up to 1 dB PSNR gains and 50% bitrate savings across diverse content and degradations.
DOLLAR combines variational score and consistency distillation for few-step video generation plus latent reward optimization, reporting 82.57 VBench score and up to 278x speedup over the teacher diffusion model for 128-frame 10-second videos.
Moment-matched GMM kernels in DDIM yield lower FID and higher IS than Gaussian kernels at small sampling steps on CelebA-HQ, FFHQ, ImageNet, and Stable Diffusion tasks.
Lens is a 3.8B-parameter text-to-image model that reaches competitive or superior performance to >6B-parameter systems using 19.3% of the training compute of Z-Image through a densely captioned 800M dataset, multi-resolution batching, semantic VAE, strong language encoder, RL fine-tuning, and 4-step
A consistency-regularized Euclidean-Wasserstein-2 gradient flow performs joint posterior sampling and prompt optimization in latent space for efficient low-NFE inverse problem solving with diffusion models.
AdaCorrection adaptively corrects offset caches in DiT inference via on-the-fly spatio-temporal validity checks to maintain near-original FID with moderate acceleration.
Cross-fluctuations detect discrete phase transitions during diffusion sampling trajectories, yielding efficiency gains in generation and zero-shot tasks.
The paper proposes CDCD, a continuous-time and continuous-space diffusion framework for categorical data, and reports results on language modeling tasks.
One-step pixel-MeanFlow models recover key galaxy morphology statistics at orders-of-magnitude lower computational cost than standard DDPM sampling while remaining weaker on fine-grained structure.
citing papers explorer
-
Consistency Models
Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.
-
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.
-
CF-VLA: Efficient Coarse-to-Fine Action Generation for Vision-Language-Action Policies
CF-VLA uses a coarse initialization over endpoint velocity followed by single-step refinement to achieve strong performance with low inference steps on CALVIN, LIBERO, and real-robot tasks.
-
$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models
Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a directional derivative penalty.
-
Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models
Early and late denoising steps in masked diffusion LMs are robust to smaller-model replacement, enabling 17% FLOPs reduction with modest generative quality loss.
-
Diff-ANO: Towards Fast High-Resolution Ultrasound Computed Tomography via Conditional Consistency Models and Adjoint Neural Operators
Diff-ANO uses conditional consistency models and adjoint neural operator surrogates to enable fast, high-quality USCT reconstructions under sparse and partial views by replacing slow PDE solvers and enabling few-step sampling.
-
D-Garment: Physically Grounded Latent Diffusion for Dynamic Garment Deformations
D-Garment is a template-specific latent diffusion model that generates dynamic 3D garment deformations conditioned on body shape, motion, and physical material properties using training data from a physics-based simulator.
-
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.
-
Diffusion Posterior Sampling for General Noisy Inverse Problems
Diffusion models solve noisy (non)linear inverse problems via approximated posterior sampling that blends diffusion steps with manifold gradients without strict consistency projection.
-
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning
Diffusion-QL uses conditional diffusion models as expressive policies in offline RL by coupling behavior cloning with Q-value maximization, achieving SOTA on most D4RL tasks.
-
Learning to Think in Physics: Breaking Shortcut Learning in Scientific Diffusion via Representation Alignment
REPA-P aligns intermediate representations in diffusion models with physical states using first-principles PDE residuals to accelerate convergence and boost out-of-distribution robustness on PDE tasks.
-
MetaSR: Content-Adaptive Metadata Orchestration for Generative Super-Resolution
MetaSR adaptively orchestrates metadata in a DiT-based generative SR model to deliver up to 1 dB PSNR gains and 50% bitrate savings across diverse content and degradations.
-
DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization
DOLLAR combines variational score and consistency distillation for few-step video generation plus latent reward optimization, reporting 82.57 VBench score and up to 278x speedup over the teacher diffusion model for 128-frame 10-second videos.
-
Improved DDIM Sampling with Moment Matching Gaussian Mixtures
Moment-matched GMM kernels in DDIM yield lower FID and higher IS than Gaussian kernels at small sampling steps on CelebA-HQ, FFHQ, ImageNet, and Stable Diffusion tasks.
-
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models
Lens is a 3.8B-parameter text-to-image model that reaches competitive or superior performance to >6B-parameter systems using 19.3% of the training compute of Z-Image through a densely captioned 800M dataset, multi-resolution batching, semantic VAE, strong language encoder, RL fine-tuning, and 4-step
-
Consistency Regularised Gradient Flows for Inverse Problems
A consistency-regularized Euclidean-Wasserstein-2 gradient flow performs joint posterior sampling and prompt optimization in latent space for efficient low-NFE inverse problem solving with diffusion models.
-
AdaCorrection: Adaptive Offset Cache Correction for Accurate Diffusion Transformers
AdaCorrection adaptively corrects offset caches in DiT inference via on-the-fly spatio-temporal validity checks to maintain near-original FID with moderate acceleration.
-
Cross-fluctuation phase transitions reveal sampling dynamics in diffusion models
Cross-fluctuations detect discrete phase transitions during diffusion sampling trajectories, yielding efficiency gains in generation and zero-shot tasks.
-
Continuous diffusion for categorical data
The paper proposes CDCD, a continuous-time and continuous-space diffusion framework for categorical data, and reports results on language modeling tasks.
-
Accelerating Redshift-Conditioned Galaxy Image Synthesis with One-step Generative Modeling
One-step pixel-MeanFlow models recover key galaxy morphology statistics at orders-of-magnitude lower computational cost than standard DDPM sampling while remaining weaker on fine-grained structure.