Denoising diffusion probabilistic models

Jonathan Ho, Ajay Jain, Pieter Abbeel · 2020

Canonical reference. 71% of citing Pith papers cite this work as background.

23 Pith papers citing it

citation-role summary

background: 5 · method: 2

citation-polarity summary

background: 5 · use method: 2

representative citing papers

Generative Modeling by Value-Driven Transport

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

A control-theoretic linear program yields value-driven transport policies for generative modeling with straight paths and simulation-free training.

Bridging Domain Gaps with Target-Aligned Generation for Offline Reinforcement Learning

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

TCE bridges domain gaps in offline RL by selectively using source data or generating target-aligned transitions via a dual score-based model, outperforming baselines in experiments.

SpecEdit: Training-Free Acceleration for Diffusion based Image Editing via Semantic Locking

cs.CV · 2026-05-04 · unverdicted · novelty 7.0

SpecEdit accelerates diffusion-based image editing up to 10x by using a low-resolution draft to identify edit-relevant tokens via semantic discrepancies for selective high-resolution denoising.

LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories

cs.CV · 2026-04-16 · unverdicted · novelty 7.0

LeapAlign fine-tunes flow matching models by constructing two consecutive leaps that skip multiple ODE steps with randomized timesteps and consistency weighting, enabling stable updates at any generation step.

Turbulent pair dispersion with Stochastic Generative Diffusion Models

physics.flu-dyn · 2026-04-14 · unverdicted · novelty 7.0

Diffusion models generate joint pairs of Lagrangian trajectories that reproduce turbulent pair separation statistics, including deviations from Richardson scaling, while preserving single-particle properties.

CARD: Coarse-to-fine Autoregressive Modeling with Radix-based Decomposition for Transferable Free Energy Estimation

cs.LG · 2026-05-04 · unverdicted · novelty 6.0

CARD uses radix decomposition to enable autoregressive modeling of molecular coordinates as a zero-free-energy reference distribution, delivering classical accuracy for absolute free energy on unseen systems at ~40x speedup.

Frozen Forecasting: A Unified Evaluation

cs.CV · 2025-07-18 · unverdicted · novelty 6.0

A new evaluation framework using latent diffusion on frozen vision backbones shows video-pretrained models consistently outperform image-based ones in forecasting entire trajectories across abstraction levels.

BulletGen: Improving 4D Reconstruction with Bullet-Time Generation

cs.GR · 2025-06-23 · unverdicted · novelty 6.0

BulletGen enhances 4D dynamic scene reconstruction from monocular videos by supervising Gaussian optimization with diffusion-generated frames aligned at a bullet-time step, achieving SOTA on novel-view synthesis and tracking.

2ndMatch: Finetuning Pruned Diffusion Models via Second-Order Jacobian Matching

cs.GR · 2025-06-03 · unverdicted · novelty 6.0

2ndMatch finetunes pruned diffusion models via second-order Jacobian matching inspired by Finite-Time Lyapunov Exponents to reduce the quality gap with dense models on image generation tasks.

Latent Wavelet Diffusion For Ultra-High-Resolution Image Synthesis

cs.CV · 2025-05-31 · unverdicted · novelty 6.0

Latent Wavelet Diffusion uses wavelet energy map masking and a scale-consistent VAE to improve detail fidelity in 2K-4K image generation without extra inference overhead.

DreamPolicy: A Unified World-model Policy for Scalable Humanoid Locomotion

cs.RO · 2025-05-24 · unverdicted · novelty 6.0

DreamPolicy integrates an autoregressive diffusion world model with policy learning to produce a single scalable policy that generalizes to unseen composite terrains for humanoid locomotion.

DanceGRPO: Unleashing GRPO on Visual Generation

cs.CV · 2025-05-12 · unverdicted · novelty 6.0

DanceGRPO applies GRPO to visual generation tasks to achieve stable policy optimization across diffusion models, rectified flows, multiple tasks, and diverse reward models, outperforming prior RL methods.

Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation

cs.CV · 2025-05-08 · unverdicted · novelty 6.0

Mogao presents a causal unified model with deep fusion, dual encoders, and interleaved position embeddings that achieves strong performance on multi-modal understanding, text-to-image generation, and coherent interleaved outputs including zero-shot editing.

SkyReels-V2: Infinite-length Film Generative Model

cs.CV · 2025-04-17 · unverdicted · novelty 6.0

SkyReels-V2 produces infinite-length film videos via MLLM-based captioning, progressive pretraining, motion RL, and diffusion forcing with non-decreasing noise schedules.

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

cs.CV · 2025-03-13 · unverdicted · novelty 6.0

HybridVLA unifies diffusion and autoregression in a single VLA model via collaborative training and ensemble to raise robot manipulation success rates by 14% in simulation and 19% in real-world tasks.

GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation

cs.RO · 2024-10-08 · unverdicted · novelty 6.0

GR-2 pre-trains on web-scale videos then fine-tunes on robot data to reach 97.7% average success across over 100 manipulation tasks with strong generalization to new scenes and objects.

Efficient 3D Content Reconstruction and Generation

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

Presents Instant3D for rapid text/image-to-3D generation via multi-view diffusion plus feed-forward reconstruction, and FastMap for 10x faster structure-from-motion with comparable accuracy.

Common Inpainted Objects In-N-Out of Context

cs.CV · 2025-05-31 · unverdicted · novelty 5.0

COinCO is a new dataset of inpainted COCO images with in- and out-of-context objects, enabling context reasoning, object prediction from scenes, and improved fake image detection.

A Probabilistic Formulation of Offset Noise in Diffusion Models

stat.ML · 2024-12-04 · unverdicted · novelty 5.0

A diffusion model variant that adds structured non-zero-mean noise via modified forward/reverse processes, yielding an ELBO loss analogous to offset noise but with time-dependent coefficients, and showing gains on synthetic high-dimensional data.

Step1X-Edit: A Practical Framework for General Image Editing

cs.CV · 2025-04-24 · unverdicted · novelty 4.0

Step1X-Edit integrates a multimodal LLM with a diffusion decoder, trained on a custom high-quality dataset, to deliver image editing performance that surpasses open-source baselines and approaches proprietary models on the new GEdit-Bench.

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

cs.CV · 2025-01-21 · unverdicted · novelty 4.0

Hunyuan3D 2.0 scales flow-based diffusion transformers and texture synthesis models to generate high-resolution textured 3D assets that outperform prior state-of-the-art in geometry, alignment, and texture quality.

Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material

cs.CV · 2025-06-18 · unverdicted · novelty 3.0

Hunyuan3D 2.1 is a two-part system with DiT for shape generation and Paint for texture synthesis that produces high-fidelity 3D assets with PBR materials.

Training-Free Inference for High-Resolution Sinogram Completion

cs.CV · 2025-06-10 · 2 refs

citing papers explorer

Showing 23 of 23 citing papers.

Generative Modeling by Value-Driven Transport
cs.LG · 2026-05-21 · unverdicted · none · ref 21
A control-theoretic linear program yields value-driven transport policies for generative modeling with straight paths and simulation-free training.

Bridging Domain Gaps with Target-Aligned Generation for Offline Reinforcement Learning
cs.LG · 2026-05-13 · unverdicted · none · ref 33
TCE bridges domain gaps in offline RL by selectively using source data or generating target-aligned transitions via a dual score-based model, outperforming baselines in experiments.

SpecEdit: Training-Free Acceleration for Diffusion based Image Editing via Semantic Locking
cs.CV · 2026-05-04 · unverdicted · none · ref 27
SpecEdit accelerates diffusion-based image editing up to 10x by using a low-resolution draft to identify edit-relevant tokens via semantic discrepancies for selective high-resolution denoising.

LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories
cs.CV · 2026-04-16 · unverdicted · none · ref 14
LeapAlign fine-tunes flow matching models by constructing two consecutive leaps that skip multiple ODE steps with randomized timesteps and consistency weighting, enabling stable updates at any generation step.

Turbulent pair dispersion with Stochastic Generative Diffusion Models
physics.flu-dyn · 2026-04-14 · unverdicted · none · ref 31
Diffusion models generate joint pairs of Lagrangian trajectories that reproduce turbulent pair separation statistics, including deviations from Richardson scaling, while preserving single-particle properties.

CARD: Coarse-to-fine Autoregressive Modeling with Radix-based Decomposition for Transferable Free Energy Estimation
cs.LG · 2026-05-04 · unverdicted · none · ref 23
CARD uses radix decomposition to enable autoregressive modeling of molecular coordinates as a zero-free-energy reference distribution, delivering classical accuracy for absolute free energy on unseen systems at ~40x speedup.

Frozen Forecasting: A Unified Evaluation
cs.CV · 2025-07-18 · unverdicted · none · ref 16
A new evaluation framework using latent diffusion on frozen vision backbones shows video-pretrained models consistently outperform image-based ones in forecasting entire trajectories across abstraction levels.

BulletGen: Improving 4D Reconstruction with Bullet-Time Generation
cs.GR · 2025-06-23 · unverdicted · none · ref 21
BulletGen enhances 4D dynamic scene reconstruction from monocular videos by supervising Gaussian optimization with diffusion-generated frames aligned at a bullet-time step, achieving SOTA on novel-view synthesis and tracking.

2ndMatch: Finetuning Pruned Diffusion Models via Second-Order Jacobian Matching
cs.GR · 2025-06-03 · unverdicted · none · ref 17
2ndMatch finetunes pruned diffusion models via second-order Jacobian matching inspired by Finite-Time Lyapunov Exponents to reduce the quality gap with dense models on image generation tasks.

Latent Wavelet Diffusion For Ultra-High-Resolution Image Synthesis
cs.CV · 2025-05-31 · unverdicted · none · ref 22
Latent Wavelet Diffusion uses wavelet energy map masking and a scale-consistent VAE to improve detail fidelity in 2K-4K image generation without extra inference overhead.

DreamPolicy: A Unified World-model Policy for Scalable Humanoid Locomotion
cs.RO · 2025-05-24 · unverdicted · none · ref 49
DreamPolicy integrates an autoregressive diffusion world model with policy learning to produce a single scalable policy that generalizes to unseen composite terrains for humanoid locomotion.

DanceGRPO: Unleashing GRPO on Visual Generation
cs.CV · 2025-05-12 · unverdicted · none · ref 1
DanceGRPO applies GRPO to visual generation tasks to achieve stable policy optimization across diffusion models, rectified flows, multiple tasks, and diverse reward models, outperforming prior RL methods.

Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation
cs.CV · 2025-05-08 · unverdicted · none · ref 28
Mogao presents a causal unified model with deep fusion, dual encoders, and interleaved position embeddings that achieves strong performance on multi-modal understanding, text-to-image generation, and coherent interleaved outputs including zero-shot editing.

SkyReels-V2: Infinite-length Film Generative Model
cs.CV · 2025-04-17 · unverdicted · none · ref 33
SkyReels-V2 produces infinite-length film videos via MLLM-based captioning, progressive pretraining, motion RL, and diffusion forcing with non-decreasing noise schedules.

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
cs.CV · 2025-03-13 · unverdicted · none · ref 17
HybridVLA unifies diffusion and autoregression in a single VLA model via collaborative training and ensemble to raise robot manipulation success rates by 14% in simulation and 19% in real-world tasks.

GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation
cs.RO · 2024-10-08 · unverdicted · none · ref 22
GR-2 pre-trains on web-scale videos then fine-tunes on robot data to reach 97.7% average success across over 100 manipulation tasks with strong generalization to new scenes and objects.

Efficient 3D Content Reconstruction and Generation
cs.CV · 2026-05-18 · unverdicted · none · ref 95
Presents Instant3D for rapid text/image-to-3D generation via multi-view diffusion plus feed-forward reconstruction, and FastMap for 10x faster structure-from-motion with comparable accuracy.

Common Inpainted Objects In-N-Out of Context
cs.CV · 2025-05-31 · unverdicted · none · ref 30
COinCO is a new dataset of inpainted COCO images with in- and out-of-context objects, enabling context reasoning, object prediction from scenes, and improved fake image detection.

A Probabilistic Formulation of Offset Noise in Diffusion Models
stat.ML · 2024-12-04 · unverdicted · none · ref 11
A diffusion model variant that adds structured non-zero-mean noise via modified forward/reverse processes, yielding an ELBO loss analogous to offset noise but with time-dependent coefficients, and showing gains on synthetic high-dimensional data.

Step1X-Edit: A Practical Framework for General Image Editing
cs.CV · 2025-04-24 · unverdicted · none · ref 19
Step1X-Edit integrates a multimodal LLM with a diffusion decoder, trained on a custom high-quality dataset, to deliver image editing performance that surpasses open-source baselines and approaches proprietary models on the new GEdit-Bench.

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
cs.CV · 2025-01-21 · unverdicted · none · ref 33
Hunyuan3D 2.0 scales flow-based diffusion transformers and texture synthesis models to generate high-resolution textured 3D assets that outperform prior state-of-the-art in geometry, alignment, and texture quality.

Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material
cs.CV · 2025-06-18 · unverdicted · none · ref 1
Hunyuan3D 2.1 is a two-part system with DiT for shape generation and Paint for texture synthesis that produces high-fidelity 3D assets with PBR materials.

Training-Free Inference for High-Resolution Sinogram Completion
cs.CV · 2025-06-10 · unreviewed · ref 7 · 2 links
