super hub Mixed citations

Denoising Diffusion Implicit Models

Chenlin Meng, Jiaming Song · 2020 · cs.LG · arXiv 2010.02502

Mixed citation behavior. Most common role is background (67%).

446 Pith papers citing it

Background 67% of classified citations

open full Pith review browse 446 citing papers more from Chenlin Meng arXiv PDF

abstract

Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process. We construct a class of non-Markovian diffusion processes that lead to the same training objective, but whose reverse process can be much faster to sample from. We empirically demonstrate that DDIMs can produce high quality samples $10 \times$ to $50 \times$ faster in terms of wall-clock time compared to DDPMs, allow us to trade off computation for sample quality, and can perform semantically meaningful image interpolation directly in the latent space.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 58 method 23 baseline 2

citation-polarity summary

background 56 use method 23 baseline 2 support 1 unclear 1

claims ledger

abstract Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process. We construct a class of non-Markovian diffusion processes that lead to the same training objective, but whose revers

authors

and Stefano Ermon Chenlin Meng Jiaming Song

co-cited works

representative citing papers

ActivityForensics: A Comprehensive Benchmark for Localizing Manipulated Activity in Videos

cs.CV · 2026-04-04 · unverdicted · novelty 8.0

ActivityForensics is the first large-scale benchmark for temporally localizing activity-level forgeries in videos, paired with a diffusion-based baseline called TADiff.

Flow-GRPO: Training Flow Matching Models via Online RL

cs.CV · 2025-05-08 · unverdicted · novelty 8.0

Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.

Consistency Models

cs.LG · 2023-03-02 · conditional · novelty 8.0

Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.

Splatshot: 3D Face Avatar Generation from a Single Unconstrained Photo

cs.CV · 2026-05-31 · unverdicted · novelty 7.0

SplatShot is a training-free method that inserts per-step 3DGS refitting and photometric feedback into diffusion denoising to enforce multi-view consistency for single-photo 3D face avatars.

Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation

cs.CV · 2026-05-31 · unverdicted · novelty 7.0

DRDD decouples diffusion into independent noise and residual stages to preserve domain harmonization and enable unified data-efficient I2I translation.

Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance

cs.RO · 2026-05-28 · unverdicted · novelty 7.0

CGPO integrates training-free critic guidance into diffusion denoising to produce high-Q actions as regression targets, yielding SOTA results on MuJoCo locomotion and successful Franka arm grasping.

Midpoint Generative Models

cs.LG · 2026-05-28 · unverdicted · novelty 7.0

Midpoint Generative Models define a midpoint divergence from flow matching symmetry and derive its variational form as a tractable objective for training competitive one-step generators.

Point Tracking Improves World Action Models

cs.RO · 2026-05-22 · unverdicted · novelty 7.0

JOPAT jointly models pixels, point tracks, and actions in a diffusion transformer and reports gains over pixel-only baselines on long-horizon robot tasks with occlusion and off-screen motion.

DFSAttn: Dynamic Fine-grained Sparse Attention for Efficient Video Generation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

DFSAttn is a training-free framework for dynamic fine-grained sparse attention in video DiTs that achieves up to 2.1x speedup while preserving generation quality via Hilbert reordering, hierarchical scoring, and adaptive caching.

VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

VDE accelerates rectified flow models like Flux by 3.22x with LPIPS of 0.069 via velocity decomposition into parallel/orthogonal components plus periodic full-pass anchoring.

Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

Linear-DPO replaces sigmoid utility with linear utility and adds EMA reference to improve preference alignment in diffusion and flow-matching text-to-image models.

DrawMotion: Generating 3D Human Motions by Freehand Drawing

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

DrawMotion is a diffusion-based framework that fuses text and hand-drawn stickman conditions via a Multi-Condition Module and training-free guidance to generate 3D human motions.

CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

CAdam reinterprets densification in generative 3DGS as signal verification via gradient-moment interference, quantile context, and SNR gating to achieve large reductions in primitive count with comparable quality.

DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation

cs.RO · 2026-05-20 · unverdicted · novelty 7.0

A hypernetwork generates complete task-specific visuomotor policy parameters from instructions alone to structurally eliminate observation leakage in language-conditioned robotic control.

BrepForge: Factorized B-rep Synthesis via Wireframe Composition and Boundary-Conditioned Surface Instantiation

cs.GR · 2026-05-19 · unverdicted · novelty 7.0

BrepForge factorizes B-rep synthesis into face-aware autoregressive wireframe composition followed by boundary-conditioned surface instantiation using learning-free geometric priors.

Inference-Time Scaling in Diffusion Models through Iterative Partial Refinement

cs.LG · 2026-05-19 · unverdicted · novelty 7.0

IPR improves valid solution rates on MNIST Sudoku from 55.8% to 75.0% by iteratively refining partial regions in sequential diffusion models without external verifiers or reward models.

PolycubeNet: A Dual-latent Diffusion Model for Polycube-Based Hexahedral Mesh Generation

cs.GR · 2026-05-19 · unverdicted · novelty 7.0

PolycubeNet applies a dual-latent diffusion architecture to generate polycube point clouds from input point clouds, enabling robust hexahedral mesh creation without surface segmentation or templates.

Functionalization via Structure Completion and Motion Rectification

cs.CV · 2026-05-18 · unverdicted · novelty 7.0

Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.

StreamingEffect: Real-Time Human-Centric Video Effect Generation

cs.CV · 2026-05-16 · unverdicted · novelty 7.0

StreamingEffect enables real-time 720p human-centric video effect generation on one GPU via teacher-student distillation, keyframe control, and a new 130K video dataset.

Towards Generalized Image Manipulation Localization via Score-based Model

cs.CV · 2026-05-16 · conditional · novelty 7.0

DiffIML applies score-based generative modeling to image manipulation localization, recovering coherent masks iteratively from noise to improve generalization on unseen manipulation types.

VMU-Diff: A Coarse-to-fine Multi-source Data Fusion Framework for Precipitation Nowcasting

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

VMU-Diff improves precipitation nowcasting via coarse multi-source Vision Mamba fusion followed by residual conditional diffusion refinement.

HASTE: Training-Free Video Diffusion Acceleration via Head-Wise Adaptive Sparse Attention

cs.CV · 2026-05-14 · unverdicted · novelty 7.0

HASTE delivers up to 1.93x speedup on Wan2.1 video DiTs via head-wise adaptive sparse attention using temporal mask reuse and error-guided per-head calibration while preserving video quality.

What if Tomorrow is the World Cup Final? Counterfactual Time Series Forecasting with Textual Conditions

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

Introduces the task of counterfactual time series forecasting with textual conditions plus a text-attribution mechanism that improves accuracy by distinguishing mutable from immutable factors.

Training-Free Generative Sampling via Moment-Matched Score Smoothing

stat.ML · 2026-05-14 · unverdicted · novelty 7.0

MM-SOLD is a training-free particle sampler whose large-particle limit converges to a moment-matched Gibbs distribution obtained by exponentially tilting a score-smoothed target.

citing papers explorer

Showing 50 of 58 citing papers after filters.

Flow-GRPO: Training Flow Matching Models via Online RL cs.CV · 2025-05-08 · unverdicted · none · ref 22 · internal anchor
Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.
HIR-ALIGN: Enhancing Hyperspectral Image Restoration via Diffusion-Based Data Generation cs.CV · 2026-05-13 · unverdicted · none · ref 30 · internal anchor
HIR-ALIGN augments limited target data for hyperspectral restoration by creating proxy clean images, synthesizing aligned HSIs with blur-robust diffusion and warp-based transfer, then finetuning models to lower target-domain risk.
ImageAttributionBench: How Far Are We from Generalizable Attribution? cs.CV · 2026-05-13 · unverdicted · none · ref 62 · internal anchor
ImageAttributionBench is a benchmark dataset demonstrating that state-of-the-art image attribution methods lack robustness to image degradation and fail to generalize to semantically disjoint domains.
LatentHDR: Decoupling Exposure from Diffusion via Conditional Latent-to-Latent Mapping for Text/Image-to-Panoramic HDR cs.CV · 2026-05-11 · unverdicted · none · ref 40 · internal anchor
LatentHDR generates structurally consistent panoramic HDR images by producing one scene latent with a diffusion backbone then deterministically mapping it to multiple exposure latents via a lightweight conditional head.
Muninn: Your Trajectory Diffusion Model But Faster cs.RO · 2026-05-11 · unverdicted · none · ref 57 · internal anchor
Muninn accelerates diffusion trajectory planners up to 4.6x by spending an uncertainty budget to decide when to cache denoiser outputs, preserving performance and certifying bounded deviation from full computation.
GPO-V: Jailbreak Diffusion Vision Language Model by Global Probability Optimization cs.CV · 2026-05-08 · unverdicted · none · ref 22 · 2 links · internal anchor
GPO-V jailbreaks dVLMs by globally optimizing probabilities in the denoising process to bypass refusal patterns, achieving stealthy and transferable attacks.
LENS: Low-Frequency Eigen Noise Shaping for Efficient Diffusion Sampling cs.CV · 2026-05-08 · unverdicted · none · ref 33 · internal anchor
LENS shapes low-frequency eigen noise with a lightweight network to enable efficient, high-quality sampling in distilled diffusion models.
Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization cs.CV · 2026-04-26 · unverdicted · none · ref 36 · internal anchor
Oracle Noise optimizes diffusion model noise on a Riemannian hypersphere guided by key prompt words to preserve the Gaussian prior, eliminate norm inflation, and achieve faster semantic alignment than Euclidean methods.
$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models cs.CV · 2026-04-26 · unverdicted · none · ref 42 · internal anchor
Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a directional derivative penalty.
ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis cs.CV · 2026-04-21 · unverdicted · none · ref 42 · internal anchor
ReImagine decouples human appearance from temporal consistency via pretrained image backbones, SMPL-X motion guidance, and training-free video diffusion refinement to generate high-quality controllable videos.
HP-Edit: A Human-Preference Post-Training Framework for Image Editing cs.CV · 2026-04-21 · unverdicted · none · ref 40 · internal anchor
HP-Edit introduces a post-training framework and RealPref-50K dataset that uses a VLM-based HP-Scorer to align diffusion image editing models with human preferences, improving outputs on Qwen-Image-Edit-2509.
Learning to Credit the Right Steps: Objective-aware Process Optimization for Visual Generation cs.CV · 2026-04-21 · unverdicted · none · ref 38 · internal anchor
OTCA improves GRPO training for visual generation by estimating step importance in trajectories and adaptively weighting multiple reward objectives.
DMax: Aggressive Parallel Decoding for dLLMs cs.LG · 2026-04-09 · conditional · none · ref 71 · 2 links · internal anchor
DMax uses On-Policy Uniform Training and Soft Parallel Decoding to enable aggressive parallelism in dLLMs, raising TPF on GSM8K from 2.04 to 5.47 and on MBPP from 2.71 to 5.86 while preserving accuracy.
Your Pre-trained Diffusion Model Secretly Knows Restoration cs.CV · 2026-04-06 · unverdicted · none · ref 45 · internal anchor
Pre-trained diffusion models inherently support image restoration that can be unlocked by optimizing prompt embeddings at the text encoder output using a diffusion bridge formulation, achieving competitive results on models like WAN and FLUX without fine-tuning.
PromptEvolver: Prompt Inversion through Evolutionary Optimization in Natural-Language Space cs.LG · 2026-04-03 · unverdicted · none · ref 40 · internal anchor
PromptEvolver recovers high-fidelity natural language prompts for given images by evolving them via genetic algorithm guided by a vision-language model, outperforming prior methods on benchmarks.
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE cs.AI · 2025-07-29 · unverdicted · none · ref 35 · internal anchor
MixGRPO speeds up GRPO for flow-based image generators by restricting SDE sampling and optimization to a sliding window while using ODE elsewhere, cutting training time by up to 71% with better alignment performance.
FD-Bench: A Modular and Fair Benchmark for Data-driven Fluid Simulation physics.flu-dyn · 2025-05-25 · unverdicted · none · ref 96 · internal anchor
FD-Bench supplies the first modular, reproducible benchmark and leaderboard for comparing neural PDE solvers on fluid dynamics tasks with direct numerical solver baselines.
Scalable Diffusion Models with Transformers cs.CV · 2022-12-19 · unverdicted · none · ref 55 · internal anchor
DiTs achieve SOTA FID of 2.27 on ImageNet 256x256 by scaling transformer-based latent diffusion models, with performance improving consistently as Gflops increase.
Hierarchical Text-Conditional Image Generation with CLIP Latents cs.CV · 2022-04-13 · accept · none · ref 48 · internal anchor
A hierarchical prior-decoder model using CLIP latents generates more diverse text-conditional images than direct methods while preserving photorealism and caption fidelity.
TOPOS: High-Fidelity and Efficient Industry-Grade 3D Head Generation cs.CV · 2026-05-14 · unverdicted · none · ref 2 · internal anchor
TOPOS creates high-fidelity 3D heads with fixed industry topology from single images via a specialized VAE with Perceiver Resampler and a rectified flow transformer.
GenMed: A Pairwise Generative Reformulation of Medical Diagnostic Tasks cs.CV · 2026-05-11 · unverdicted · none · ref 81 · internal anchor
GenMed uses diffusion models to capture P(X,Y) for medical tasks and performs inference via gradient-based test-time optimization, supporting arbitrary observation combinations without retraining.
"Training robust watermarking model may hurt authentication!'' Exploring and Mitigating the Identity Leakage in Robust Watermarking cs.CR · 2026-05-10 · unverdicted · none · ref 70 · internal anchor
W-IR is the first watermarking framework to combine certified robustness via randomized smoothing in pixel and coordinate spaces with identity leakage mitigation via residual information loss minimization.
Removing the Watermark Is Not Enough: Forensic Stealth in Generative-AI Watermark Removal cs.CR · 2026-05-09 · unverdicted · none · ref 38 · internal anchor
Current AI image watermark removal attacks replace the watermark with a different forensic signal, allowing independent detectors to distinguish processed outputs from clean images at over 98% true-positive rate under a 1% false-positive budget.
From Synthetic to Real: Toward Identity-Consistent Makeup Transfer with Synthetic and Real Data cs.CV · 2026-05-08 · unverdicted · none · ref 24 · internal anchor
The work creates identity-consistent synthetic makeup data via ConsistentBeauty and adapts models to real images using reinforcement learning in RealBeauty, achieving better identity preservation and real-world performance than prior methods.
Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework cs.CV · 2026-05-08 · unverdicted · none · ref 38 · 2 links · internal anchor
MagicBokeh uses a single diffusion model with alternative training, focus-aware masked attention, and degradation-aware depth estimation to produce photorealistic bokeh on low-res zoomed images.
GCCM: Enhancing Generative Graph Prediction via Contrastive Consistency Model cs.AI · 2026-05-07 · unverdicted · none · ref 13 · internal anchor
GCCM prevents shortcut collapse in consistency models for graph prediction by using contrastive negative pairs and input feature perturbation, leading to better performance than deterministic baselines.
NoiseRater: Meta-Learned Noise Valuation for Diffusion Model Training cs.LG · 2026-05-02 · unverdicted · none · ref 43 · internal anchor
NoiseRater meta-learns instance-level importance scores for noise in diffusion training via bilevel optimization, then uses a two-stage pipeline to improve efficiency and generation quality on FFHQ and ImageNet.
Delta Score Matters! Spatial Adaptive Multi Guidance in Diffusion Models cs.CV · 2026-04-29 · unverdicted · none · ref 30 · internal anchor
SAMG uses spatially adaptive guidance scales derived from a geometric analysis of classifier-free guidance to resolve the detail-artifact dilemma in diffusion-based image and video generation.
MetaSR: Content-Adaptive Metadata Orchestration for Generative Super-Resolution cs.CV · 2026-04-29 · unverdicted · none · ref 26 · internal anchor
MetaSR adaptively orchestrates metadata in a DiT-based generative SR model to deliver up to 1 dB PSNR gains and 50% bitrate savings across diverse content and degradations.
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation cs.CV · 2026-04-20 · unverdicted · none · ref 2 · internal anchor
By requiring and using highly discriminative LLM text features, the work enables the first effective one-step text-conditioned image generation with MeanFlow.
Deepfake Detection Generalization with Diffusion Noise cs.CV · 2026-04-16 · unverdicted · none · ref 64 · internal anchor
ANL uses diffusion noise prediction and attention to regularize deepfake detectors for better generalization to unseen synthesis methods without added inference cost.
ReplicateAnyScene: Zero-Shot Video-to-3D Composition via Textual-Visual-Spatial Alignment cs.CV · 2026-04-12 · unverdicted · none · ref 46 · internal anchor
ReplicateAnyScene performs fully automated zero-shot video-to-compositional-3D reconstruction by cascading alignments of generic priors from vision foundation models across textual, visual, and spatial dimensions.
Rein3D: Reinforced 3D Indoor Scene Generation with Panoramic Video Diffusion Models cs.CV · 2026-04-12 · unverdicted · none · ref 36 · internal anchor
Rein3D generates photorealistic, globally consistent 3D indoor scenes by using a restore-and-refine process where radial panoramic videos are restored via diffusion models and then used to update a 3D Gaussian field.
Long-Horizon Streaming Video Generation via Hybrid Attention with Decoupled Distillation cs.CV · 2026-04-11 · conditional · none · ref 38 · internal anchor
Hybrid Forcing combines linear temporal attention for long-range retention, block-sparse attention for efficiency, and decoupled distillation to achieve real-time unbounded 832x480 streaming video generation at 29.5 FPS.
Towards Robust Content Watermarking Against Removal and Forgery Attacks cs.CV · 2026-04-08 · unverdicted · none · ref 51 · internal anchor
ISTS watermarking dynamically controls injection based on prompt semantics and uses two-sided detection to resist removal and forgery attacks in diffusion models.
GENSERVE: Efficient Co-Serving of Heterogeneous Diffusion Model Workloads cs.DC · 2026-04-06 · unverdicted · none · ref 33 · internal anchor
GENSERVE improves SLO attainment by up to 44% for co-serving heterogeneous T2I and T2V diffusion workloads via step-level preemption, elastic parallelism, and joint scheduling.
Emu3.5: Native Multimodal Models are World Learners cs.CV · 2025-10-30 · unverdicted · none · ref 83 · internal anchor
Emu3.5 is a native multimodal world model pre-trained on over 10 trillion vision-language tokens with next-token prediction, post-trained via reinforcement learning, and accelerated by Discrete Diffusion Adaptation for efficient interleaved generation and world exploration.
ReSim: Reliable World Simulation for Autonomous Driving cs.CV · 2025-06-11 · unverdicted · none · ref 115 · internal anchor
ReSim is a controllable video world model trained on heterogeneous real and simulated driving data that achieves higher fidelity and controllability for both expert and non-expert actions, plus a Video2Reward module for estimating action quality from simulated futures.
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation cs.RO · 2024-11-29 · unverdicted · none · ref 59 · internal anchor
CogACT is a new VLA model that uses a conditioned diffusion action transformer to achieve over 35% higher average success rates than OpenVLA in simulation and 55% in real-robot experiments while generalizing to new robots and objects.
VideoPhy: Evaluating Physical Commonsense for Video Generation cs.CV · 2024-06-05 · conditional · none · ref 93 · internal anchor
VideoPhy benchmark shows state-of-the-art text-to-video models follow physical commonsense and text prompts in only 39.6% of cases for the best model.
Efficient 3D Content Reconstruction and Generation cs.CV · 2026-05-18 · unverdicted · none · ref 232 · internal anchor
Presents Instant3D for rapid text/image-to-3D generation via multi-view diffusion plus feed-forward reconstruction, and FastMap for 10x faster structure-from-motion with comparable accuracy.
EponaV2: Driving World Model with Comprehensive Future Reasoning cs.CV · 2026-05-14 · unverdicted · none · ref 62 · internal anchor
EponaV2 advances perception-free driving world models by forecasting comprehensive future 3D geometry and semantic representations, achieving SOTA planning performance on NAVSIM benchmarks.
CaloArt: Large-Patch x-Prediction Diffusion Transformers for High-Granularity Calorimeter Shower Generation physics.ins-det · 2026-05-12 · unverdicted · none · ref 26 · internal anchor
CaloArt achieves top FPD, high-level, and classifier metrics on CaloChallenge datasets 2 and 3 while keeping single-GPU generation at 9-11 ms per shower by combining large-patch tokenization, x-prediction, and conditional flow matching.
diffGHOST: Diffusion based Generative Hedged Oblivious Synthetic Trajectories cs.AI · 2026-05-11 · unverdicted · none · ref 36 · internal anchor
diffGHOST is a conditional diffusion model that segments learned latent space to identify and mitigate memorization of critical trajectory samples, aiming to deliver privacy guarantees alongside data utility.
Memory Efficient Full-gradient Attacks (MEFA) Framework for Adversarial Defense Evaluations cs.LG · 2026-05-07 · unverdicted · none · ref 42 · internal anchor
MEFA enables exact full-gradient white-box attacks on iterative stochastic purification defenses like diffusion and Langevin EBMs by trading recomputation for lower memory, revealing vulnerabilities missed by approximate-gradient methods.
Exploring Time Conditioning in Diffusion Generative Models from Disjoint Noisy Data Manifolds cs.LG · 2026-04-28 · unverdicted · none · ref 17 · internal anchor
Aligning the DDIM forward diffusion process with flow-matching manifold evolution enables high-quality generation without time conditioning, and class-conditional synthesis is possible with an unconditional denoiser by using separate time spaces per class.
Gated Memory Policy cs.RO · 2026-04-21 · unverdicted · none · ref 44 · internal anchor
GMP selectively activates and represents memory via a gate and lightweight cross-attention, yielding 30.1% higher success on non-Markovian robotic tasks while staying competitive on Markovian ones.
Controllable Video Object Insertion via Multiview Priors cs.CV · 2026-04-16 · unverdicted · none · ref 37 · internal anchor
A multi-view prior-based framework for video object insertion that uses dual-path conditioning and an integration-aware consistency module to improve appearance stability and occlusion handling.
A Survey on Diffusion Models for Inverse Problems cs.LG · 2024-09-30 · unverdicted · none · ref 132 · internal anchor
A survey that introduces taxonomies for categorizing pre-trained diffusion model methods applied to inverse problems and analyzes their connections and challenges.
ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance cs.CV · 2026-04-29 · unverdicted · none · ref 26
ACPO uses anchor-based regularization with NR-IQA guidance to enable stable perceptual quality improvements in diffusion model fine-tuning.

Denoising Diffusion Implicit Models

hub tools

citation-role summary

citation-polarity summary

claims ledger

authors

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer