Denoising Diffusion Implicit Models

Jiaming Song, Chenlin Meng, Stefano Ermon · 2020 · cs.LG · arXiv 2010.02502

Mixed citation behavior. Most common role is background (67%).

479 Pith papers citing it

Background 67% of classified citations

abstract

Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process. We construct a class of non-Markovian diffusion processes that lead to the same training objective, but whose reverse process can be much faster to sample from. We empirically demonstrate that DDIMs can produce high quality samples $10 \times$ to $50 \times$ faster in terms of wall-clock time compared to DDPMs, allow us to trade off computation for sample quality, and can perform semantically meaningful image interpolation directly in the latent space.
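The speedup described in the abstract comes from running the reverse process on a short subsequence of timesteps. The following is a minimal sketch of that update rule as it is commonly written, not the authors' implementation. The names `ddim_step`, `ddim_sample`, `eps_model`, and `abars` are hypothetical; `abar` stands for the cumulative signal level (the paper writes it as alpha_t), and `eta` scales the injected noise, with `eta=0` giving the deterministic sampler.

```python
import math
import random


def ddim_step(x_t, eps_pred, abar_t, abar_prev, eta=0.0, noise=None):
    """One DDIM-style update from noise level abar_t to abar_prev (abar_prev > abar_t).

    x_t and eps_pred are lists of floats; requires 0 < abar_t < 1.
    """
    # Standard deviation of the fresh noise; zero when eta == 0 (deterministic case).
    sigma = (eta
             * math.sqrt((1 - abar_prev) / (1 - abar_t))
             * math.sqrt(1 - abar_t / abar_prev))
    if noise is None:
        # Assumption: draw Gaussian noise only when the step is stochastic.
        noise = [random.gauss(0.0, 1.0) if eta > 0 else 0.0 for _ in x_t]
    out = []
    for x, e, z in zip(x_t, eps_pred, noise):
        # Predicted clean sample implied by the noise estimate.
        x0_pred = (x - math.sqrt(1 - abar_t) * e) / math.sqrt(abar_t)
        # Component pointing back toward x_t along the predicted noise.
        direction = math.sqrt(max(1 - abar_prev - sigma ** 2, 0.0)) * e
        out.append(math.sqrt(abar_prev) * x0_pred + direction + sigma * z)
    return out


def ddim_sample(x_T, eps_model, abars, eta=0.0):
    """Run ddim_step along a (possibly short) increasing schedule of abar values.

    eps_model(x, abar) is a hypothetical noise predictor returning a list of floats.
    """
    x = list(x_T)
    for abar_t, abar_prev in zip(abars, abars[1:]):
        x = ddim_step(x, eps_model(x, abar_t), abar_t, abar_prev, eta=eta)
    return x
```

Passing a schedule such as `[0.01, 0.3, 0.7, 1.0]` runs only three network evaluations; a longer schedule spends more computation for, in practice, better samples, which is the trade-off the abstract refers to.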

citation-role summary

background: 58 · method: 23 · baseline: 2

citation-polarity summary

background: 56 · use method: 23 · baseline: 2 · support: 1 · unclear: 1

authors

Jiaming Song, Chenlin Meng, and Stefano Ermon

representative citing papers

ActivityForensics: A Comprehensive Benchmark for Localizing Manipulated Activity in Videos

cs.CV · 2026-04-04 · unverdicted · novelty 8.0

ActivityForensics is the first large-scale benchmark for temporally localizing activity-level forgeries in videos, paired with a diffusion-based baseline called TADiff.

Flow-GRPO: Training Flow Matching Models via Online RL

cs.CV · 2025-05-08 · unverdicted · novelty 8.0

Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.

Consistency Models

cs.LG · 2023-03-02 · conditional · novelty 8.0

Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.

MUSE: Unlocking Timestep as Native Task Steering for One-Step Dense Prediction

cs.CV · 2026-06-29 · unverdicted · novelty 7.0

MUSE shows that the native timestep embedding in diffusion models acts as a parameter-free steering signal for multi-task monocular depth and normal estimation via manifold decoupling in latent space.

ASTAD: Asymmetric Style Transfer for Synthetic-to-Real Adaptation in Autonomous Driving

cs.CV · 2026-06-28 · unverdicted · novelty 7.0

Introduces the ASTAD task and training-free ASTModel framework for semantically consistent asymmetric style transfer using labeled synthetic content and unlabeled real references.

Splatshot: 3D Face Avatar Generation from a Single Unconstrained Photo

cs.CV · 2026-05-31 · unverdicted · novelty 7.0

SplatShot is a training-free method that inserts per-step 3DGS refitting and photometric feedback into diffusion denoising to enforce multi-view consistency for single-photo 3D face avatars.

Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation

cs.CV · 2026-05-31 · unverdicted · novelty 7.0

DRDD decouples diffusion into independent noise and residual stages to preserve domain harmonization and enable unified data-efficient I2I translation.

Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance

cs.RO · 2026-05-28 · unverdicted · novelty 7.0

CGPO integrates training-free critic guidance into diffusion denoising to produce high-Q actions as regression targets, yielding SOTA results on MuJoCo locomotion and successful Franka arm grasping.

Midpoint Generative Models

cs.LG · 2026-05-28 · unverdicted · novelty 7.0

Midpoint Generative Models define a midpoint divergence from flow matching symmetry and derive its variational form as a tractable objective for training competitive one-step generators.

Spectral Guidance for Flexible and Efficient Control of Diffusion Models

cs.LG · 2026-05-27 · unverdicted · novelty 7.0

Spectral Guidance learns singular functions via self-supervised objective to project guidance signals onto diffusion sampling trajectories, enabling stable control without retraining or backpropagation and improving CIFAR-10 accuracy by 37 points with 4x faster sampling.

Towards Anatomically Plausible Human Image Generation via Synthetic Localized Preferences

cs.CV · 2026-05-25 · unverdicted · novelty 7.0

ASAP generates over 10K synthetic anatomical preference pairs via targeted degradation of high-fidelity images and applies a localized margin-bounded DPO to reduce anatomical errors in text-to-image human generation, supported by the new HAP dataset and HAF-Bench.

DeltaCam: Differential Intrinsic Camera Modeling for Video Generation

cs.CV · 2026-05-24 · unverdicted · novelty 7.0

DeltaCam models relative changes in camera intrinsics via Δ-parameterized neural adaptors in video diffusion models trained on synthetic data to enable controllable generation and real-world transfer.

Loki: Representation over Architecture for Diffusion-Based Portrait Animation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

Loki replaces RGB conditioning stacks with identity-orthogonal parametric face encodings rasterized for diffusion, achieving efficient cross-ID portrait animation without cross-ID training data.

Point Tracking Improves World Action Models

cs.RO · 2026-05-22 · unverdicted · novelty 7.0

JOPAT jointly models pixels, point tracks, and actions in a diffusion transformer and reports gains over pixel-only baselines on long-horizon robot tasks with occlusion and off-screen motion.

DFSAttn: Dynamic Fine-grained Sparse Attention for Efficient Video Generation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

DFSAttn is a training-free framework for dynamic fine-grained sparse attention in video DiTs that achieves up to 2.1x speedup while preserving generation quality via Hilbert reordering, hierarchical scoring, and adaptive caching.

VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

VDE accelerates rectified flow models like Flux by 3.22x with LPIPS of 0.069 via velocity decomposition into parallel/orthogonal components plus periodic full-pass anchoring.

Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

Linear-DPO replaces sigmoid utility with linear utility and adds EMA reference to improve preference alignment in diffusion and flow-matching text-to-image models.

DrawMotion: Generating 3D Human Motions by Freehand Drawing

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

DrawMotion is a diffusion-based framework that fuses text and hand-drawn stickman conditions via a Multi-Condition Module and training-free guidance to generate 3D human motions.

CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

CAdam reinterprets densification in generative 3DGS as signal verification via gradient-moment interference, quantile context, and SNR gating to achieve large reductions in primitive count with comparable quality.

DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation

cs.RO · 2026-05-20 · unverdicted · novelty 7.0

A hypernetwork generates complete task-specific visuomotor policy parameters from instructions alone to structurally eliminate observation leakage in language-conditioned robotic control.

FlowErase-RL: Rethinking Concept Erasure as Reward Optimization in Flow Matching Models

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

FlowErase-RL applies GRPO to reformulate concept erasure in flow matching models as reward optimization using a dynamic dual-path mechanism for target suppression and non-target preservation.

BrepForge: Factorized B-rep Synthesis via Wireframe Composition and Boundary-Conditioned Surface Instantiation

cs.GR · 2026-05-19 · unverdicted · novelty 7.0

BrepForge factorizes B-rep synthesis into face-aware autoregressive wireframe composition followed by boundary-conditioned surface instantiation using learning-free geometric priors.

Inference-Time Scaling in Diffusion Models through Iterative Partial Refinement

cs.LG · 2026-05-19 · unverdicted · novelty 7.0

IPR improves valid solution rates on MNIST Sudoku from 55.8% to 75.0% by iteratively refining partial regions in sequential diffusion models without external verifiers or reward models.

PolycubeNet: A Dual-latent Diffusion Model for Polycube-Based Hexahedral Mesh Generation

cs.GR · 2026-05-19 · unverdicted · novelty 7.0

PolycubeNet applies a dual-latent diffusion architecture to generate polycube point clouds from input point clouds, enabling robust hexahedral mesh creation without surface segmentation or templates.

citing papers explorer

Showing 50 of 479 citing papers.

"Training robust watermarking model may hurt authentication!" Exploring and Mitigating the Identity Leakage in Robust Watermarking cs.CR · 2026-05-10 · unverdicted · none · ref 70 · internal anchor
W-IR is the first watermarking framework to combine certified robustness via randomized smoothing in pixel and coordinate spaces with identity leakage mitigation via residual information loss minimization.
Removing the Watermark Is Not Enough: Forensic Stealth in Generative-AI Watermark Removal cs.CR · 2026-05-09 · unverdicted · none · ref 38 · internal anchor
Current AI image watermark removal attacks replace the watermark with a different forensic signal, allowing independent detectors to distinguish processed outputs from clean images at over 98% true-positive rate under a 1% false-positive budget.
From Synthetic to Real: Toward Identity-Consistent Makeup Transfer with Synthetic and Real Data cs.CV · 2026-05-08 · unverdicted · none · ref 24 · internal anchor
The work creates identity-consistent synthetic makeup data via ConsistentBeauty and adapts models to real images using reinforcement learning in RealBeauty, achieving better identity preservation and real-world performance than prior methods.
TextLDM: Language Modeling with Continuous Latent Diffusion cs.CL · 2026-05-08 · unverdicted · none · ref 14 · internal anchor
TextLDM applies DiT-style latent diffusion with flow matching to language modeling via a REPA-aligned VAE, outperforming prior diffusion LMs and matching GPT-2 when trained from scratch on OpenWebText2.
Toward Better Geometric Representations for Molecule Generative Models cs.LG · 2026-05-08 · unverdicted · none · ref 12 · internal anchor
LENSEs improves representation-conditioned molecule generation by jointly training a multi-level representation head, perceptual loss, and REPA alignment on pretrained encoders, yielding 97.28% validity and 98.51% stability on GEOM-DRUG.
Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework cs.CV · 2026-05-08 · unverdicted · none · ref 38 · 2 links · internal anchor
MagicBokeh uses a single diffusion model with alternative training, focus-aware masked attention, and degradation-aware depth estimation to produce photorealistic bokeh on low-res zoomed images.
FlashMol: High-Quality Molecule Generation in as Few as Four Steps cs.LG · 2026-05-07 · unverdicted · none · ref 31 · internal anchor
FlashMol produces chemically valid 3D molecules in 4 steps via distribution matching distillation with respaced timesteps and Jensen-Shannon regularization, matching or exceeding 1000-step teacher performance on QM9 and GEOM-DRUG.
Conservative Flows: A New Paradigm of Generative Models cs.LG · 2026-05-07 · unverdicted · none · ref 50 · internal anchor
Conservative flows generate by running probability-preserving stochastic dynamics initialized at data points rather than noise, using corrected Langevin or predictor-corrector mechanisms on top of any pretrained flow model and showing gains on Swiss-roll, ImageNet-256 and Oxford Flowers-102.
Physical Fidelity Reconstruction via Improved Consistency-Distilled Flow Matching for Dynamical Systems cs.LG · 2026-05-07 · unverdicted · none · ref 26 · internal anchor
Distilled one-step consistency model from optimal-transport flow-matching teacher reconstructs high-fidelity dynamical system flows from low-fidelity data with 12x speedup, half the parameters, and 23.1% better SSIM than scratch-trained baselines.
InkDiffuser: High-Fidelity One-shot Chinese Calligraphy via Differentiable Morphological Optimization cs.CV · 2026-05-07 · unverdicted · none · ref 9 · internal anchor
InkDiffuser generates high-fidelity one-shot Chinese calligraphy using high-frequency enhancement and a differentiable ink structure loss for realistic stroke and ink rendering.
GCCM: Enhancing Generative Graph Prediction via Contrastive Consistency Model cs.AI · 2026-05-07 · unverdicted · none · ref 13 · internal anchor
GCCM prevents shortcut collapse in consistency models for graph prediction by using contrastive negative pairs and input feature perturbation, leading to better performance than deterministic baselines.
Learning a Delighting Prior for Facial Appearance Capture in the Wild cs.CV · 2026-05-07 · unverdicted · none · ref 115 · internal anchor
A delighting network trained via Dataset Latent Modulation on heterogeneous OLAT and Light Stage data enables high-quality in-the-wild facial reflectance capture from video and produces the NeRSemble-Scan dataset.
Conditional Diffusion Under Linear Constraints: Langevin Mixing and Information-Theoretic Guarantees cs.LG · 2026-05-06 · unverdicted · none · ref 9 · internal anchor
Error in approximating the tangent conditional score by the unconditional score in diffusion models is bounded by dimension-free conditional mutual information, with a projected-Langevin method outperforming baselines in inpainting and super-resolution.
D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models cs.CV · 2026-05-06 · unverdicted · none · ref 88 · 3 links · internal anchor
D-OPSD formulates supervised fine-tuning of step-distilled diffusion models as on-policy self-distillation by having the model act as both teacher (with multimodal context) and student (with text-only context) on its own roll-outs.
Structured 3D Latents Are Surprisingly Powerful: Unleashing Generalizable Style with 2D Diffusion cs.CV · 2026-05-06 · unverdicted · none · ref 2 · internal anchor
DiLAST optimizes 3D latents via guidance from a 2D diffusion model to enable generalizable style transfer for OOD styles in 3D asset generation.
Intermediate Representations are Strong AI-Generated Image Detectors cs.CV · 2026-05-05 · unverdicted · none · ref 50 · internal anchor
Intermediate layer embedding sensitivity to perturbations distinguishes AI-generated images from real ones, yielding higher AUROC on GenImage and Forensics Small benchmarks than prior methods.
Identity-Consistent Multi-Pose Generation of Contactless Fingerprints cs.CV · 2026-05-05 · unverdicted · none · ref 30 · internal anchor
IMPOSE generates identity-consistent multi-pose contactless fingerprints via latent diffusion, Sauvola-guided translation, and 3D finger model projection, enabling SOTA cross-modal matching with EER reduced to 8.74% on UWA and 2.26% on PolyU CL2CB.
A Few-Step Generative Model on Cumulative Flow Maps cs.LG · 2026-05-05 · unverdicted · none · ref 4 · internal anchor
Cumulative flow maps unify few-step generative modeling for diffusion and flow models via cumulative transport and parameterization with minimal changes to time embeddings and objectives.
PerFlow: Physics-Embedded Rectified Flow for Efficient Reconstruction and Uncertainty Quantification of Spatiotemporal Dynamics cs.LG · 2026-05-05 · unverdicted · none · ref 23 · 2 links · internal anchor
PerFlow decouples observation conditioning from physics enforcement in rectified flows using constraint-preserving projections and invariance guarantees for fast, physics-consistent reconstruction of spatiotemporal dynamics.
Synthetic Data Generation for Long-Tail Medical Image Classification: A Case Study in Skin Lesions cs.CV · 2026-05-04 · unverdicted · none · ref 29 · internal anchor
A diffusion-based synthetic data pipeline using inpainting and OOD post-selection improves long-tail skin lesion classification on ISIC2019, delivering over 28% accuracy gain on the rarest class.
OGPO: Sample Efficient Full-Finetuning of Generative Control Policies cs.LG · 2026-05-04 · unverdicted · none · ref 44 · 2 links · internal anchor
OGPO enables sample-efficient full-finetuning of generative control policies via off-policy critics and modified PPO, achieving SOTA on robot manipulation tasks while rescuing poorly initialized behavior cloning policies without expert data.
Anomaly-Preference Image Generation cs.CV · 2026-05-04 · unverdicted · none · ref 7 · 3 links · internal anchor
Anomaly Preference Optimization reformulates anomaly image generation as preference learning using real anomalies for implicit alignment signals from denoising trajectories plus a time-aware capacity allocation module.
SlimDiffSR: Toward Lightweight and Efficient Remote Sensing Image Super-Resolution via Diffusion Model Distillation cs.CV · 2026-05-04 · unverdicted · none · ref 38 · 2 links · internal anchor
SlimDiffSR uses uncertainty-guided timestep assignment and structured pruning with frequency- and direction-separable convolutions plus MMD distillation to create a 200x faster, 20x smaller diffusion SR model for remote sensing while retaining competitive quality.
NoiseRater: Meta-Learned Noise Valuation for Diffusion Model Training cs.LG · 2026-05-02 · unverdicted · none · ref 43 · internal anchor
NoiseRater meta-learns instance-level importance scores for noise in diffusion training via bilevel optimization, then uses a two-stage pipeline to improve efficiency and generation quality on FFHQ and ImageNet.
PhysiGen: Integrating Collision-Aware Physical Constraints for High-Fidelity Human-Human Interaction Generation cs.CV · 2026-05-01 · unverdicted · none · ref 37 · internal anchor
PhysiGen reduces interpenetration in text-driven 3D human interaction generation by simplifying meshes to geometric primitives for fast collision detection and guiding optimization with collision regions.
GSDrive: Reinforcing Driving Policies by Multi-mode Future Trajectory Probing with 3D Gaussian Splatting Environment cs.RO · 2026-04-30 · unverdicted · none · ref 30 · 2 links · internal anchor
GSDrive combines IL priors with RL feedback by probing multi-mode futures inside a 3D Gaussian Splatting simulator to supply dense rewards for closed-loop driving policy improvement on nuScenes.
Diffusion-OAMP for Joint Image Compression and Wireless Transmission eess.IV · 2026-04-30 · unverdicted · none · ref 10 · internal anchor
Diffusion-OAMP combines a pre-trained diffusion model with the OAMP algorithm under an SNR-matching rule to enable training-free reconstruction of compressed images transmitted over noisy wireless channels.
Delta Score Matters! Spatial Adaptive Multi Guidance in Diffusion Models cs.CV · 2026-04-29 · unverdicted · none · ref 30 · internal anchor
SAMG uses spatially adaptive guidance scales derived from a geometric analysis of classifier-free guidance to resolve the detail-artifact dilemma in diffusion-based image and video generation.
MetaSR: Content-Adaptive Metadata Orchestration for Generative Super-Resolution cs.CV · 2026-04-29 · unverdicted · none · ref 26 · internal anchor
MetaSR adaptively orchestrates metadata in a DiT-based generative SR model to deliver up to 1 dB PSNR gains and 50% bitrate savings across diverse content and degradations.
The Thinking Pixel: Recursive Sparse Reasoning in Multimodal Diffusion Latents cs.CV · 2026-04-28 · unverdicted · none · ref 11 · internal anchor
A recursive sparse MoE framework integrated into diffusion models iteratively refines visual tokens via gated module selection to improve structured reasoning and image generation performance.
CoreFlow: Low-Rank Matrix Generative Models cs.LG · 2026-04-27 · unverdicted · none · ref 34 · internal anchor
CoreFlow is a low-rank matrix generative model that trains normalizing flows on shared subspaces to improve efficiency and quality for high-dimensional limited-sample data, including incomplete matrices.
Diffusion Model as a Generalist Segmentation Learner cs.CV · 2026-04-27 · unverdicted · none · ref 74 · internal anchor
DiGSeg repurposes diffusion U-Nets as generalist segmentation learners by conditioning on image-mask latents and multi-scale CLIP text features, achieving strong cross-domain performance.
Efficient Diffusion Distillation via Embedding Loss cs.CV · 2026-04-24 · unverdicted · none · ref 38 · internal anchor
Embedding Loss aligns feature distributions via MMD in random network embeddings to boost one-step diffusion distillation, reaching SOTA FID of 1.475 on CIFAR-10 unconditional generation.
Learning Coverage- and Power-Optimal Transmitter Placement from Building Maps: A Comparative Study of Direct and Indirect Neural Approaches cs.LG · 2026-04-23 · unverdicted · none · ref 72 · internal anchor
Neural models predict coverage- and power-optimal transmitter locations from building maps, matching exhaustive search performance at 14-2400x speedups while quantifying an asymmetric coverage-power trade-off.
Conditional Diffusion Posterior Alignment for Sparse-View CT Reconstruction eess.IV · 2026-04-23 · unverdicted · none · ref 26 · internal anchor
CDPA scales diffusion-based reconstruction to large 3D volumes by conditioning 2D models on initial 3D reconstructions plus data-consistency alignment, delivering state-of-the-art results on synthetic and real CBCT data.
Generative Learning Enhanced Intelligent Resource Management for Cell-Free Delay Deterministic Communications cs.IT · 2026-04-23 · unverdicted · none · ref 53 · internal anchor
The proposed pretraining framework for safe DRL in CF-MIMO resource management doubles initial energy efficiency, achieves 4.7% higher final EE, maintains 1% delay violation rate, and cuts exploration steps by 50% compared to non-pretrained baselines while matching diffusion model performance at 14x
Exploring the Role of Synthetic Data Augmentation in Controllable Human-Centric Video Generation cs.CV · 2026-04-23 · unverdicted · none · ref 33 · internal anchor
Synthetic data complements real data in diffusion-based controllable human video generation, with effective sample selection improving motion realism, temporal consistency, and identity preservation.
LatRef-Diff: Latent and Reference-Guided Diffusion for Facial Attribute Editing and Style Manipulation cs.CV · 2026-04-23 · unverdicted · none · ref 22 · internal anchor
LatRef-Diff replaces semantic directions in diffusion models with latent and reference-guided style codes, uses a hierarchical style modulation module, and applies forward-backward consistency training to achieve state-of-the-art facial attribute editing and style manipulation on CelebA-HQ.
Uncertainty-Aware Spatiotemporal Super-Resolution Data Assimilation with Diffusion Models physics.flu-dyn · 2026-04-23 · unverdicted · none · ref 15 · internal anchor
DiffSRDA uses denoising diffusion models to perform uncertainty-aware spatiotemporal super-resolution data assimilation, achieving EnKF-like quality from low-resolution forecasts on an ocean jet testbed.
WFM: 3D Wavelet Flow Matching for Ultrafast Multi-Modal MRI Synthesis cs.CV · 2026-04-22 · unverdicted · none · ref 19 · internal anchor
WFM achieves near-diffusion quality for all four BraTS MRI modalities with one 82M model in 1-2 steps by flowing from the mean of conditioning modalities in wavelet space, running 250-1000x faster.
Normalizing Flows with Iterative Denoising cs.CV · 2026-04-21 · unverdicted · none · ref 15 · internal anchor
iTARFlow augments normalizing flows with diffusion-style iterative denoising during sampling while preserving end-to-end likelihood training, reaching competitive results on ImageNet 64/128/256.
LatentGandr: Visual Exploration of Generative AI Latent Space via Local Embeddings cs.HC · 2026-04-21 · unverdicted · none · ref 25 · internal anchor
LatentGandr computes local principal components from neighborhood embeddings in generative model latent spaces and visualizes them as interactive grids to improve exploration over global slider methods.
Generative Drifting for Conditional Medical Image Generation cs.CV · 2026-04-21 · unverdicted · none · ref 31 · internal anchor
GDM reformulates 3D conditional medical image generation as attractive-repulsive drifting with multi-level feature banks to balance distribution plausibility, patient fidelity, and one-step inference, outperforming GANs, flows, and SDEs on MRI-to-CT and sparse CT tasks.
Geometric Decoupling: Diagnosing the Structural Instability of Latent cs.CV · 2026-04-20 · unverdicted · none · ref 41 · internal anchor
Latent diffusion models exhibit geometric decoupling where curvature in out-of-distribution generation is misallocated to unstable semantic boundaries instead of image details, identifying geometric hotspots as the structural cause of editing instability.
DAG-STL: A Hierarchical Framework for Zero-Shot Trajectory Planning under Signal Temporal Logic Specifications cs.RO · 2026-04-20 · unverdicted · none · ref 71 · internal anchor
DAG-STL decomposes long-horizon STL planning into decomposition, timed waypoint allocation, and diffusion-based trajectory generation to enable zero-shot planning under unknown dynamics.
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation cs.CV · 2026-04-20 · unverdicted · none · ref 2 · internal anchor
By requiring and using highly discriminative LLM text features, the work enables the first effective one-step text-conditioned image generation with MeanFlow.
MetaEarth3D: Unlocking World-scale 3D Generation with Spatially Scalable Generative Modeling cs.CV · 2026-04-19 · unverdicted · none · ref 9 · internal anchor
MetaEarth3D is the first generative foundation model for spatially consistent, unbounded 3D scene generation at planetary scale using optical Earth observation data.
Repurposing 3D Generative Model for Autoregressive Layout Generation cs.CV · 2026-04-17 · unverdicted · none · ref 71 · internal anchor
LaviGen turns 3D generative models into an autoregressive layout generator that models geometric and physical constraints, delivering 19% higher physical plausibility and 65% faster inference on the LayoutVLM benchmark.
Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing cs.LG · 2026-04-17 · unverdicted · none · ref 42 · internal anchor
RF-CMG synthesizes high-quality mmWave and RFID signals from WiFi using a diffusion model with Modality-Guided Embedding for high-frequency details and Low-Frequency Modality Consistency to preserve physical structure.
CLIMB: Controllable Longitudinal Brain Image Generation using Mamba-based Latent Diffusion Model and Gaussian-aligned Autoencoder cs.CV · 2026-04-17 · unverdicted · none · ref 31 · internal anchor
CLIMB generates controllable longitudinal brain MRI images from baseline scans using a Mamba-based latent diffusion model and Gaussian-aligned autoencoder, reporting SSIM 0.9433 on the ADNI dataset of 6306 scans.
