super hub Mixed citations

Denoising Diffusion Implicit Models

Chenlin Meng, Jiaming Song · 2020 · cs.LG · arXiv 2010.02502

Mixed citation behavior. Most common role is background (67%).

478 Pith papers citing it

Background 67% of classified citations

open full Pith review browse 478 citing papers more from Chenlin Meng arXiv PDF

abstract

Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process. We construct a class of non-Markovian diffusion processes that lead to the same training objective, but whose reverse process can be much faster to sample from. We empirically demonstrate that DDIMs can produce high quality samples $10 \times$ to $50 \times$ faster in terms of wall-clock time compared to DDPMs, allow us to trade off computation for sample quality, and can perform semantically meaningful image interpolation directly in the latent space.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 58 method 23 baseline 2

citation-polarity summary

background 56 use method 23 baseline 2 support 1 unclear 1

claims ledger

abstract Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process. We construct a class of non-Markovian diffusion processes that lead to the same training objective, but whose revers

authors

and Stefano Ermon Chenlin Meng Jiaming Song

co-cited works

representative citing papers

ActivityForensics: A Comprehensive Benchmark for Localizing Manipulated Activity in Videos

cs.CV · 2026-04-04 · unverdicted · novelty 8.0

ActivityForensics is the first large-scale benchmark for temporally localizing activity-level forgeries in videos, paired with a diffusion-based baseline called TADiff.

Flow-GRPO: Training Flow Matching Models via Online RL

cs.CV · 2025-05-08 · unverdicted · novelty 8.0

Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.

Consistency Models

cs.LG · 2023-03-02 · conditional · novelty 8.0

Consistency models achieve fast one-step generation with SOTA FID of 3.55 on CIFAR-10 and 6.20 on ImageNet 64x64 by directly mapping noise to data, outperforming prior distillation techniques.

MUSE: Unlocking Timestep as Native Task Steering for One-Step Dense Prediction

cs.CV · 2026-06-29 · unverdicted · novelty 7.0

MUSE shows that the native timestep embedding in diffusion models acts as a parameter-free steering signal for multi-task monocular depth and normal estimation via manifold decoupling in latent space.

ASTAD: Asymmetric Style Transfer for Synthetic-to-Real Adaptation in Autonomous Driving

cs.CV · 2026-06-28 · unverdicted · novelty 7.0

Introduces the ASTAD task and training-free ASTModel framework for semantically consistent asymmetric style transfer using labeled synthetic content and unlabeled real references.

Splatshot: 3D Face Avatar Generation from a Single Unconstrained Photo

cs.CV · 2026-05-31 · unverdicted · novelty 7.0

SplatShot is a training-free method that inserts per-step 3DGS refitting and photometric feedback into diffusion denoising to enforce multi-view consistency for single-photo 3D face avatars.

Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation

cs.CV · 2026-05-31 · unverdicted · novelty 7.0

DRDD decouples diffusion into independent noise and residual stages to preserve domain harmonization and enable unified data-efficient I2I translation.

Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance

cs.RO · 2026-05-28 · unverdicted · novelty 7.0

CGPO integrates training-free critic guidance into diffusion denoising to produce high-Q actions as regression targets, yielding SOTA results on MuJoCo locomotion and successful Franka arm grasping.

Midpoint Generative Models

cs.LG · 2026-05-28 · unverdicted · novelty 7.0

Midpoint Generative Models define a midpoint divergence from flow matching symmetry and derive its variational form as a tractable objective for training competitive one-step generators.

Spectral Guidance for Flexible and Efficient Control of Diffusion Models

cs.LG · 2026-05-27 · unverdicted · novelty 7.0

Spectral Guidance learns singular functions via self-supervised objective to project guidance signals onto diffusion sampling trajectories, enabling stable control without retraining or backpropagation and improving CIFAR-10 accuracy by 37 points with 4x faster sampling.

Towards Anatomically Plausible Human Image Generation via Synthetic Localized Preferences

cs.CV · 2026-05-25 · unverdicted · novelty 7.0

ASAP generates over 10K synthetic anatomical preference pairs via targeted degradation of high-fidelity images and applies a localized margin-bounded DPO to reduce anatomical errors in text-to-image human generation, supported by the new HAP dataset and HAF-Bench.

DeltaCam: Differential Intrinsic Camera Modeling for Video Generation

cs.CV · 2026-05-24 · unverdicted · novelty 7.0

DeltaCam models relative changes in camera intrinsics via Δ-parameterized neural adaptors in video diffusion models trained on synthetic data to enable controllable generation and real-world transfer.

Loki: Representation over Architecture for Diffusion-Based Portrait Animation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

Loki replaces RGB conditioning stacks with identity-orthogonal parametric face encodings rasterized for diffusion, achieving efficient cross-ID portrait animation without cross-ID training data.

Point Tracking Improves World Action Models

cs.RO · 2026-05-22 · unverdicted · novelty 7.0

JOPAT jointly models pixels, point tracks, and actions in a diffusion transformer and reports gains over pixel-only baselines on long-horizon robot tasks with occlusion and off-screen motion.

DFSAttn: Dynamic Fine-grained Sparse Attention for Efficient Video Generation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

DFSAttn is a training-free framework for dynamic fine-grained sparse attention in video DiTs that achieves up to 2.1x speedup while preserving generation quality via Hilbert reordering, hierarchical scoring, and adaptive caching.

VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation

cs.CV · 2026-05-22 · unverdicted · novelty 7.0

VDE accelerates rectified flow models like Flux by 3.22x with LPIPS of 0.069 via velocity decomposition into parallel/orthogonal components plus periodic full-pass anchoring.

Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

Linear-DPO replaces sigmoid utility with linear utility and adds EMA reference to improve preference alignment in diffusion and flow-matching text-to-image models.

DrawMotion: Generating 3D Human Motions by Freehand Drawing

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

DrawMotion is a diffusion-based framework that fuses text and hand-drawn stickman conditions via a Multi-Condition Module and training-free guidance to generate 3D human motions.

CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

CAdam reinterprets densification in generative 3DGS as signal verification via gradient-moment interference, quantile context, and SNR gating to achieve large reductions in primitive count with comparable quality.

DISC: Decoupling Instruction from State-Conditioned Control via Policy Generation

cs.RO · 2026-05-20 · unverdicted · novelty 7.0

A hypernetwork generates complete task-specific visuomotor policy parameters from instructions alone to structurally eliminate observation leakage in language-conditioned robotic control.

FlowErase-RL: Rethinking Concept Erasure as Reward Optimization in Flow Matching Models

cs.CV · 2026-05-19 · unverdicted · novelty 7.0

FlowErase-RL applies GRPO to reformulate concept erasure in flow matching models as reward optimization using a dynamic dual-path mechanism for target suppression and non-target preservation.

BrepForge: Factorized B-rep Synthesis via Wireframe Composition and Boundary-Conditioned Surface Instantiation

cs.GR · 2026-05-19 · unverdicted · novelty 7.0

BrepForge factorizes B-rep synthesis into face-aware autoregressive wireframe composition followed by boundary-conditioned surface instantiation using learning-free geometric priors.

Inference-Time Scaling in Diffusion Models through Iterative Partial Refinement

cs.LG · 2026-05-19 · unverdicted · novelty 7.0

IPR improves valid solution rates on MNIST Sudoku from 55.8% to 75.0% by iteratively refining partial regions in sequential diffusion models without external verifiers or reward models.

PolycubeNet: A Dual-latent Diffusion Model for Polycube-Based Hexahedral Mesh Generation

cs.GR · 2026-05-19 · unverdicted · novelty 7.0

PolycubeNet applies a dual-latent diffusion architecture to generate polycube point clouds from input point clouds, enabling robust hexahedral mesh creation without surface segmentation or templates.

citing papers explorer

Showing 50 of 478 citing papers.

Adversarial Diffusion Across Modalities: A Fusion Survey of Attacks, Defenses, and Evaluation for Text, Vision, and Vision-Language Models cs.CR · 2026-06-25 · unverdicted · none · ref 61 · internal anchor
A narrative survey that catalogs fifty papers on diffusion-based adversarial techniques across text, vision, and vision-language models, proposes a six-class taxonomy of diffusion roles plus a unified five-dimension evaluation framework, and releases a companion catalog.
pop-cosmos: Disentangling galaxy properties from observables using data-driven approaches astro-ph.GA · 2026-06-09 · unverdicted · none · ref 126 · internal anchor
A beta-VAE analysis of pop-cosmos models finds that five latent dimensions capture the rest-frame optical SED, corresponding to stellar mass, recent star formation, dust, and two gas ionization states.
dMoE: dLLMs with Learnable Block Experts cs.CL · 2026-05-29 · unverdicted · none · ref 46 · internal anchor
dMoE aggregates token expert distributions to block level in dLLMs, cutting unique experts from 69.5 to 14.6, memory by 76-80%, and latency by 1.14-1.66x while retaining 99.11% performance.
Efficient Diffusion LLMs via Temporal-Spatial Parallel Decoding and Confidence Extrapolation cs.CL · 2026-05-29 · unverdicted · none · ref 21 · internal anchor
Introduces TSPD with a trajectory-feature controller and training-free CE to reduce denoising steps in dLLMs while aiming to preserve quality.
Audio Pirates: Black-box Audio Watermark Removal via Diffusion Priors cs.CR · 2026-05-28 · unverdicted · none · ref 16 · internal anchor
DiffErase removes black-box audio watermarks via diffusion priors by adding intermediate noise and regenerating with a pretrained model, preserving quality across audio domains.
Colored Noise Diffusion Sampling cs.CV · 2026-05-28 · unverdicted · none · ref 56 · internal anchor
CNS is a plug-and-play stochastic sampler for diffusion models that uses timestep- and frequency-dependent colored noise to allocate energy to unresolved bands, producing lower FID scores than standard ODE/SDE baselines on ImageNet-256.
IP-Adapter Is All You Need: Towards Fine-Tuning-Free Diffusion-Based Talking Face Generation cs.CV · 2026-05-28 · unverdicted · none · ref 36 · internal anchor
A fine-tuning-free framework combines pretrained Stable Diffusion with IP-Adapter plus three parameter-free modules to achieve improved lip synchronization and visual quality in talking face generation.
Continual Learning in Modern Hopfield Networks with an Application to Diffusion Models cs.LG · 2026-05-27 · unverdicted · none · ref 2 · internal anchor
Modern Hopfield energy identifies high-energy samples as more prone to intrinsic forgetting in continual learning, with effective energy-based replay validated in diffusion models.
Frequency-Guided Action Diffusion via Sub-Frequency Manifold Traversal cs.RO · 2026-05-27 · unverdicted · none · ref 6 · internal anchor
FGO guides diffusion policy generation via expanding spectral bands on sub-frequency manifolds to improve action smoothness on 15 robotic manipulation tasks.
AI-T2I: Aggregating-and-Isolating Cross-Attention to Diffusion Models for Text-to-Image Synthesis cs.CV · 2026-05-25 · unverdicted · none · ref 1 · internal anchor
AI-T2I improves text-to-image alignment in diffusion models by using aggregation and isolation losses on cross-attention maps to fix scattering and overlap issues.
Test-Time Self-Adaptive Conditioning for Stable Audio-Driven Talking-Head Generation cs.CV · 2026-05-25 · unverdicted · none · ref 37 · internal anchor
TT-SAC is a parameter-free inference framework that uses a generator-encoder feedback loop to adapt conditioning representations and stabilize identity and motion in audio-driven talking-head videos.
D3S2: Diffusion-Guided Dataset Distillation for Semantic Segmentation cs.CV · 2026-05-24 · unverdicted · none · ref 32 · internal anchor
D3S2 combines class-balanced mask selection with diffusion-guided image synthesis and two consistency losses to distill 1% datasets that yield 24.99% mIoU on ADE20K and 35.49% on COCO-Stuff, beating random selection.
Dual Prototype-Conditioned Diffusion Model for Scalable Multi-Class Unsupervised Anomaly Detection in Large Category Spaces cs.CV · 2026-05-23 · unverdicted · none · ref 46 · internal anchor
DPDiff-AD conditions a diffusion model on local prototypes (via nearest aggregation) and global prototypes (via optimal transport) to model normality scalably in multi-class anomaly detection, reporting AUROC gains on 160-category data.
Score-Based One-step MeanFlow Policy Optimization cs.LG · 2026-05-22 · unverdicted · none · ref 21 · internal anchor
SOM is an actor-critic algorithm that constructs the target velocity field for one-step MeanFlow policies directly from the Q-function via score estimation and probability flow ODE, achieving claimed SOTA on locomotion tasks with reduced training and inference time.
Discontinuous Galerkin Neural Operator for Pathology Defocus Deblurring eess.IV · 2026-05-22 · unverdicted · none · ref 48 · internal anchor
DGNO parameterizes integral kernels with discontinuous Galerkin elements for heterogeneous defocus deblurring in pathology images and reports superior performance over prior methods.
Drift-React: One-step Generation of Reaction Pathways via SE(3) Drifting Fields physics.chem-ph · 2026-05-21 · unverdicted · none · ref 32 · internal anchor
Drift-React produces full minimum energy pathways for reactions in a single step via SE(3) drifting fields, matching TS accuracy of iterative models with orders-of-magnitude speedup on Transition1x and Halo8 datasets.
The Value of Covariance Matching in Gaussian DDPMs and the Lanczos Sampler cs.LG · 2026-05-21 · unverdicted · none · ref 10 · internal anchor
Full covariance matching in Gaussian DDPMs yields O(1/T^2) path KL error and is enabled by the training-free Lanczos Gaussian sampler using Jacobian-vector products.
Broken Memories: Detecting and Mitigating Memorization in Diffusion Models with Degraded Generations cs.CV · 2026-05-21 · unverdicted · none · ref 34 · 2 links · internal anchor
Authors link memorization to internal instability in diffusion models via latent norms, propose step-wise detection and mitigation achieving AUC >0.999 and 0% memorization rate on Stable Diffusion 1.4.
StreamGVE: Training-Free Video Editing via Few-Step Streaming Video Generation cs.CV · 2026-05-20 · unverdicted · none · ref 68 · internal anchor
StreamGVE enables high-quality training-free video editing by converting the task to noise-to-data streaming generation with dual-branch fast sampling, self-attention bridges, cross-attention grounding, source-oriented guidance, and visual prompting.
HITL-D: Human In The Loop Diffusion Assisted Shared Control cs.RO · 2026-05-20 · unverdicted · none · ref 40 · internal anchor
HITL-D combines diffusion policies with human input for shared robotic control, reducing required joystick axes and improving speed and workload in manipulation tasks per a 12-participant study.
Learning to Think in Physics: Breaking Shortcut Learning in Scientific Diffusion via Representation Alignment cs.LG · 2026-05-20 · unverdicted · none · ref 18 · internal anchor
REPA-P aligns intermediate representations in diffusion models with physical states using first-principles PDE residuals to accelerate convergence and boost out-of-distribution robustness on PDE tasks.
AttriStory: Fine-grained Attribute Realization for Visual Storytelling with Diffusion Models cs.CV · 2026-05-20 · unverdicted · none · ref 31 · internal anchor
AttriStory adds a benchmark and AttriLoss-based latent optimization to improve faithful rendering of fine-grained attributes such as clothing color and texture in diffusion-model visual storytelling.
Rethinking Cross-Layer Information Routing in Diffusion Transformers cs.CV · 2026-05-20 · unverdicted · none · ref 48 · 2 links · internal anchor
DAR replaces residual addition in DiTs with learnable, timestep-adaptive aggregation of sublayer outputs, yielding 2.11 FID improvement on SiT-XL/2 and 8.75x faster convergence on ImageNet 256x256.
Pareto-Enhanced Portrait Generation: Vision-Aligned Text Supervision for Alignment, Realism, and Aesthetics cs.CV · 2026-05-20 · unverdicted · none · ref 24 · internal anchor
A feature supervision approach using SigLIP 2 extracts multi-granularity vision-aligned text representations to supervise MM-DiT image branches, pushing the Pareto frontier for portrait generation across alignment, realism, and aesthetics.
Beyond Imitation: Learning Safe End-to-End Autonomous Driving from Hard Negatives cs.RO · 2026-05-19 · unverdicted · none · ref 37 · internal anchor
BeyondDrive augments imitation learning with synthesized safety-critical negative trajectories and a repulsive loss to improve safety in autonomous driving, reporting 89.7 PDMS on NAVSIMv1 and generalization to other models.
Guiding Neuro-Symbolic Scenario Generation with Spatio-Temporal Logic cs.RO · 2026-05-18 · unverdicted · none · ref 20 · internal anchor
STRELGen combines a multi-agent diffusion model with differentiable STREL specifications to optimize latent space for generating plausible yet safety-critical driving scenarios.
Vision Foundation Models as Generalist Tokenizers for Image Generation cs.CV · 2026-05-18 · unverdicted · none · ref 69 · internal anchor
VFMTok builds a generalist image tokenizer on frozen VFMs using adaptive quantization and semantic alignment, delivering gFID 1.36 for autoregressive and 1.25 for continuous generation on ImageNet with 3x faster convergence.
Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos cs.CV · 2026-05-18 · unverdicted · none · ref 22 · internal anchor
MIGA introduces two-stage alignment to close train-inference gaps and dual consistency enhancement via self-reflection and long-range guidance to achieve SOTA temporal consistency in infinite-frame video generation on VBench and NarrLV.
Learning to Balance: Decoupled Siamese Diffusion Transformer for Reference-Based Remote Sensing Image Super-Resolution cs.CV · 2026-05-18 · unverdicted · none · ref 29 · 2 links · internal anchor
DS-DiT decouples LR and Ref conditions in a Siamese diffusion transformer, adds patch-level weighting, and uses autoguidance to improve reference-based super-resolution for remote sensing images.
DCFold: Efficient Protein Structure Generation with Single Forward Pass cs.LG · 2026-05-18 · unverdicted · none · ref 12 · internal anchor
DCFold achieves AlphaFold3-level protein structure prediction accuracy in a single forward pass using Dual Consistency training and a Temporal Geodesic Matching scheduler, delivering 15x inference acceleration.
RDDM: A Residual-Driven Drifting Model for High-Fidelity Low-Dose CT Denoising eess.IV · 2026-05-16 · unverdicted · none · ref 26 · internal anchor
RDDM introduces a residual drifting field with attractive and repulsive forces to achieve one-step supervised denoising of low-dose CT, reporting superior PSNR, SSIM, FID of 5.87, and 15 ms inference time.
Beyond Point-Wise Matching: Structural Representation Alignment for Accelerating Diffusion Transformers cs.CV · 2026-05-16 · unverdicted · none · ref 36 · internal anchor
sREPA enforces structural consistency in relational geometry of pre-trained vision features to accelerate DiT training and improve generation quality.
DiRotQ: Rotation-Aware Quantization for 4-bit Diffusion Transformers cs.CV · 2026-05-16 · unverdicted · none · ref 64 · internal anchor
DiRotQ uses PCA-based rotation-aware activation quantization combined with GPTQ to achieve better FID and PSNR in 4-bit diffusion transformers than prior methods like SVDQuant.
Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization cs.CV · 2026-05-15 · unverdicted · none · ref 22 · 2 links · internal anchor
Flash-GRPO is a one-step GRPO framework for video diffusion alignment that applies iso-temporal grouping and temporal gradient rectification to achieve higher alignment quality and stability than full-trajectory training under low compute budgets on 1.3B-14B models.
AdaEraser: Training-Free Object Removal via Adaptive Attention Suppression cs.CV · 2026-05-15 · unverdicted · none · ref 16 · internal anchor
AdaEraser introduces token-wise adaptive attention suppression in diffusion denoising to enable high-quality training-free object removal by modulating suppression according to evolving self-attention maps.
FLASH: Efficient Visuomotor Policy via Sparse Sampling cs.RO · 2026-05-15 · unverdicted · none · ref 5 · internal anchor
FLASH Policy uses sparse Legendre polynomial trajectory fitting and history-anchored flow matching to enable single-step inference for visuomotor control, reporting 31.4 ms per-episode latency and >=92% success on five simulated plus two real manipulation tasks.
Towards Continuous Sign Language Conversation from Isolated Signs cs.CV · 2026-05-14 · unverdicted · none · ref 72 · internal anchor
Constructs continuous sign conversation data from isolated signs using retrieval and diffusion models to train a direct sign-to-sign conversational AI.
TOPOS: High-Fidelity and Efficient Industry-Grade 3D Head Generation cs.CV · 2026-05-14 · unverdicted · none · ref 2 · internal anchor
TOPOS creates high-fidelity 3D heads with fixed industry topology from single images via a specialized VAE with Perceiver Resampler and a rectified flow transformer.
Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity cs.CV · 2026-05-14 · unverdicted · none · ref 44 · internal anchor
Head Forcing assigns tailored KV cache strategies to local, anchor, and memory attention heads plus head-wise RoPE re-encoding to extend autoregressive video generation from seconds to minutes without training.
Reduce the Artifacts Bias for More Generalizable AI-Generated Image Detection cs.CV · 2026-05-14 · conditional · none · ref 51 · internal anchor
SEF introduces GAN upsampling for diverse artifacts and expert fusion to reduce domain interference, yielding stronger generalization on 13 benchmarks for AI-generated image detection.
IG-Diff: Complex Night Scene Restoration with Illumination-Guided Diffusion Model cs.CV · 2026-05-14 · unverdicted · none · ref 29 · internal anchor
IG-Diff adds an illumination-guided module to a diffusion model and supplies new paired datasets to restore images degraded by simultaneous low light and other factors while preserving texture.
EgoForce: Robust Online Egocentric Motion Reconstruction via Diffusion Forcing cs.CV · 2026-05-13 · unverdicted · none · ref 38 · internal anchor
EgoForce reconstructs long-horizon full-body motion online from sparse noisy egocentric views by incrementally denoising with a temporally asymmetric diffusion schedule and noise-robust imputation.
Discrete Stochastic Localization for Non-autoregressive Generation cs.LG · 2026-05-13 · unverdicted · none · ref 26 · internal anchor
DSL provides a continuous embedding framework where one denoiser supports a family of SNR paths for discrete sequences, improving MAUVE scores on OpenWebText and allowing random-order and hybrid sampling from a fine-tuned MDLM checkpoint.
L2P: Unlocking Latent Potential for Pixel Generation cs.CV · 2026-05-12 · unverdicted · none · ref 19 · internal anchor
L2P repurposes pre-trained LDMs for direct pixel generation via large-patch tokenization and shallow-layer training on synthetic data, matching source performance with 8-GPU training and enabling native 4K output.
EPIC: Efficient Predicate-Guided Inference-Time Control for Compositional Text-to-Image Generation cs.CV · 2026-05-12 · unverdicted · none · ref 12 · internal anchor
EPIC introduces predicate-guided inference-time search that lifts compositional T2I prompt accuracy from 34% to 71% on GenEval2 with 31-81% lower execution costs.
Joint probabilistic inference of galaxy redshifts and rest-frame spectra from photometric fluxes with latent diffusion astro-ph.GA · 2026-05-11 · unverdicted · none · ref 23 · internal anchor
A generative latent diffusion framework jointly infers photometric-redshift PDFs and reconstructs rest-frame spectra from photometric data after pre-training a spectral autoencoder on millions of spectra.
GenMed: A Pairwise Generative Reformulation of Medical Diagnostic Tasks cs.CV · 2026-05-11 · unverdicted · none · ref 81 · internal anchor
GenMed uses diffusion models to capture P(X,Y) for medical tasks and performs inference via gradient-based test-time optimization, supporting arbitrary observation combinations without retraining.
Fashion130K: An E-commerce Fashion Dataset for Outfit Generation with Unified Multi-modal Condition cs.CV · 2026-05-11 · unverdicted · none · ref 44 · 2 links · internal anchor
Fashion130K dataset and UMC framework align text and visual prompts to generate more consistent fashion outfits than prior state-of-the-art methods.
The two clocks and the innovation window: When and how generative models learn rules cs.LG · 2026-05-11 · unverdicted · none · ref 69 · internal anchor
Generative models learn rules before memorizing data, creating an innovation window whose width depends on dataset size and rule complexity, observed in both diffusion and autoregressive architectures.
"Training robust watermarking model may hurt authentication!'' Exploring and Mitigating the Identity Leakage in Robust Watermarking cs.CR · 2026-05-10 · unverdicted · none · ref 70 · internal anchor
W-IR is the first watermarking framework to combine certified robustness via randomized smoothing in pixel and coordinate spaces with identity leakage mitigation via residual information loss minimization.

Denoising Diffusion Implicit Models

hub tools

citation-role summary

citation-polarity summary

claims ledger

authors

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer