super hub Canonical reference

Score-Based Generative Modeling through Stochastic Differential Equations

Abhishek Kumar, Diederik P Kingma, Jascha Sohl-Dickstein, Stefano Ermon, Yang Song · 2020 · cs.LG · arXiv 2011.13456

Canonical reference. 76% of citing Pith papers cite this work as background.

387 Pith papers citing it

Background 76% of classified citations

open full Pith review browse 387 citing papers more from Abhishek Kumar arXiv PDF

abstract

Creating noise from data is easy; creating data from noise is generative modeling. We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise. Crucially, the reverse-time SDE depends only on the time-dependent gradient field (\aka, score) of the perturbed data distribution. By leveraging advances in score-based generative modeling, we can accurately estimate these scores with neural networks, and use numerical SDE solvers to generate samples. We show that this framework encapsulates previous approaches in score-based generative modeling and diffusion probabilistic modeling, allowing for new sampling procedures and new modeling capabilities. In particular, we introduce a predictor-corrector framework to correct errors in the evolution of the discretized reverse-time SDE. We also derive an equivalent neural ODE that samples from the same distribution as the SDE, but additionally enables exact likelihood computation, and improved sampling efficiency. In addition, we provide a new way to solve inverse problems with score-based models, as demonstrated with experiments on class-conditional generation, image inpainting, and colorization. Combined with multiple architectural improvements, we achieve record-breaking performance for unconditional image generation on CIFAR-10 with an Inception score of 9.89 and FID of 2.20, a competitive likelihood of 2.99 bits/dim, and demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 71 method 17 baseline 3 other 1

citation-polarity summary

background 70 use method 16 baseline 3 unclear 2 support 1

claims ledger

abstract Creating noise from data is easy; creating data from noise is generative modeling. We present a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise. Crucially, the reverse-time SDE depends only on the time-dependent gradient field (\aka, score) of the perturbed data distribution. By leveraging advances in score-based generative modeling, we can accurately estimate

authors

Ab- hishek Kumar and Ben Poole Diederik P Kingma Jascha Sohl-Dickstein Stefano Ermon Yang Song

co-cited works

representative citing papers

Generative Modeling with Flux Matching

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Flux Matching generalizes score-based generative modeling by using a weaker objective that admits infinitely many non-conservative vector fields with the data as stationary distribution, enabling new design choices beyond traditional score matching.

A-CODE: Fully Atomic Protein Co-Design with Unified Multimodal Diffusion

q-bio.QM · 2026-05-05 · unverdicted · novelty 8.0

A-CODE presents a fully atomic one-stage multimodal diffusion model for protein co-design that claims superior unconditional generation performance over prior one- and two-stage models plus a tenfold success-rate gain on hard binder-design tasks.

Quotient-Space Diffusion Models

cs.LG · 2026-04-23 · unverdicted · novelty 8.0

Quotient-space diffusion models generate correct symmetric distributions by removing redundancy on the quotient space, simplifying learning and improving results on small molecules and proteins under SE(3) symmetry.

The Feedback Hamiltonian is the Score Function: A Diffusion-Model Framework for Quantum Trajectory Reversal

quant-ph · 2026-04-23 · unverdicted · novelty 8.0

The García-Pintos feedback Hamiltonian equals the score function of the quantum trajectory distribution, linking quantum feedback to diffusion-model reversal.

Query Lower Bounds for Diffusion Sampling

cs.LG · 2026-04-12 · unverdicted · novelty 8.0

Diffusion sampling from d-dimensional distributions requires at least ~sqrt(d) adaptive score queries when score estimates have polynomial accuracy.

OP-GRPO: Efficient Off-Policy GRPO for Flow-Matching Models

cs.CV · 2026-04-05 · unverdicted · novelty 8.0

OP-GRPO is the first off-policy GRPO method for flow-matching models that reuses trajectories via replay buffer and importance sampling corrections, matching on-policy performance with 34.2% of the training steps.

Generative models on phase space

hep-ph · 2026-04-02 · unverdicted · novelty 8.0

Generative diffusion and flow models are constructed to remain exactly on the Lorentz-invariant massless N-particle phase space manifold during sampling for particle physics applications.

A Priori Sampling of Transition States with Guided Diffusion

physics.chem-ph · 2026-03-26 · conditional · novelty 8.0

ASTRA reframes transition-state search as guided diffusion inference that samples the isodensity surface between metastable basins and converges to first-order saddles via score differences and physical forces.

Mean-Field Path-Integral Diffusion: From Samples to Interacting Agents

math.OC · 2026-02-23 · unverdicted · novelty 8.0

MF-PID turns independent diffusion samples into mean-field interacting agents, proving that quadratic interactions yield exact linear mean interpolation and delivering 19-24% energy savings in demand-response control.

Variational Optimality of F\"ollmer Processes in Generative Diffusions

math.ST · 2026-02-11 · unverdicted · novelty 8.0

Föllmer processes are variationally optimal among generative diffusions because they minimize the impact of drift estimation error on path-space KL divergence, rendering different interpolation schedules statistically equivalent.

Flow-GRPO: Training Flow Matching Models via Online RL

cs.CV · 2025-05-08 · unverdicted · novelty 8.0

Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

Denoising Diffusion Implicit Models

cs.LG · 2020-10-06 · unverdicted · novelty 8.0

DDIMs construct non-Markovian diffusion processes that share DDPM training objectives but allow much faster reverse sampling, demonstrated empirically at 10-50x wall-clock speedup.

Quasi-Monte Carlo for SDE Simulation: Error Analysis and Dimensionality Reduction

math.NA · 2026-06-23 · unverdicted · novelty 7.0

QMC applied to Euler-Maruyama yields faster sampling-error decay than Monte Carlo, and the new MSTG method based on exact simulation achieves super-exponential truncation-error decay that sharply reduces integration dimension.

Text Dictates, Music Decorates: Energy-based Attention for Editable Dance Motion Generation

cs.AI · 2026-06-22 · unverdicted · novelty 7.0

STREAM decouples text and music conditioning in a diffusion transformer via AdaLN for structure and BEAM for beats, plus new Motorica++ dataset and editability metrics, claiming SOTA music alignment with preserved semantics.

Fast Computation of Free-Support Wasserstein Medians

stat.CO · 2026-06-17 · unverdicted · novelty 7.0

Direct fixed-weight solver for free-support Wasserstein medians relocates atoms using OT barycentric projections and inverse-distance weights, achieving monotone descent on smoothed objectives with fewer subproblems than nested Weiszfeld baselines.

Chameleon: Style-Content Disentangled Framework for Cross-Domain Object Compositing

cs.CV · 2026-05-31 · unverdicted · novelty 7.0

Chameleon proposes the first large-scale cross-domain compositing dataset and a disentangled encoder plus gated diffusion transformer that outperforms prior in-domain and cross-domain methods on plausibility and fidelity.

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

cs.CV · 2026-05-28 · unverdicted · novelty 7.0

YoCausal benchmark shows video diffusion models detect the arrow of time but lack genuine causal understanding relative to humans.

Sample-Efficient Diffusion-based Reinforcement Learning with Critic Guidance

cs.RO · 2026-05-28 · unverdicted · novelty 7.0

CGPO integrates training-free critic guidance into diffusion denoising to produce high-Q actions as regression targets, yielding SOTA results on MuJoCo locomotion and successful Franka arm grasping.

Generative Modeling by Value-Driven Transport

cs.LG · 2026-05-21 · unverdicted · novelty 7.0

A control-theoretic linear program yields value-driven transport policies for generative modeling with straight paths and simulation-free training.

Let EEG Models Learn EEG

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

JET is a conditional flow matching framework that generates EEG as continuous raw sequences with added constraints for spectral and temporal properties, achieving over 40% lower TS-FID than prior discrete denoising methods on three benchmarks.

Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

Linear-DPO replaces sigmoid utility with linear utility and adds EMA reference to improve preference alignment in diffusion and flow-matching text-to-image models.

CAdam: Context-Adaptive Moment Estimation for 3D Gaussian Densification in Generative Distillation

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

CAdam reinterprets densification in generative 3DGS as signal verification via gradient-moment interference, quantile context, and SNR gating to achieve large reductions in primitive count with comparable quality.

Mat\'ern Noise for Triangulation-Agnostic Flow Matching on Meshes

cs.GR · 2026-05-19 · unverdicted · novelty 7.0

Proposes discretized Matérn process noise for triangulation-agnostic flow matching on meshes with PoissonNet denoiser, tested on elastic states and humanoid poses for meshes exceeding one million triangles.

citing papers explorer

Showing 50 of 143 citing papers after filters.

OP-GRPO: Efficient Off-Policy GRPO for Flow-Matching Models cs.CV · 2026-04-05 · unverdicted · none · ref 33 · internal anchor
OP-GRPO is the first off-policy GRPO method for flow-matching models that reuses trajectories via replay buffer and importance sampling corrections, matching on-policy performance with 34.2% of the training steps.
Flow-GRPO: Training Flow Matching Models via Online RL cs.CV · 2025-05-08 · unverdicted · none · ref 23 · internal anchor
Flow-GRPO is the first online RL method for flow matching models, raising GenEval accuracy from 63% to 95% and text-rendering accuracy from 59% to 92% with little reward hacking.
Chameleon: Style-Content Disentangled Framework for Cross-Domain Object Compositing cs.CV · 2026-05-31 · unverdicted · none · ref 6 · internal anchor
Chameleon proposes the first large-scale cross-domain compositing dataset and a disentangled encoder plus gated diffusion transformer that outperforms prior in-domain and cross-domain methods on plausibility and fidelity.
YoCausal: How Far is Video Generation from World Model? A Causality Perspective cs.CV · 2026-05-28 · unverdicted · none · ref 102 · internal anchor
YoCausal benchmark shows video diffusion models detect the arrow of time but lack genuine causal understanding relative to humans.
Let EEG Models Learn EEG cs.CV · 2026-05-20 · unverdicted · none · ref 82 · internal anchor
JET is a conditional flow matching framework that generates EEG as continuous raw sequences with added constraints for spectral and temporal properties, achieving over 40% lower TS-FID than prior discrete denoising methods on three benchmarks.
Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models cs.CV · 2026-05-20 · unverdicted · none · ref 13 · internal anchor
Linear-DPO replaces sigmoid utility with linear utility and adds EMA reference to improve preference alignment in diffusion and flow-matching text-to-image models.
Functionalization via Structure Completion and Motion Rectification cs.CV · 2026-05-18 · unverdicted · none · ref 245 · internal anchor
Object functionalization is cast as neural graph completion over a functional graph of parts, contacts, and motions, followed by geometry realization that also rectifies erroneous motions, demonstrated on furniture with a new paired dataset.
Towards Generalized Image Manipulation Localization via Score-based Model cs.CV · 2026-05-16 · conditional · none · ref 31 · internal anchor
DiffIML applies score-based generative modeling to image manipulation localization, recovering coherent masks iteratively from noise to improve generalization on unseen manipulation types.
HIR-ALIGN: Enhancing Hyperspectral Image Restoration via Diffusion-Based Data Generation cs.CV · 2026-05-13 · unverdicted · none · ref 32 · internal anchor
HIR-ALIGN augments limited target data for hyperspectral restoration by creating proxy clean images, synthesizing aligned HSIs with blur-robust diffusion and warp-based transfer, then finetuning models to lower target-domain risk.
Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling cs.CV · 2026-05-13 · unverdicted · none · ref 37 · internal anchor
Edit-Compass and EditReward-Compass are new unified benchmarks for fine-grained image editing evaluation and realistic reward modeling in reinforcement learning optimization.
Amortized Guidance for Image Inpainting with Pretrained Diffusion Models cs.CV · 2026-05-13 · unverdicted · none · ref 34 · internal anchor
AID amortizes guidance for diffusion inpainting by training a reusable module via an auxiliary Gaussian formulation and continuous-time actor-critic algorithm, improving quality-speed trade-off with under 1% overhead.
Adaptive Subspace Projection for Generative Personalization cs.CV · 2026-05-08 · unverdicted · none · ref 34 · internal anchor
A training-free adaptive subspace projection method mitigates semantic collapsing in generative personalization by isolating and adjusting drift in a low-dimensional subspace using the stable pre-trained embedding as anchor.
Arena as Offline Reward: Efficient Fine-Grained Preference Optimization for Diffusion Models cs.CV · 2026-05-07 · unverdicted · none · ref 32 · internal anchor
ArenaPO infers Gaussian capability distributions from pairwise preferences and applies truncated-normal latent inference to derive fine-grained offline rewards for preference optimization of text-to-image diffusion models.
DBMSolver: A Training-free Diffusion Bridge Sampler for High-Quality Image-to-Image Translation cs.CV · 2026-05-07 · unverdicted · none · ref 11 · internal anchor
DBMSolver is a new training-free sampler using exponential integrators that reduces NFEs by up to 5x and improves quality in diffusion bridge model-based image-to-image translation tasks.
Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization cs.CV · 2026-04-26 · unverdicted · none · ref 37 · internal anchor
Oracle Noise optimizes diffusion model noise on a Riemannian hypersphere guided by key prompt words to preserve the Gaussian prior, eliminate norm inflation, and achieve faster semantic alignment than Euclidean methods.
$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models cs.CV · 2026-04-26 · unverdicted · none · ref 43 · internal anchor
Z²-Sampling implicitly realizes zero-cost zigzag trajectories for curvature-aware semantic alignment in diffusion models by reducing multi-step paths via operator dualities and temporal caching while synthesizing a directional derivative penalty.
Dream-Cubed: Controllable Generative Modeling in Minecraft by Training on Billions of Cubes cs.CV · 2026-04-22 · unverdicted · none · ref 41 · internal anchor
Dream-Cubed releases a billion-scale voxel dataset and 3D diffusion models that generate controllable Minecraft worlds by operating directly on blocks.
HP-Edit: A Human-Preference Post-Training Framework for Image Editing cs.CV · 2026-04-21 · unverdicted · none · ref 41 · internal anchor
HP-Edit introduces a post-training framework and RealPref-50K dataset that uses a VLM-based HP-Scorer to align diffusion image editing models with human preferences, improving outputs on Qwen-Image-Edit-2509.
Learning to Credit the Right Steps: Objective-aware Process Optimization for Visual Generation cs.CV · 2026-04-21 · unverdicted · none · ref 40 · internal anchor
OTCA improves GRPO training for visual generation by estimating step importance in trajectories and adaptively weighting multiple reward objectives.
Generative Texture Filtering cs.CV · 2026-04-21 · unverdicted · none · ref 81 · internal anchor
A two-stage fine-tuning strategy on pre-trained generative models enables effective texture filtering that outperforms prior methods on challenging cases.
Long-Text-to-Image Generation via Compositional Prompt Decomposition cs.CV · 2026-04-20 · unverdicted · none · ref 22 · internal anchor
PRISM lets pre-trained text-to-image models handle long prompts by breaking them into compositional parts, predicting noise separately, and merging outputs via energy-based conjunction, matching fine-tuned models while generalizing better to prompts over 500 tokens.
Structure-Adaptive Sparse Diffusion in Voxel Space for 3D Medical Image Enhancement cs.CV · 2026-04-20 · unverdicted · none · ref 25 · internal anchor
A sparse voxel-space diffusion method with structure-adaptive modulation achieves up to 10x training speedup and state-of-the-art results for 3D medical image denoising and super-resolution.
Beyond Reconstruction: Reconstruction-to-Vector Diffusion for Hyperspectral Anomaly Detection cs.CV · 2026-04-13 · unverdicted · none · ref 37 · internal anchor
R2VD redefines reconstruction as the origin for residual-guided vector diffusion across PPE, GMP, RSM, and VDI stages to achieve superior anomaly detectability and background suppression on eight datasets.
Novel View Synthesis as Video Completion cs.CV · 2026-04-09 · unverdicted · none · ref 36 · internal anchor
Video diffusion models can be adapted into permutation-invariant generators for sparse novel view synthesis by treating the problem as video completion and removing temporal order cues.
HumANDiff: Articulated Noise Diffusion for Motion-Consistent Human Video Generation cs.CV · 2026-04-07 · unverdicted · none · ref 58 · internal anchor
HumANDiff improves motion consistency in human video generation by sampling diffusion noise on an articulated human body template and adding joint appearance-motion prediction plus a geometric consistency loss.
1.x-Distill: Breaking the Diversity, Quality, and Efficiency Barrier in Distribution Matching Distillation cs.CV · 2026-04-05 · conditional · none · ref 42 · internal anchor
1.x-Distill achieves better quality and diversity than prior few-step distillation methods at 1.67 and 1.74 effective NFEs on SD3 models with up to 33x speedup.
VOSR: A Vision-Only Generative Model for Image Super-Resolution cs.CV · 2026-04-03 · conditional · none · ref 31 · internal anchor
VOSR shows that competitive generative image super-resolution with faithful structures can be achieved by training a diffusion-style model from scratch on visual data alone, using a vision encoder for guidance and a restoration-oriented sampling strategy.
Drift-AR: Single-Step Visual Autoregressive Generation via Anti-Symmetric Drifting cs.CV · 2026-03-30 · unverdicted · none · ref 29 · internal anchor
Drift-AR achieves 3.8-5.5x speedup in AR-diffusion image models by using entropy to enable entropy-informed speculative decoding and single-step (1-NFE) anti-symmetric drifting decoding.
SHARP: Spectrum-aware Highly-dynamic Adaptation for Resolution Promotion in Remote Sensing Synthesis cs.CV · 2026-03-23 · conditional · none · ref 29 · internal anchor
SHARP applies a spectrum-aware dynamic RoPE scaling schedule that promotes resolution more strongly in early denoising stages and relaxes it later, outperforming static baselines on quality metrics for remote sensing images.
FrameDiT: Diffusion Transformer with Matrix Attention for Efficient Video Generation cs.CV · 2026-03-10 · unverdicted · none · ref 46 · internal anchor
FrameDiT proposes Matrix Attention for DiTs to achieve SOTA video generation with improved temporal coherence and efficiency comparable to local factorized attention.
Improving Text-to-Image Generation with Intrinsic Self-Confidence Rewards cs.CV · 2026-03-01 · unverdicted · none · ref 66 · internal anchor
SOLACE improves text-to-image generation by using intrinsic self-confidence rewards from noise reconstruction accuracy during reinforcement learning post-training without external supervision.
MultiAnimate: Pose-Guided Image Animation Made Extensible cs.CV · 2026-02-25 · unverdicted · none · ref 28 · internal anchor
MultiAnimate adds Identifier Assigner and Identifier Adapter modules to diffusion video models so they can handle multiple characters without identity mix-ups, generalizing from two-character training data to more characters.
Exploring Cross-Modal Flows for Few-Shot Learning cs.CV · 2025-10-16 · unverdicted · none · ref 22 · internal anchor
FMA introduces flow matching for multi-step cross-modal feature alignment in few-shot learning, using fixed coupling, noise augmentation, and early-stopping to outperform one-step PEFT methods.
Single-Step Reconstruction-Free Anomaly Detection and Segmentation via Diffusion Models cs.CV · 2025-08-06 · conditional · none · ref 26 · internal anchor
RADAR produces anomaly maps directly from attention-based diffusion models in a single forward pass, achieving higher F1 scores than reconstruction-based diffusion and statistical baselines on MVTec-AD and 3D-printed material data.
UniEdit-Flow: Unleashing Inversion and Editing in the Era of Flow Models cs.CV · 2025-04-17 · unverdicted · none · ref 60 · internal anchor
UniEdit-Flow presents tuning-free Uni-Inv and Uni-Edit methods for inversion and editing in flow models that achieve accurate reconstruction and robust region-preserving edits across generative models.
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference cs.CV · 2023-10-06 · unverdicted · none · ref 85 · internal anchor
Latent Consistency Models enable high-fidelity text-to-image generation in 2-4 steps by directly predicting solutions to the probability flow ODE in latent space, distilled from pre-trained LDMs.
Human Motion Diffusion Model cs.CV · 2022-09-29 · unverdicted · none · ref 19 · internal anchor
MDM is a classifier-free diffusion model that generates expressive human motions by predicting clean samples rather than noise, supporting text and action conditioning and outperforming prior methods on standard benchmarks.
High-Resolution Image Synthesis with Latent Diffusion Models cs.CV · 2021-12-20 · conditional · none · ref 85 · internal anchor
Latent diffusion models achieve state-of-the-art inpainting and competitive results on unconditional generation, scene synthesis, and super-resolution by performing the diffusion process in the latent space of pretrained autoencoders with cross-attention conditioning, while cutting computational and
ExpoCM: Exposure-Aware One-Step Generative Single-Image HDR Reconstruction cs.CV · 2026-05-04 · unverdicted · none · ref 46
ExpoCM enables fast one-step single-image HDR reconstruction via exposure-dependent perturbations and region-conditioned consistency trajectories derived from a probability flow ODE.
Few-step Generative Models as Lossy Compression cs.CV · 2026-06-09 · unverdicted · none · ref 14 · internal anchor
Few-step generative models can be reformulated as lossy codecs in the reverse channel coding framework without retraining, yielding faster encoding/decoding on low-resolution image benchmarks.
Colored Noise Diffusion Sampling cs.CV · 2026-05-28 · unverdicted · none · ref 58 · internal anchor
CNS is a plug-and-play stochastic sampler for diffusion models that uses timestep- and frequency-dependent colored noise to allocate energy to unresolved bands, producing lower FID scores than standard ODE/SDE baselines on ImageNet-256.
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models cs.CV · 2026-05-28 · unverdicted · none · ref 32 · internal anchor
minWM supplies an end-to-end pipeline that fine-tunes bidirectional T2V/TI2V models with camera control then distills them via Causal Forcing into few-step autoregressive generators for low-latency rollout.
RiT: Vanilla Diffusion Transformers Suffice in Representation Space cs.CV · 2026-05-21 · conditional · none · ref 30 · internal anchor
A vanilla Diffusion Transformer trained via x-prediction on frozen DINOv2 features reaches FID 1.14 on ImageNet 256x256 with fewer parameters and faster sampling than prior DiT variants.
Pareto-Enhanced Portrait Generation: Vision-Aligned Text Supervision for Alignment, Realism, and Aesthetics cs.CV · 2026-05-20 · unverdicted · none · ref 25 · internal anchor
A feature supervision approach using SigLIP 2 extracts multi-granularity vision-aligned text representations to supervise MM-DiT image branches, pushing the Pareto frontier for portrait generation across alignment, realism, and aesthetics.
GeoFlow: Enforcing Implicit Geometric Consistency in Video Generation cs.CV · 2026-05-18 · unverdicted · none · ref 63 · internal anchor
GeoFlow adds a geometry-consistency reward based on rigid camera flow and object appearance preservation, integrated via reinforcement fine-tuning to improve geometric coherence in video generation.
Beyond Point-Wise Matching: Structural Representation Alignment for Accelerating Diffusion Transformers cs.CV · 2026-05-16 · unverdicted · none · ref 37 · internal anchor
sREPA enforces structural consistency in relational geometry of pre-trained vision features to accelerate DiT training and improve generation quality.
GeoQuery: Geometry-Query Diffusion for Sparse-View Reconstruction cs.CV · 2026-05-12 · unverdicted · none · ref 3 · internal anchor
GeoQuery replaces corrupted rendering features with geometry-aligned proxy queries and restricts cross-view attention to local windows, enabling robust diffusion-based refinement under extreme view sparsity.
From Synthetic to Real: Toward Identity-Consistent Makeup Transfer with Synthetic and Real Data cs.CV · 2026-05-08 · unverdicted · none · ref 23 · internal anchor
The work creates identity-consistent synthetic makeup data via ConsistentBeauty and adapts models to real images using reinforcement learning in RealBeauty, achieving better identity preservation and real-world performance than prior methods.
Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework cs.CV · 2026-05-08 · unverdicted · none · ref 39 · 2 links · internal anchor
MagicBokeh uses a single diffusion model with alternative training, focus-aware masked attention, and degradation-aware depth estimation to produce photorealistic bokeh on low-res zoomed images.
InkDiffuser: High-Fidelity One-shot Chinese Calligraphy via Differentiable Morphological Optimization cs.CV · 2026-05-07 · unverdicted · none · ref 10 · internal anchor
InkDiffuser generates high-fidelity one-shot Chinese calligraphy using high-frequency enhancement and a differentiable ink structure loss for realistic stroke and ink rendering.

Score-Based Generative Modeling through Stochastic Differential Equations

hub tools

citation-role summary

citation-polarity summary

claims ledger

authors

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer