hub
Diffusion Models Beat GANs on Image Synthesis
17 Pith papers cite this work. Polarity classification is still indexing.
citation roles: background (2)
citation polarities: background (2)
citing papers explorer
- TacticGen: Grounding Adaptable and Scalable Generation of Football Tactics
  TacticGen generates realistic, adaptable football tactics via a multi-agent diffusion transformer trained on 3.3M events and 100M frames, supporting rule-, language-, or model-based guidance at inference time.
- Advantage-Guided Diffusion for Model-Based Reinforcement Learning
  Advantage-guided diffusion (SAG and EAG) steers sampling in diffusion world models to higher-advantage trajectories, enabling policy improvement and better sample efficiency on MuJoCo tasks.
- Screen, Cache, and Match: A Training-Free Causality-Consistent Reference Frame Framework for Human Animation
  FrameCache uses a Screen-Cache-Match strategy and Trajectory-Aware Autoregressive Generation to convert past frames into causal guidance for temporally coherent human animation videos.
- Noise Aggregation Analysis Driven by Small-Noise Injection: Efficient Membership Inference for Diffusion Models
  Introduces noise aggregation analysis with single-step small-noise injection to enable efficient and accurate membership inference attacks on diffusion models.
- MIMIC-D: Multi-modal Imitation for MultI-agent Coordination with Decentralized Diffusion Policies
  MIMIC-D enables multi-modal multi-agent coordination via joint training of decentralized diffusion policies using only local information.
- IMPLICITSTAINER: Resolution Agnostic Data-Efficient Virtual Staining Using Neural Implicit Functions
  Neural implicit functions enable resolution-agnostic, deterministic virtual staining from H&E to IHC images with SOTA results and better low-data performance than patch-based GAN or diffusion methods.
- GenMed: A Pairwise Generative Reformulation of Medical Diagnostic Tasks
  GenMed uses diffusion models to capture P(X,Y) for medical tasks and performs inference via gradient-based test-time optimization, supporting arbitrary observation combinations without retraining.
- StructDiff: A Structure-Preserving and Spatially Controllable Diffusion Model for Single-Image Generation
  StructDiff adds adaptive receptive fields and 3D positional encoding to a single-scale diffusion model to preserve structure and enable spatial control in single-image generation.
- Monocular Depth Estimation From the Perspective of Feature Restoration: A Diffusion Enhanced Depth Restoration Approach
  Monocular depth estimation is recast as indirect feature restoration via an invertible diffusion module plus auxiliary viewpoint enhancement, delivering 4-38% RMSE gains on KITTI over baselines.
- A Noise Constrained Diffusion (NC-Diffusion) Framework for High Fidelity Image Compression
  NC-Diffusion matches quantization noise to the diffusion forward process, adds an adaptive frequency filter and zero-shot enhancement, and reports superior fidelity on benchmarks.
- HardFlow: Hard-Constrained Sampling for Flow-Matching Models via Trajectory Optimization
  HardFlow turns hard constraint enforcement during flow-matching sampling into a tractable terminal-time trajectory optimization problem using optimal control.
- Cross-Distribution Diffusion Priors-Driven Iterative Reconstruction for Sparse-View CT
  CDPIR integrates cross-distribution diffusion priors from a Scalable Interpolant Transformer trained with classifier-free guidance into model-based iterative reconstruction to improve sparse-view CT under out-of-distribution conditions.
- Learning A Unified Risk Map for Autonomous Driving in Partially Observable Environments
  The paper proposes a unified risk map modeling and learning framework integrated with diffusion-based adversarial scenario generation for risk-aware planning in partially observable autonomous driving, demonstrating improved time-to-collision metrics on the Waymo Open Motion Dataset.
- Video Reconstruction using Diffusion-based Image-to-Video Generation with Trajectory Guidance
  Trajectory-guided diffusion synthesis reconstructs missing frames in top-down drone videos of maritime maneuvers, outperforming optical flow extrapolation and RIFE interpolation on perceptual quality, motion realism, and trajectory adherence metrics.
- DiffMagicFace: Identity Consistent Facial Editing of Real Videos
  DiffMagicFace uses concurrent fine-tuned text and image diffusion models plus a rendered multi-view dataset to achieve identity-consistent text-conditioned editing of real facial videos.
- Dual-Control Frequency-Aware Diffusion Model for Depth-Dependent Optical Microrobot Microscopy Image Generation
  Du-FreqNet generates controllable depth-dependent microrobot microscopy images via dual ControlNet branches and frequency-aware loss, improving SSIM by 20.7% over baselines and aiding pose estimation.
- Efficient Image-to-Image Schrödinger Bridge for CT Field of View Extension
  An I²SB diffusion model for CT FOV extension delivers RMSE of 49.8 HU on simulated data and 152.0 HU on real data with 0.19 s per-slice inference, over 700 times faster than cDDPM.