OmniTryOn performs multi-object video virtual try-on in one pass using first-frame wearable caching and spatiotemporal RoPE, outperforming single-garment baselines on a new TryAny-Bench dataset.
arXiv preprint arXiv:2506.04213 (2025)
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (4)
years: 2026 (4)
verdicts: UNVERDICTED (4)
representative citing papers
A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.
ICDepth adapts text-to-video diffusion transformers for video depth estimation via in-context conditioning, achieving SOTA results on benchmarks with 6-13x less training data than prior generative methods.
GeoEdit introduces a Lift-Manipulate-Render-Denoise pipeline with dual-branch denoising and variance-homogeneous injection for 3D-consistent object editing in single photos.
citing papers explorer
- OmniTryOn: Video Try-On Anything at Once!
- Efficient Video Diffusion Models: Advancements and Challenges
- ICDepth: Taming Video Diffusion Models for Video Depth Estimation via In-Context Conditioning
- GeoEdit: Geometry-Aware Object Editing via Dual-Branch Denoising