iTryOn is a diffusion-based framework that adds spatial 3D hand guidance and semantic action-aware embeddings to handle complex garment deformations during human-clothing interactions in videos.
Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=
6 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
A diffusion model trained on DOOM play sessions generates stable real-time interactive game frames at 20 FPS with quality near lossy JPEG.
RoboDreamer factorizes video generation using language primitives to achieve compositional generalization in robot world models, outperforming monolithic baselines on unseen goals in RT-X.
LIVEditor-14B applies a new sparse attention method (ISA) that prunes context and uses query-sharpness routing to cut attention latency ~60% with no loss in editing quality on standard benchmarks.
CameraCtrl enables accurate camera pose control in video diffusion models through a trained plug-and-play module and dataset choices emphasizing diverse camera trajectories with matching appearance.
The work introduces WaLeF/FIDLAr for flood forecasting, CoDiCast for probabilistic weather, and Hypercube-RAG for explainable environmental QA, claiming superior accuracy, efficiency, and interpretability over baselines.
citing papers explorer
No citing papers match the current filters.