Visual instruction tuning. Advances in Neural Information Processing Systems, 36:34892–34916
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Representative citing papers
- The devil is in the details: Enhancing Video Virtual Try-On via Keyframe-Driven Details Injection
  KeyTailor improves video virtual try-on realism by using instruction-guided keyframes to enhance garment details and background integrity in DiT models without major architectural changes.
- Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models