TrioPose proposes a Triple-Stream Pose-Aware DiT with relational bias masks and spatial loss weighting to achieve SOTA pose-guided text-to-image results on multi-person benchmarks like Human-Art.
Kpe: Keypoint pose encoding for transformer-based image generation.arXiv preprint arXiv:2203.04907, 2022
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
TrioPose: Native Triple-Stream Diffusion Transformers for Pose-Guided Text-to-Image Generation
TrioPose proposes a Triple-Stream Pose-Aware DiT with relational bias masks and spatial loss weighting to achieve SOTA pose-guided text-to-image results on multi-person benchmarks like Human-Art.