AnyAct generates editable human reenactments from character videos via conditional motion generation from transferable sparse local 2D articulated cues, with designs for human-only supervision and global-local decoupling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 3years
2026 3verdicts
UNVERDICTED 3representative citing papers
R-DMesh proposes a VAE-based disentanglement of base mesh, motion trajectories, and rectification offset plus Triflow Attention and rectified-flow diffusion to produce 4D meshes aligned to video despite initial pose mismatch.
Introduces dual pose-image representation, cross-modal alignment, and iterative construction to improve prompt alignment and diversity in multi-person text-to-image generation.
citing papers explorer
-
AnyAct: Towards Human Reenactment of Character Motion From Video
AnyAct generates editable human reenactments from character videos via conditional motion generation from transferable sparse local 2D articulated cues, with designs for human-only supervision and global-local decoupling.
-
R-DMesh: Video-Guided 3D Animation via Rectified Dynamic Mesh Flow
R-DMesh proposes a VAE-based disentanglement of base mesh, motion trajectories, and rectification offset plus Triflow Attention and rectified-flow diffusion to produce 4D meshes aligned to video despite initial pose mismatch.
-
Composing People Together: Iterative Pose-Image Generation for Multi-Person Interaction Scenes
Introduces dual pose-image representation, cross-modal alignment, and iterative construction to improve prompt alignment and diversity in multi-person text-to-image generation.