3DXTalker unifies identity modeling, lip synchronization, emotional expression, and head-pose dynamics in audio-driven 3D avatars via 2D-to-3D curation, amplitude/emotion audio cues, and a flow-matching transformer with prompt control.
Capture, learning, and synthesis of 3D speaking styles. Computer Vision and Pattern Recognition (CVPR), pages 10101–10111
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
dataset 1
citation-polarity summary
fields
cs.CV 2
years
2026 2
verdicts
UNVERDICTED 2
roles
dataset 1
polarities
use dataset 1
representative citing papers
AudioFace improves speech-driven facial animation by guiding blendshape prediction with linguistic and articulatory information extracted via multimodal language models.
citing papers explorer
- 3DXTalker: Unifying Identity, Lip Sync, Emotion, and Spatial Dynamics in Expressive 3D Talking Avatars
  3DXTalker unifies identity modeling, lip synchronization, emotional expression, and head-pose dynamics in audio-driven 3D avatars via 2D-to-3D curation, amplitude/emotion audio cues, and a flow-matching transformer with prompt control.
- AudioFace: Language-Assisted Speech-Driven Facial Animation with Multimodal Language Models
  AudioFace improves speech-driven facial animation by guiding blendshape prediction with linguistic and articulatory information extracted via multimodal language models.