Vasa-1: Lifelike audio-driven talking faces generated in real time. Advances in Neural Information Processing Systems, 37:660–684

Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, Baining Guo · 2024

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

C-MET transfers emotions from speech to facial video by learning cross-modal semantic vectors with pretrained audio and disentangled expression encoders, yielding 14% higher emotion accuracy on MEAD and CREMA-D even for unseen emotions.

FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs

cs.CV · 2025-12-23 · unverdicted · novelty 6.0

FlashLips delivers 100+ FPS mask-free lip-sync by reconstructing target frames in latent space from an audio-predicted lips-pose vector using a compact U-Net trained solely on reconstruction losses and self-supervised mask removal.

citing papers explorer

Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video · cs.CV · 2026-04-09 · unverdicted · none · ref 65
FlashLips: 100-FPS Mask-Free Latent Lip-Sync using Reconstruction Instead of Diffusion or GANs · cs.CV · 2025-12-23 · unverdicted · none · ref 57
