In Proceedings of the 28th ACM international conference on multimedia

A lip sync expert is all you need for speech to lip generation in the wild

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

browse 3 citing papers

representative citing papers

Listening Deepfake Detection: A New Perspective Beyond Speaking-Centric Forgery Analysis

cs.CV · 2026-04-14 · conditional · novelty 7.0

Introduces the LDD task, ListenForge dataset built from five listening head generation methods, and MANet model that detects listening forgeries via motion inconsistencies guided by audio semantics.

EAD-Net: Emotion-Aware Talking Head Generation with Spatial Refinement and Temporal Coherence

cs.CV · 2026-04-25 · unverdicted · novelty 6.0

EAD-Net uses a diffusion model with new spatio-temporal attention, graph-based temporal reasoning, and LLM-derived semantic descriptions to generate emotionally expressive talking head videos with improved lip-sync and coherence over prior methods.

Do Protective Perturbations Really Protect Portrait Privacy under Real-world Image Transformations?

cs.CV · 2026-04-26 · conditional · novelty 5.0

Pixel-level protective perturbations for portrait privacy are ineffective against common image transformations, and a low-cost purification framework can strip them out.

citing papers explorer

Showing 3 of 3 citing papers.

Listening Deepfake Detection: A New Perspective Beyond Speaking-Centric Forgery Analysis cs.CV · 2026-04-14 · conditional · none · ref 30
Introduces the LDD task, ListenForge dataset built from five listening head generation methods, and MANet model that detects listening forgeries via motion inconsistencies guided by audio semantics.
EAD-Net: Emotion-Aware Talking Head Generation with Spatial Refinement and Temporal Coherence cs.CV · 2026-04-25 · unverdicted · none · ref 28
EAD-Net uses a diffusion model with new spatio-temporal attention, graph-based temporal reasoning, and LLM-derived semantic descriptions to generate emotionally expressive talking head videos with improved lip-sync and coherence over prior methods.
Do Protective Perturbations Really Protect Portrait Privacy under Real-world Image Transformations? cs.CV · 2026-04-26 · conditional · none · ref 27
Pixel-level protective perturbations for portrait privacy are ineffective against common image transformations, and a low-cost purification framework can strip them out.

In Proceedings of the 28th ACM international conference on multimedia

fields

years

verdicts

representative citing papers

citing papers explorer