Memo: Memory-guided diffusion for expressive talking video generation
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CV 3
verdicts
UNVERDICTED 3
roles
dataset 1
polarities
use dataset 1
representative citing papers
AUHead uses audio-language models to generate Action Unit sequences from speech and feeds them into a controllable diffusion model to synthesize realistic emotional talking-head videos.
Matrix-Game 2.0 introduces a scalable data pipeline, action-injection module, and few-step distillation to enable real-time streaming video generation at 25 FPS from game-engine interactions, with open-sourced weights and code.
citing papers explorer
- The Alpha Blending Hypothesis: Compositing Shortcut in Deepfake Detection
  Deepfake detectors act as alpha blending searchers; training solely on self-blended real images yields top cross-dataset generalization on 15 datasets without using synthetic deepfakes.
- AUHead: Realistic Emotional Talking Head Generation via Action Units Control
  AUHead uses audio-language models to generate Action Unit sequences from speech and feeds them into a controllable diffusion model to synthesize realistic emotional talking-head videos.
- Matrix-game 2.0: An open-source real-time and streaming interactive world model
  Matrix-Game 2.0 introduces a scalable data pipeline, action-injection module, and few-step distillation to enable real-time streaming video generation at 25 FPS from game-engine interactions, with open-sourced weights and code.