Memo: Memory-guided diffusion for expressive talking video generation

Longtao Zheng, Yifan Zhang, Hanzhong Guo, Jiachun Pan, Zhenxiong Tan, Jiahao Lu, Chuanxin Tang, Bo An, Shuicheng Yan · 2024 · arXiv 2412.04448

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 1

citation-polarity summary

use dataset 1

representative citing papers

The Alpha Blending Hypothesis: Compositing Shortcut in Deepfake Detection

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

Deepfake detectors act as alpha blending searchers; training solely on self-blended real images yields top cross-dataset generalization on 15 datasets without using synthetic deepfakes.

AUHead: Realistic Emotional Talking Head Generation via Action Units Control

cs.CV · 2026-02-10 · unverdicted · novelty 5.0

AUHead uses audio-language models to generate Action Unit sequences from speech and feeds them into a controllable diffusion model to synthesize realistic emotional talking-head videos.

Matrix-game 2.0: An open-source real-time and streaming interactive world model

cs.CV · 2025-08-18 · unverdicted · novelty 5.0

Matrix-Game 2.0 introduces a scalable data pipeline, action-injection module, and few-step distillation to enable real-time streaming video generation at 25 FPS from game-engine interactions, with open-sourced weights and code.

citing papers explorer

The Alpha Blending Hypothesis: Compositing Shortcut in Deepfake Detection

cs.CV · 2026-05-11 · unverdicted · none · ref 56

AUHead: Realistic Emotional Talking Head Generation via Action Units Control

cs.CV · 2026-02-10 · unverdicted · none · ref 27

Matrix-game 2.0: An open-source real-time and streaming interactive world model

cs.CV · 2025-08-18 · unverdicted · none · ref 60
