Evolving from single-modal to multi-modal facial deepfake detection: Progress and challenges
4 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
Roles: background (2)
Polarities: background (2)
citing papers explorer
- SEED: A Large-Scale Benchmark for Provenance Tracing in Sequential Deepfake Facial Edits
  SEED is a new benchmark for sequential provenance tracing in diffusion-edited deepfake faces, with the FAITH baseline showing that wavelet-based high-frequency signals aid detection of accumulated editing artifacts.
- Are DeepFakes Realistic Enough? Exploring Semantic Mismatch as a Novel Challenge
  The paper introduces semantic mismatch between authentic audio and video as a new DeepFake detection challenge via the RARV-SMM class and demonstrates that a semantic reinforcement strategy with ImageBind embeddings improves detection on FakeAVCeleb and LAV-DF.
- Deepfake Detection Generalization with Diffusion Noise
  ANL uses diffusion noise prediction and attention to regularize deepfake detectors for better generalization to unseen synthesis methods without added inference cost.
- Deepfake Detection in Social Media: A Temporal Artifact Analysis Using 3D Convolutional Neural Networks
  3D CNN detector with temporal consistency regularizer reaches 92.8% accuracy on DeepfakeTIMIT and 76.4% cross-dataset on FaceForensics++ without fine-tuning.