pith. machine review for the scientific record. sign in

arxiv: 1803.09179 · v1 · submitted 2018-03-24 · 💻 cs.CV

Recognition: unknown

FaceForensics: A Large-scale Video Dataset for Forgery Detection in Human Faces

Authors on Pith no claims yet
classification 💻 cs.CV
keywords videosdatasetfaceintroducevideobeencompresseddatasets
0
0 comments X
read the original abstract

With recent advances in computer vision and graphics, it is now possible to generate videos with extremely realistic synthetic faces, even in real time. Countless applications are possible, some of which raise a legitimate alarm, calling for reliable detectors of fake videos. In fact, distinguishing between original and manipulated video can be a challenge for humans and computers alike, especially when the videos are compressed or have low resolution, as it often happens on social networks. Research on the detection of face manipulations has been seriously hampered by the lack of adequate datasets. To this end, we introduce a novel face manipulation dataset of about half a million edited images (from over 1000 videos). The manipulations have been generated with a state-of-the-art face editing approach. It exceeds all existing video manipulation datasets by at least an order of magnitude. Using our new dataset, we introduce benchmarks for classical image forensic tasks, including classification and segmentation, considering videos compressed at various quality levels. In addition, we introduce a benchmark evaluation for creating indistinguishable forgeries with known ground truth; for instance with generative refinement models.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. FrameDiT: Diffusion Transformer with Matrix Attention for Efficient Video Generation

    cs.CV 2026-03 unverdicted novelty 7.0

    FrameDiT proposes Matrix Attention for DiTs to achieve SOTA video generation with improved temporal coherence and efficiency comparable to local factorized attention.

  2. Latte: Latent Diffusion Transformer for Video Generation

    cs.CV 2024-01 unverdicted novelty 6.0

    Latte achieves state-of-the-art video generation on FaceForensics, SkyTimelapse, UCF101, and Taichi-HD by using a latent diffusion transformer with four efficient spatial-temporal decomposition variants and best-pract...