Detecting multimedia generated by large AI models: A survey

Lin et al. · 2024 · arXiv 2402.00045

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

The Synthetic Media Shift: Tracking the Rise, Virality, and Detectability of AI-Generated Multimodal Misinformation

cs.CR · 2026-04-15 · unverdicted · novelty 7.0

AI-generated content in a new 150K-post dataset spreads virally via passive engagement, reaches consensus faster once flagged, and evades detectors more effectively as models improve.

XNote: Benchmarking Automated Community Notes Generation for Image-based Contextual Deception

cs.CL · 2026-03-23 · unverdicted · novelty 7.0

The XNote dataset and LVLM benchmarks demonstrate that current models face significant challenges in generating accurate, grounded Community Notes for image-based contextual deception.

VideoASMR-Bench: Can AI-Generated ASMR Videos Fool VLMs and Humans?

cs.CV · 2025-12-15 · unverdicted · novelty 7.0

VideoASMR-Bench shows that state-of-the-art VLMs fail to reliably distinguish AI-generated ASMR videos from real ones, though humans can still identify the fakes relatively easily.

Deepfake Detection Generalization with Diffusion Noise

cs.CV · 2026-04-16 · unverdicted · novelty 6.0

ANL uses diffusion noise prediction and attention to regularize deepfake detectors for better generalization to unseen synthesis methods without added inference cost.

Beyond Semantics: Uncovering the Physics of Fakes via Universal Physical Descriptors for Cross-Modal Synthetic Detection

cs.CV · 2026-04-06 · unverdicted · novelty 6.0

Five universal physical descriptors including Laplacian variance, Sobel statistics, and residual noise variance, when integrated as text encodings with CLIP, achieve up to 99.8% accuracy detecting synthetic images across GAN and diffusion model datasets.

Towards multi-modal forgery representation learning for AI-generated video detection and localization

cs.CV · 2026-05-08 · unverdicted · novelty 5.0

A multi-modal model with LMM semantic, ST visual, and PS audio branches enables simultaneous detection and fine-grained temporal localization of partial AI video forgeries, outperforming prior methods.

Fully AI-Generated Image Detection: Definition, Recent Advances and Challenges

cs.CV · 2025-02-27 · unverdicted · novelty 2.0

A systematic review of fully AI-generated image detection that organizes prior work around dataset construction and artifact extraction methods based on inductive priors.

citing papers explorer

Showing 7 of 7 citing papers.

The Synthetic Media Shift: Tracking the Rise, Virality, and Detectability of AI-Generated Multimodal Misinformation · cs.CR · 2026-04-15 · unverdicted · none · ref 26
XNote: Benchmarking Automated Community Notes Generation for Image-based Contextual Deception · cs.CL · 2026-03-23 · unverdicted · none · ref 24
VideoASMR-Bench: Can AI-Generated ASMR Videos Fool VLMs and Humans? · cs.CV · 2025-12-15 · unverdicted · none · ref 26
Deepfake Detection Generalization with Diffusion Noise · cs.CV · 2026-04-16 · unverdicted · none · ref 27
Beyond Semantics: Uncovering the Physics of Fakes via Universal Physical Descriptors for Cross-Modal Synthetic Detection · cs.CV · 2026-04-06 · unverdicted · none · ref 14
Towards multi-modal forgery representation learning for AI-generated video detection and localization · cs.CV · 2026-05-08 · unverdicted · none · ref 9
Fully AI-Generated Image Detection: Definition, Recent Advances and Challenges · cs.CV · 2025-02-27 · unverdicted · none · ref 22
