Large-scale listening study of 35,532 judgments finds human accuracy on real audio fell from 72.7% to 64.1% since 2021 while fake detection remained stable, indicating a skepticism shift toward genuine speech.
Human perception of audio deepfakes
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: extension (1)
polarities: extend (1)
representative citing papers
APEX generates four types of prototype-based explanations for pre-trained audio classifiers that preserve output invariance and target acoustic properties better than gradient methods applied to spectrograms.
Proposes MeloDISinger, a flow-matching SVE model with MeloDRP for melody-aware duration-preserving editing and audio infilling, claiming SOTA results.
citing papers explorer
- Eroding Trust in Real Speech: A Large-Scale Study of Human Audio Deepfake Perception
  Large-scale listening study of 35,532 judgments finds human accuracy on real audio fell from 72.7% to 64.1% since 2021 while fake detection remained stable, indicating a skepticism shift toward genuine speech.
- APEX: Audio Prototype EXplanations for Classification Tasks
  APEX generates four types of prototype-based explanations for pre-trained audio classifiers that preserve output invariance and target acoustic properties better than gradient methods applied to spectrograms.
- MeloDISinger: Melody-Aware & Duration-Preserving Singing Voice Editing with Audio Infilling
  Proposes MeloDISinger, a flow-matching SVE model with MeloDRP for melody-aware duration-preserving editing and audio infilling, claiming SOTA results.