ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks

Cheng-I Lai , Nanxin Chen , Jes\'us Villalba , Najim Dehak

Authors on Pith no claims yet

classification 💻 cs.CL cs.LGcs.SDeess.AS

keywords assertanti-spoofingasvspoofmodelsnetworksresidualsqueeze-excitationnetwork

read the original abstract

We present JHU's system submission to the ASVspoof 2019 Challenge: Anti-Spoofing with Squeeze-Excitation and Residual neTworks (ASSERT). Anti-spoofing has gathered more and more attention since the inauguration of the ASVspoof Challenges, and ASVspoof 2019 dedicates to address attacks from all three major types: text-to-speech, voice conversion, and replay. Built upon previous research work on Deep Neural Network (DNN), ASSERT is a pipeline for DNN-based approach to anti-spoofing. ASSERT has four components: feature engineering, DNN models, network optimization and system combination, where the DNN models are variants of squeeze-excitation and residual networks. We conducted an ablation study of the effectiveness of each component on the ASVspoof 2019 corpus, and experimental results showed that ASSERT obtained more than 93% and 17% relative improvements over the baseline systems in the two sub-challenges in ASVspooof 2019, ranking ASSERT one of the top performing systems. Code and pretrained models will be made publicly available.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ProSDD: Learning Prosodic Representations for Speech Deepfake Detection against Expressive and Emotional Attacks
eess.AS 2026-04 unverdicted novelty 6.0

ProSDD learns speaker-conditioned prosodic variation from real speech via supervised masked prediction and jointly optimizes it with spoof detection, cutting EER substantially on ASVspoof 2024 and emotional datasets.