Raw Differentiable Architecture Search for Speech Deepfake and Spoofing Detection

Jose Patino; Massimiliano Todisco; Nicholas Evans; Wanying Ge

arxiv: 2107.12212 · v2 · pith:UEQXILX2new · submitted 2021-07-26 · 📡 eess.AS

classification 📡 eess.AS

keywords: architecture, network, detection, approaches, deepfake, differentiable, parameters, search

Abstract

End-to-end approaches to anti-spoofing, especially those which operate directly upon the raw signal, are starting to be competitive with their more traditional counterparts. Until recently, all such approaches considered only the learning of network parameters; the network architecture was still hand-crafted. The architecture, however, can also be learned. This paper describes our attempt to learn automatically the network architecture of a speech deepfake and spoofing detection solution while jointly optimising other network components and parameters, such as the first convolutional layer, which operates on raw signal inputs. The resulting raw differentiable architecture search system delivers a tandem detection cost function score of 0.0517 for the ASVspoof 2019 logical access database, a result which is among the best single-system results reported to date.
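
The abstract does not spell out how an architecture can be "learned". As a rough illustration only, the sketch below shows the continuous relaxation commonly used in differentiable architecture search: each connection computes a softmax-weighted mixture of candidate operations, and the mixture weights are trained alongside the ordinary network parameters. The candidate operations (`identity`, `zero`, `avg_pool3`) and the names `mixed_op` and `discretise` are hypothetical placeholders, not the search space or code used by the authors.

```python
import math


def softmax(weights):
    """Numerically stable softmax over a list of floats."""
    m = max(weights)
    exps = [math.exp(w - m) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations on a 1-D signal (assumption: the
# paper's real search space is not given on this page).
def identity(x):
    return list(x)


def zero(x):
    return [0.0] * len(x)


def avg_pool3(x):
    # Length-preserving 3-tap moving average with edge replication.
    padded = [x[0]] + list(x) + [x[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(x))]


CANDIDATE_OPS = {"identity": identity, "zero": zero, "avg_pool3": avg_pool3}


def mixed_op(x, alphas):
    """Softmax-weighted sum of all candidate operations applied to x.

    `alphas` holds one architecture weight per candidate; because the
    output is differentiable in `alphas`, they can be optimised by
    gradient descent together with the network parameters.
    """
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))]


def discretise(alphas):
    """After the search, keep only the operation with the largest weight."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]
```

With equal weights, `mixed_op([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])` averages the three candidate outputs, while `discretise([0.1, -1.0, 2.0])` selects `"avg_pool3"` as the operation to retain in the final architecture.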

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
cs.SD 2024-01 unverdicted novelty 6.0

MLAAD provides a large-scale multi-language synthetic audio dataset for training and evaluating audio anti-spoofing models, showing better training performance than InTheWild and FakeOrReal and alternating superiority...