Din-cts: Low-complexity depthwise-inception neural network with contrastive training strategy for deepfake speech detection
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.SD (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Balancing diverse bonafide resources and AI generators in training data is the key to building general deepfake speech detection models.
citing papers explorer
- Environmental Sound Deepfake Detection Using Deep-Learning Framework
  Finetuning a pre-trained WavLM model via a three-stage strategy achieves high detection accuracy for deepfake environmental sounds on two benchmark datasets, outperforming training from scratch.
- A General Model for Deepfake Speech Detection: Diverse Bonafide Resources or Diverse AI-Based Generators
  Balancing diverse bonafide resources and AI generators in training data is the key to building general deepfake speech detection models.