ProVoice-Bench is the first framework to evaluate proactive voice agents, revealing that state-of-the-art multimodal LLMs struggle with over-triggering and context-aware reasoning.
ESC: Dataset for Environmental Sound Classification
5 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: UNVERDICTED 5
Representative citing papers
A model-free diffusion test for discrete time series that uses the scaling of excursion counts with quadratic variation to classify signals as stochastic or deterministic.
ZEBRA reduces the base-to-novel generalization gap in audio-language models by fusing zero-shot and prompt-learning logits with entropy regularization.
DeePen demonstrates that both production and academic audio deepfake detectors can be reliably deceived by simple signal-processing attacks such as time-stretching or echo addition. Retraining defends against some of these attacks, while others remain effective.
MLAAD provides a large-scale, multi-language synthetic audio dataset for training and evaluating audio anti-spoofing models. As training data it outperforms InTheWild and FakeOrReal, and it alternates with ASVspoof 2019 as the better performer across eight test sets.
citing papers explorer
DeePen: Penetration Testing for Audio Deepfake Detection