Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
4 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (1)
Citation polarities: background (1)
Representative citing papers
A framework using covariance-based spectral signatures and TreeSHAP attributions on AASIST3 branches identifies four operational archetypes and a flawed specialization mode that explains high error rates on specific spoofing attacks.
Balalaika is a data-centric annotation pipeline for Russian speech that combines semantic VAD, ASR ensembling, and prosody enrichment to build a 5.1k-hour corpus showing gains in denoising and TTS.
citing papers explorer
- APEX: Audio Prototype EXplanations for Classification Tasks
APEX generates four types of prototype-based explanations for pre-trained audio classifiers that preserve output invariance and target acoustic properties better than gradient methods applied to spectrograms.
- Interpreting Multi-Branch Anti-Spoofing Architectures: Correlating Internal Strategy with Empirical Performance
A framework using covariance-based spectral signatures and TreeSHAP attributions on AASIST3 branches identifies four operational archetypes and a flawed specialization mode that explains high error rates on specific spoofing attacks.
- Balalaika: Data-Centric, Prosody-Aware Annotation Pipeline for Russian Speech
Balalaika is a data-centric annotation pipeline for Russian speech that combines semantic VAD, ASR ensembling, and prosody enrichment to build a 5.1k-hour corpus showing gains in denoising and TTS.
- Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations