PolySpeech-100 is a new benchmark for native-level speech comprehension across 110 linguistic variants. It evaluates 22 models and reports E2E advantages on dialects, robustness gaps on low-resource languages, and degradation from Chain-of-Thought prompting.
Small-footprint keyword spotting using deep neural networks
8 Pith papers cite this work. Polarity classification is still indexing.
citation roles: background 1
citation polarities: background 1
representative citing papers
A distribution-alignment framework for unpaired cross-modal knowledge distillation with theoretical guarantees on feature and label alignment.
FedMChain improves multimodal federated learning by chaining modality-wise optimization phases with error-compensated regularization and sparse sign-guided aggregation to mitigate modality competition and cut communication overhead.
APEX generates four types of prototype-based explanations for pre-trained audio classifiers that preserve output invariance and target acoustic properties better than gradient methods applied to spectrograms.
A framework using covariance-based spectral signatures and TreeSHAP attributions on AASIST3 branches identifies four operational archetypes and a flawed specialization mode that explains high error rates on specific spoofing attacks.
GS-NFS accelerates dynamic 3DGS encoding and decoding by 1-2 orders of magnitude on GPU while maintaining competitive compression ratios and rendering quality.
BMRUs enable analog recurrent neural network hardware via discrete outputs that suppress noise 20-fold, with one-to-one parameter-to-circuit mapping and linear power scaling for recurrence.