Speech language models fail at reasoning about sentence stress but improve after fine-tuning on a new 17k-example synthetic dataset that varies stress to alter meaning.
Binghuai Lin, Liyuan Wang, Xiaoli Feng, and Jinsong Zhang
3 Pith papers cite this work.
representative citing papers
- Adaptive Perturbation Selection for Contrastive Audio Decoding
Adaptive selection among a library of audio perturbations in contrastive decoding produces task-dependent accuracy gains, including +4.3% on an existence task via a hidden-state selector.
- AUDITA: A New Dataset to Audit Humans vs. AI Skill at Audio QA
AUDITA is a challenging audio QA benchmark on which humans average 32% accuracy while state-of-the-art models score below 9%; an item response theory (IRT) analysis reveals systematic model deficiencies.