Binghuai Lin, Liyuan Wang, Xiaoli Feng, and Jinsong Zhang

2022 · arXiv 2204.09634

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

StressTest: Can YOUR Speech LM Handle the Stress?

cs.CL · 2025-05-28 · conditional · novelty 6.0

Speech language models fail at reasoning about sentence stress but improve after fine-tuning on a new 17k-example synthetic dataset that varies stress to alter meaning.

Adaptive Perturbation Selection for Contrastive Audio Decoding

cs.SD · 2026-06-30 · unverdicted · novelty 5.0

Adaptive selection among a library of audio perturbations in contrastive decoding produces task-dependent accuracy gains, including +4.3% on an existence task via a hidden-state selector.

AUDITA: A New Dataset to Audit Humans vs. AI Skill at Audio QA

cs.CL · 2026-04-23 · unverdicted · novelty 5.0

AUDITA is a challenging audio QA benchmark where humans score 32% accuracy on average while state-of-the-art models score below 9%, using IRT to reveal systematic model deficiencies.

citing papers explorer

Adaptive Perturbation Selection for Contrastive Audio Decoding

cs.SD · 2026-06-30 · unverdicted · none · ref 17

AUDITA: A New Dataset to Audit Humans vs. AI Skill at Audio QA

cs.CL · 2026-04-23 · unverdicted · none · ref 6
