Mechanistic tracing shows text suppresses but does not erase audio representations in late layers of Audio LLMs; back-patching reduces text dominance.
When Audio-LLMs Don’t Listen: A Cross-Linguistic Study of Modality Arbitration
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Representative citing papers
ALMs encode audio evidence but override it with text in conflicts; GACL interpolates joint and same-audio scores to repair reversals, gaining 17.8 nAUC points under a 5pp faithfulness budget.
CAAD internalizes contrastive audio-aware decoding into student SLM weights via synchronized teacher-forcing, delivering an 8% relative gain over standard knowledge distillation on Dynamic-SUPERB while reducing linguistic bias on MCR-BENCH.
citing papers explorer
- CAAD: Contrastive Audio-Aware Distillation for Efficient Speech Language Models