We propose an optimal hyperparameter configuration for this approach; and (3) Standard KD (Std

Results. To evaluate the impact of our distillation framework, we compare our CAAD method against different baselines: (1) Greedy Decoding, representing the model’s vanilla baselin · arXiv 4236.1450

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

CAAD: Contrastive Audio-Aware Distillation for Efficient Speech Language Models

eess.AS · 2026-06-22 · unverdicted · novelty 5.0

CAAD internalizes contrastive audio-aware decoding into student SLM weights via synchronized teacher-forcing, delivering an 8% relative gain over standard knowledge distillation on Dynamic-SUPERB while reducing linguistic bias on MCR-BENCH.

citing papers explorer

Showing 1 of 1 citing paper.

CAAD: Contrastive Audio-Aware Distillation for Efficient Speech Language Models

eess.AS · 2026-06-22 · unverdicted · none · ref 6
