Decoupled weight decay regularization
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.SD (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
Pre-trained ECAPA-TDNN with margin losses reaches 85.95% macro and 90.96% micro accuracy on language identification plus 17.08% EER on verification, beating the official baseline by 45.7%, 15.2%, and 50.8% respectively.
citing papers explorer
- Refining Pseudo-Audio Prompts with Speech-Text Alignment for Text-Only Domain Adaptation in LLM-Based ASR
  A speech-text alignment method generates expressive pseudo-audio prompts for effective text-only domain adaptation in LLM-based ASR, outperforming prior text-only approaches on error rates and OOV coverage.
- Spoken Language Identification with Pre-trained Models and Margin Loss
  Pre-trained ECAPA-TDNN with margin losses reaches 85.95% macro and 90.96% micro accuracy on language identification plus 17.08% EER on verification, beating the official baseline by 45.7%, 15.2%, and 50.8% respectively.