Proposes Bayesian factorized adaptation for multilingual ASR to handle code-switching, reporting 32.87% fewer errors on code-switched words and a 5.31% improvement in overall WER while preserving monolingual accuracy, using only a small amount of synthetic data.
Adding Robust Code-Switching Capabilities to High Performance Multilingual ASR
abstract
Code-switching (CSW) remains challenging for large multilingual ASR systems in real-world deployment. While fine-tuning on synthetic CSW data is possible, it generally degrades strong monolingual baselines. Our goal is to preserve these capabilities while extending models to handle complex code-switching, including morphological variations across languages. We propose Bayesian factorized adaptation, which learns to efficiently integrate switching-relevant knowledge into strong pretrained models without overwriting existing capabilities. Requiring only a small amount of synthetic data, our approach reduces transcription errors by 32.87% on code-switched words while improving overall WER by 5.31%, all while maintaining monolingual performance. Our results demonstrate that effective CSW adaptation depends more on knowledge integration than data complexity.
fields: cs.CL (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)