Empirical analysis of speaker and acoustic factors correlated with ASR word error rates across five Indic languages using zero-shot evaluation on multiple open-source models.
Factors affecting ASR performance: A study using state of the art ASR models in Indic Languages
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
ASR performance varies across languages, speakers, and recording conditions, yet systematic analysis for Indic languages remains limited. We present a large-scale study of decoded outputs from multiple open-source ASR models evaluated on diverse Indian speech datasets in zero-shot settings. We analyze linguistic, speaker-level, and acoustic factors across Hindi, Bengali, Kannada, Telugu, and Marathi. We examine correlations between WER and speaker traits such as average word length, speaking rate, and utterance duration across multiple model-dataset pairs. For Hindi, we further analyze audio factors including telephone codecs, bit depth, resampling, and background noise. Results reveal both cross-lingual patterns and language-specific sensitivities, showing how speaker behavior and signal processing choices affect ASR robustness in real-world Indic scenarios.
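The kind of analysis the abstract describes, relating WER to a speaker trait such as speaking rate, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names `word_error_rate`, `speaking_rate`, and `pearson` are hypothetical, whitespace tokenization is an assumption (the abstract does not say how Indic text is normalized or tokenized), and Pearson correlation is assumed only because the abstract does not name the correlation measure used.

```python
from math import sqrt


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace; no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the ref prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def speaking_rate(transcript: str, duration_s: float) -> float:
    """Words per second for one utterance (hypothetical trait definition)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return len(transcript.split()) / duration_s


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

In use, one would compute a WER and a speaking rate per speaker (or per utterance) for a given model-dataset pair and pass the two lists to `pearson`; repeating this per pair gives one coefficient per model, dataset, and language combination.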
fields
eess.AS (1)
years
2026 (1)
verdicts
UNVERDICTED (1)