An E2E ASR model with mixed wordpieces and phonemes improves foreign proper noun recognition via phoneme-level contextual biasing, showing a 16% gain over grapheme-only baselines and an 8% gain over wordpiece-only baselines.
Shallow Fusion E2E Biasing: Shallow fusion has been used in E2E models for decoding [10] and contextual biasing [6].
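As a rough illustration of the shallow fusion idea mentioned here, the sketch below combines an ASR hypothesis score with a contextual biasing score by log-linear interpolation and picks the best hypothesis. It is a minimal sketch under assumptions, not the cited paper's implementation: the function names `shallow_fusion_score` and `pick_best`, the interpolation weight, the example hypotheses, and all scores are hypothetical, and real systems apply the biasing score incrementally during beam search rather than to complete hypotheses.

```python
def shallow_fusion_score(asr_log_prob, bias_score, weight=0.5):
    """Log-linear interpolation: ASR log-probability plus a weighted biasing score."""
    return asr_log_prob + weight * bias_score


def pick_best(hypotheses, bias_scores, weight=0.5):
    """Return the hypothesis with the highest fused score.

    hypotheses:  dict mapping text -> ASR log-probability (hypothetical values)
    bias_scores: dict mapping text -> contextual biasing score; hypotheses
                 that match no biasing phrase get 0.0 (an assumption)
    """
    return max(
        hypotheses,
        key=lambda text: shallow_fusion_score(
            hypotheses[text], bias_scores.get(text, 0.0), weight
        ),
    )


# Hypothetical example: the ASR model alone prefers the misrecognition,
# but the biasing list contains the foreign proper noun.
hyps = {"call cray tail": -1.0, "call creteil": -1.4}
bias = {"call creteil": 1.0}

best_with_bias = pick_best(hyps, bias, weight=0.5)
best_without_bias = pick_best(hyps, bias, weight=0.0)
```

With the biasing weight set to zero the ASR score alone decides, so the weight controls how strongly the context list can override the acoustic evidence.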
Cited by 1 Pith paper (field: cs.CL; year: 2019; verdict: UNVERDICTED):
Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models