Phoneme-based interfaces match or surpass projector-based ones for LLM ASR, especially in low-resource languages, and a BPE-phoneme hybrid offers additional improvements.
Con- nectionist temporal classification: labelling unsegmented se- quence data with recurrent neural networks,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
eess.AS 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Phonemes vs. Projectors: An Investigation of Speech-Language Interfaces for LLM-based ASR
Phoneme-based interfaces match or surpass projector-based ones for LLM ASR, especially in low-resource languages, and a BPE-phoneme hybrid offers additional improvements.