SALM: Speech-augmented language model with in-context learning for speech recognition and translation
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
A consequence-aware evaluation framework applied to LLMs in ATC finds peak Risk Score of only 0.69 despite high macro-F1, with errors concentrated in high-impact entities.
citing papers explorer
- Phonemes vs. Projectors: An Investigation of Speech-Language Interfaces for LLM-based ASR
  Phoneme-based interfaces match or surpass projector-based ones for LLM ASR, especially in low-resource languages, and a BPE-phoneme hybrid offers additional improvements.
- Safety-Oriented Evaluation of Language Understanding Systems for Air Traffic Control
  A consequence-aware evaluation framework applied to LLMs in ATC finds peak Risk Score of only 0.69 despite high macro-F1, with errors concentrated in high-impact entities.