Investigating decoder-only large language models for speech-to-text translation

Chao-Wei Huang, Hui Lu, Hongyu Gong, Hirofumi Inaguma, Ilia Kulikov, Ruslan Mavlyutov, Sravya Popuri · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Ti-Audio: The First Multi-Dialectal End-to-End Speech LLM for Tibetan

cs.SD · 2026-04-13 · unverdicted · novelty 7.0

Ti-Audio is the first multi-dialectal end-to-end Speech-LLM for Tibetan that achieves state-of-the-art performance on ASR and speech translation benchmarks via a Dynamic Q-Former Adapter and cross-dialect cooperation.

citing papers explorer

Ti-Audio: The First Multi-Dialectal End-to-End Speech LLM for Tibetan

cs.SD · 2026-04-13 · unverdicted · none · ref 9
