Ti-Audio is the first multi-dialectal end-to-end Speech-LLM for Tibetan that achieves state-of-the-art performance on ASR and speech translation benchmarks via a Dynamic Q-Former Adapter and cross-dialect cooperation.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
PTNet is a prototype-guided task-adaptive model that jointly performs change detection and captioning on bi-temporal UAV imagery by modeling structured change semantics, outperforming prior methods on the new UCCD urban construction benchmark and WHU-CDC.
citing papers explorer
- Ti-Audio: The First Multi-Dialectal End-to-End Speech LLM for Tibetan
Ti-Audio is the first multi-dialectal end-to-end Speech-LLM for Tibetan that achieves state-of-the-art performance on ASR and speech translation benchmarks via a Dynamic Q-Former Adapter and cross-dialect cooperation.
- UAV as Urban Construction Change Monitor: A New Benchmark and Change Captioning Model
PTNet is a prototype-guided task-adaptive model that jointly performs change detection and captioning on bi-temporal UAV imagery by modeling structured change semantics, outperforming prior methods on the new UCCD urban construction benchmark and WHU-CDC.