Breaking the data barrier: Towards robust speech translation via adversarial stability training
1 Pith paper cites this work.
Fields: cs.CL (1)
Years: 2025 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
MCAT: Scaling Many-to-Many Speech-to-Text Translation with MLLMs to 70 Languages
MCAT scales MLLMs to many-to-many speech translation across 70 languages via curriculum learning and a 30-token speech adapter, surpassing prior SOTA on FLEURS while improving speed.