pith. machine review for the scientific record.

arxiv: 1903.00089 · v3 · submitted 2019-02-28 · 💻 cs.CL

Recognition: unknown

Massively Multilingual Neural Machine Translation

classification 💻 cs.CL
keywords: multilingual, languages, massively, translation, models, training, english, experiments

Multilingual neural machine translation (NMT) enables training a single model that supports translation from multiple source languages into multiple target languages. In this paper, we push the limits of multilingual NMT in terms of the number of languages used. We perform extensive experiments in training massively multilingual NMT models, translating up to 102 languages to and from English within a single model. We explore different setups for training such models and analyze the trade-offs between translation quality and various modeling decisions. We report results on the publicly available TED talks multilingual corpus, where we show that massively multilingual many-to-many models are effective in low-resource settings, outperforming the previous state of the art while supporting up to 59 languages. Our experiments on a large-scale dataset with 102 languages to and from English and up to one million examples per direction also show promising results, surpassing strong bilingual baselines and encouraging future work on massively multilingual NMT.

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

    cs.CL · 2020-06 · unverdicted · novelty 6.0

    GShard supplies automatic sharding and conditional computation support that enabled training a 600-billion-parameter multilingual translation model on thousands of TPUs with superior quality.