Contextual Parameter Generation for Universal Neural Machine Translation

Emmanouil Antonios Platanios; Graham Neubig; Mrinmaya Sachan; Tom Mitchell

arxiv: 1808.08493 · v1 · pith:HMDPG2JFnew · submitted 2018-08-26 · 💻 cs.CL · cs.LG · stat.ML

classification 💻 cs.CL · cs.LG · stat.ML

keywords language · languages · model · neural · parameter · system · translation · able

We propose a simple modification to existing neural machine translation (NMT) models that enables using a single universal model to translate between multiple languages while allowing for language-specific parameterization, and that can also be used for domain adaptation. Our approach requires no changes to the model architecture of a standard NMT system, but instead introduces a new component, the contextual parameter generator (CPG), that generates the parameters of the system (e.g., weights in a neural network). This parameter generator accepts source and target language embeddings as input, and generates the parameters for the encoder and the decoder, respectively. The rest of the model remains unchanged and is shared across all languages. We show how this simple modification enables the system to use monolingual data for training and also perform zero-shot translation. We further show it is able to surpass state-of-the-art performance for both the IWSLT-15 and IWSLT-17 datasets and that the learned language embeddings are able to uncover interesting relationships between languages.
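The parameter-generation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the generator is a plain linear map from a language embedding to a flat parameter vector, and every name here (`ContextualParameterGenerator`, `parameters_for_pair`, the embedding and parameter sizes) is hypothetical. The only property taken from the abstract is that encoder parameters are generated from the source-language embedding and decoder parameters from the target-language embedding.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ContextualParameterGenerator:
    """Hypothetical linear generator: params = W @ language_embedding."""

    def __init__(self, embed_dim, num_params, rng):
        # One row of W per generated parameter (assumed linear form).
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(embed_dim)]
            for _ in range(num_params)
        ]

    def generate(self, language_embedding):
        return matvec(self.weights, language_embedding)


rng = random.Random(0)
EMBED_DIM = 4    # illustrative size only
NUM_PARAMS = 6   # stands in for the flattened encoder/decoder weights

# One embedding per language (assumed here to be shared by both generators).
language_embeddings = {
    lang: [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
    for lang in ("en", "de", "vi")
}

encoder_generator = ContextualParameterGenerator(EMBED_DIM, NUM_PARAMS, rng)
decoder_generator = ContextualParameterGenerator(EMBED_DIM, NUM_PARAMS, rng)


def parameters_for_pair(source, target):
    """Encoder parameters depend on the source language only,
    decoder parameters on the target language only."""
    encoder_params = encoder_generator.generate(language_embeddings[source])
    decoder_params = decoder_generator.generate(language_embeddings[target])
    return encoder_params, decoder_params


enc_en_de, dec_en_de = parameters_for_pair("en", "de")
enc_en_vi, dec_en_vi = parameters_for_pair("en", "vi")
```

Because each generator needs only a single language embedding, parameters exist for any source/target combination, including pairs that never appear together in the training data. That is one way to read the zero-shot claim in the abstract, though the sketch says nothing about translation quality for such pairs.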

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
cs.CL 2019-07 unverdicted novelty 5.0

A single multilingual NMT model for 103 languages trained on 25B examples demonstrates transfer learning benefits for low-resource languages.

Improving Zero-shot Translation with Language-Independent Constraints
cs.CL 2019-06 unverdicted novelty 4.0

Language-independent constraints and regularization in multilingual Transformer NMT yield a 2.23 BLEU average gain on zero-shot pairs from the IWSLT 2017 dataset.