Transfer Learning for Low-Resource Neural Machine Translation
pith:IMB6CTUG Add to your LaTeX paper
What is a Pith Number?\usepackage{pith}
\pithnumber{IMB6CTUG}
Prints a linked pith:IMB6CTUG badge after your title and writes the identifier into PDF metadata. Compiles on arXiv with no extra files. Learn more
read the original abstract
The encoder-decoder framework for neural machine translation (NMT) has been shown effective in large data scenarios, but is much less effective for low-resource languages. We present a transfer learning method that significantly improves Bleu scores across a range of low-resource languages. Our key idea is to first train a high-resource language pair (the parent model), then transfer some of the learned parameters to the low-resource pair (the child model) to initialize and constrain training. Using our transfer learning method we improve baseline NMT models by an average of 5.6 Bleu on four low-resource language pairs. Ensembling and unknown word replacement add another 2 Bleu which brings the NMT performance on low-resource machine translation close to a strong syntax based machine translation (SBMT) system, exceeding its performance on one language pair. Additionally, using the transfer learning model for re-scoring, we can improve the SBMT system by an average of 1.3 Bleu, improving the state-of-the-art on low-resource machine translation.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
The Role of Vocabularies in Learning Sparse Representations for Ranking
Larger 100K vocabularies in SPLADE models, especially those initialized with ESPLADE pretraining, improve retrieval effectiveness after pruning compared to 32K baselines while keeping similar efficiency.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.