Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations

Jiatao Gu; Kyunghyun Cho; Victor O.K. Li; Yong Wang

arxiv 1906.01181 v1 pith:PKI6VVWN submitted 2019-06-04 cs.CL

classification cs.CL

keywords translation, zero-shot, language, approach, correlations, machine, multilingual, neural

Abstract

Zero-shot translation, translating between language pairs on which a Neural Machine Translation (NMT) system has never been trained, is an emergent property when training the system in multilingual settings. However, naive training for zero-shot NMT easily fails, and is sensitive to hyper-parameter setting. The performance typically lags far behind the more conventional pivot-based approach which translates twice using a third language as a pivot. In this work, we address the degeneracy problem due to capturing spurious correlations by quantitatively analyzing the mutual information between language IDs of the source and decoded sentences. Inspired by this analysis, we propose to use two simple but effective approaches: (1) decoder pre-training; (2) back-translation. These methods show significant improvement (4~22 BLEU points) over the vanilla zero-shot translation on three challenging multilingual datasets, and achieve similar or better results than the pivot-based approach.
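The abstract's diagnostic, the mutual information between the language ID of the source sentence and the language ID of the decoded sentence, can be illustrated with a small sketch. This is not the authors' code: the function name `mutual_information` and the toy language pairs are assumptions made for illustration, and the estimator is the standard plug-in (empirical frequency) estimate, which may differ from whatever the paper uses.

```python
from collections import Counter
from math import log


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in nats from (x, y) samples.

    Here x is the language ID of a source sentence and y is the
    language ID of the sentence the system actually decoded.
    (Hypothetical helper, not taken from the paper.)
    """
    pairs = list(pairs)
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one (source, decoded) pair")

    joint = Counter(pairs)                  # counts of (x, y)
    src = Counter(x for x, _ in pairs)      # counts of x
    out = Counter(y for _, y in pairs)      # counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log(c * n / (src[x] * out[y]))
    return mi


# Toy data: the decoded language is independent of the source language.
independent = [("de", "en"), ("de", "es"), ("fr", "en"), ("fr", "es")]

# Toy data: the decoded language is fully determined by the source language.
dependent = [("de", "en"), ("fr", "es")] * 10

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

On the toy data, the independent case gives an estimate of zero and the fully dependent case gives log 2 nats. Under this reading, a large value signals that the decoded language can be predicted from the source language alone, which is the kind of spurious correlation the abstract describes.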

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Evaluating the Supervised and Zero-shot Performance of Multi-lingual Translation Models
cs.CL 2019-06 unverdicted novelty 4.0

Task-specific decoder parameters outperform fully shared decoder parameters in both supervised and zero-shot multilingual translation performance.