arxiv: 1811.00739 · v1 · pith:2GVYVACDnew · submitted 2018-11-02 · 💻 cs.CL · cs.LG

An Empirical Exploration of Curriculum Learning for Neural Machine Translation

Xuan Zhang, Gaurav Kumar, Huda Khayrallah, Kenton Murray, Jeremy Gwinnup, Marianna J Martindale, Paul McNamee, Kevin Duh, Marine Carpuat

keywords curriculum translation learning exploration machine neural results train

Machine translation systems based on deep neural networks are expensive to train. Curriculum learning aims to address this issue by choosing the order in which samples are presented during training to help train better models faster. We adopt a probabilistic view of curriculum learning, which lets us flexibly evaluate the impact of curricula design, and perform an extensive exploration on a German-English translation task. Results show that it is possible to improve convergence time at no loss in translation quality. However, results are highly sensitive to the choice of sample difficulty criteria, curriculum schedule and other hyperparameters.
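
The idea of treating a curriculum as a sampling distribution over training examples can be illustrated with a minimal sketch. This is not the authors' implementation: the difficulty scores in [0, 1], the `progress` variable, the exponential down-weighting, and the `sharpness` parameter are all assumptions chosen only to show how a schedule might shift probability mass from easy to hard samples as training advances.

```python
import math
import random


def curriculum_weights(difficulties, progress, sharpness=10.0):
    """Return unnormalized sampling weights, one per training sample.

    difficulties: floats in [0, 1], where 0 is easiest (hypothetical scoring).
    progress: fraction of training completed, in [0, 1] (hypothetical schedule).
    sharpness: how strongly too-hard samples are down-weighted (assumption).
    """
    weights = []
    for d in difficulties:
        # Samples at or below the current progress level keep full weight;
        # harder ones decay exponentially with how far they exceed it.
        gap = max(0.0, d - progress)
        weights.append(math.exp(-sharpness * gap))
    return weights


def sample_batch(samples, difficulties, progress, batch_size, rng):
    """Draw a batch (with replacement) according to the curriculum weights."""
    weights = curriculum_weights(difficulties, progress)
    return rng.choices(samples, weights=weights, k=batch_size)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = ["short sentence", "medium sentence", "long, rare-word sentence"]
    difficulties = [0.0, 0.5, 1.0]
    for progress in (0.0, 0.5, 1.0):
        batch = sample_batch(samples, difficulties, progress, 4, rng)
        print(progress, batch)
```

Early in training (`progress` near 0) the easiest samples dominate the batches; by the end (`progress` at 1) all samples are drawn uniformly. Swapping the difficulty scores or the weighting function corresponds to the choices of difficulty criterion and schedule that the abstract reports the results are sensitive to.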

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The Benefits of Temporal Correlations: SGD Learns k-Juntas from Random Walks Efficiently
cs.LG · 2026-05 · unverdicted · novelty 7.0

Temporal correlations from lazy random walks enable efficient SGD learning of k-juntas via temporal-difference loss on ReLU networks, achieving linear sample complexity in d.