The University of Edinburgh's Submissions to the WMT19 News Translation Task
Pith reviewed 2026-05-24 22:17 UTC · model grok-4.3
The pith
Vast amounts of back-translated data continue to raise German-to-English news translation quality.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For German-to-English, we studied the impact of vast amounts of back-translated training data on translation quality, gaining a few additional insights over Edunov et al. (2018).
What carries the argument
Back-translation of large monolingual target-language corpora to create synthetic parallel training data.
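As a minimal sketch of the idea (not the authors' pipeline), back-translation pairs each genuine target-language sentence with a machine-generated source sentence. The function names, the toy lexicon, and the example sentences below are all hypothetical; `toy_en_to_de` merely stands in for a trained reverse (English-to-German) model.

```python
from typing import Callable, Iterable, List, Tuple


def back_translate(
    target_monolingual: Iterable[str],
    reverse_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (synthetic source, genuine target) pairs."""
    pairs = []
    for target_sentence in target_monolingual:
        # The source side is machine output; the target side stays human-written.
        synthetic_source = reverse_translate(target_sentence)
        pairs.append((synthetic_source, target_sentence))
    return pairs


def toy_en_to_de(sentence: str) -> str:
    # Hypothetical stand-in for a trained English-to-German model:
    # a word-by-word lookup that leaves unknown words unchanged.
    lexicon = {"the": "der", "dog": "Hund", "barks": "bellt"}
    return " ".join(lexicon.get(word, word) for word in sentence.split())


# Genuine parallel data (German source, English target), invented for illustration.
genuine_pairs = [("der Hund bellt", "the dog barks")]

# Monolingual English text, the target language for German-to-English.
english_monolingual = ["the dog barks", "the dog sleeps"]

synthetic_pairs = back_translate(english_monolingual, toy_en_to_de)

# The forward German-to-English model would then train on both sets together.
training_data = genuine_pairs + synthetic_pairs
```

The second synthetic pair has an imperfect source side ("der Hund sleeps"), which illustrates why the quality of the reverse model matters as the synthetic share grows.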
If this is right
- For high-resource pairs, simply scaling back-translated data remains an effective route to better systems.
- Character-based tokenisation can be directly compared against sub-word segmentation for Chinese source and target text.
- Cross-lingual language-model pre-training plus pivoting through Hindi offers an alternative path for English-Gujarati.
- Different pre-processing and tokenisation choices can be tested for English-to-Czech without changing the core back-translation approach.
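To make the Chinese tokenisation contrast concrete, the sketch below segments one sentence in two ways. It is an illustration only: the sub-word side uses a greedy longest-match lookup against a small hypothetical vocabulary, which is a simplified stand-in and not necessarily the segmenter used in the paper.

```python
from typing import List, Set


def char_tokens(text: str) -> List[str]:
    """Character-based tokenisation: every non-space character is a token."""
    return [ch for ch in text if not ch.isspace()]


def subword_tokens(text: str, vocab: Set[str], max_len: int = 4) -> List[str]:
    """Greedy longest-match segmentation against a fixed vocabulary.

    Falls back to a single character when no longer piece is in the vocabulary.
    """
    tokens = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:
                tokens.append(piece)
                i += length
                break
    return tokens


# Hypothetical vocabulary and sentence ("machine translation is very useful").
vocab = {"机器", "翻译", "机器翻译", "有用"}
sentence = "机器翻译很有用"

chars = char_tokens(sentence)
pieces = subword_tokens(sentence, vocab)
```

Here `chars` holds seven single-character tokens, while `pieces` holds three longer units. The trade-off is between longer sequences over a small symbol inventory and shorter sequences over a larger one.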
Where Pith is reading between the lines
- If the scaling result holds, future high-resource systems may be limited mainly by the supply of clean monolingual target text rather than by model capacity.
- The same back-translation pipeline could be applied to other high-resource pairs where large monolingual corpora already exist.
- Low-resource directions may still require the additional techniques tested here once back-translation data becomes scarce.
Load-bearing premise
The quality of the back-translated synthetic data stays high enough at very large volumes that extra data keeps helping rather than adding noise.
What would settle it
A controlled scaling curve for German-to-English in which BLEU or human scores stop rising or begin to fall once back-translated data exceeds a few hundred million sentences.
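One way such a test could be read off a scaling curve is sketched below. The helper name, the gain threshold, and every number in the example curve are invented placeholders, not results from the paper.

```python
from typing import List, Optional, Tuple


def first_plateau(
    curve: List[Tuple[float, float]], min_gain: float = 0.1
) -> Optional[float]:
    """Return the first data size at which the score gain over the previous
    point falls below min_gain, or None if the curve keeps rising."""
    ordered = sorted(curve)
    for (_, prev_score), (size, score) in zip(ordered, ordered[1:]):
        if score - prev_score < min_gain:
            return size
    return None


# Placeholder points: (millions of back-translated sentences, BLEU).
# These values are made up purely to exercise the function.
placeholder_curve = [(0, 30.0), (10, 31.5), (50, 32.4), (100, 32.45), (200, 32.1)]
still_rising = [(0, 30.0), (10, 31.0), (20, 32.0)]

plateau_at = first_plateau(placeholder_curve)
no_plateau = first_plateau(still_rising)
```

With these placeholder points the gain first drops below 0.1 at the 100-million mark, while the second curve never flattens, so the function returns `None`. A real analysis would also need to account for run-to-run variance before calling a plateau.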
Original abstract
The University of Edinburgh participated in the WMT19 Shared Task on News Translation in six language directions: English-to-Gujarati, Gujarati-to-English, English-to-Chinese, Chinese-to-English, German-to-English, and English-to-Czech. For all translation directions, we created or used back-translations of monolingual data in the target language as additional synthetic training data. For English-Gujarati, we also explored semi-supervised MT with cross-lingual language model pre-training, and translation pivoting through Hindi. For translation to and from Chinese, we investigated character-based tokenisation vs. sub-word segmentation of Chinese text. For German-to-English, we studied the impact of vast amounts of back-translated training data on translation quality, gaining a few additional insights over Edunov et al. (2018). For English-to-Czech, we compared different pre-processing and tokenisation regimes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript describes the University of Edinburgh's submissions to the WMT19 News Translation shared task across six language pairs (En-Gu, Gu-En, En-Zh, Zh-En, De-En, En-Cs). It reports the creation and use of back-translated monolingual data as synthetic training data for all directions, plus pair-specific experiments: semi-supervised MT with cross-lingual LM pre-training and Hindi pivoting for English-Gujarati; character-based vs. sub-word tokenisation for Chinese; a scaling study on large volumes of back-translated data for German-English (extending Edunov et al. 2018); and comparisons of pre-processing and tokenisation regimes for English-Czech.
Significance. As a shared-task system description, the paper documents concrete engineering choices and reports a scaling study on back-translation volume for German-English. Provided that the reported experimental outcomes and the additional insights relative to prior work are reproducible from the full system details, it supplies useful reference material for the community on practical NMT data-augmentation and tokenisation decisions.
Simulated Author's Rebuttal
We thank the referee for their positive review of our WMT19 system description paper and for recommending acceptance. We are pleased that the manuscript is viewed as supplying useful reference material on practical NMT choices for data augmentation and tokenisation.
Circularity Check
No significant circularity
The paper is a factual system-description report of WMT19 submissions. It contains no equations, no fitted parameters renamed as predictions, no derivations, and no load-bearing self-citations or uniqueness theorems. All claims reduce to the existence of the described experiments and comparisons with prior external work (e.g., Edunov et al. 2018), which are independent of the present manuscript.