
Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

We reassess a recent study (Hassan et al., 2018) which claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English. We use pairwise ranking and consider three variables that were not taken into account in that study: the language in which the source side of the test set was originally written, the translation proficiency of the evaluators, and the provision of inter-sentential context. If we consider only original source text (i.e. excluding translationese, text that was itself translated from another language), then we find evidence showing that human parity has not been achieved. We compare the judgments of professional translators against those of non-experts and find that the experts' judgments result in higher inter-annotator agreement and better discrimination between human and machine translations. In addition, we analyse the human translations of the test set and identify important translation issues. Finally, based on these findings, we provide a set of recommendations for future human evaluations of MT.

fields

cs.CL 1

years

2019 1

verdicts

UNVERDICTED 1

representative citing papers

Translationese in Machine Translation Evaluation

cs.CL · 2019-06-24 · unverdicted · novelty 6.0

Translationese in MT test sets biases evaluations, supporting exclusion of reverse-created data, re-evaluation of human-parity claims, and power analysis for reliable significance testing.

citing papers explorer

Showing 1 of 1 citing paper.

  • Translationese in Machine Translation Evaluation cs.CL · 2019-06-24 · unverdicted · none · ref 19 · internal anchor
