Evaluating Machine Translation Performance on Chinese Idioms with a Blacklist Method

Bonnie Webber; Federico Fancellu; Rico Sennrich; Yutong Shao

arXiv: 1711.07646 · v3 · pith:374JLHP2new · submitted 2017-11-21 · cs.CL

Yutong Shao, Rico Sennrich, Bonnie Webber, Federico Fancellu

classification: cs.CL

keywords: translation, idioms, literal, blacklist, error, evaluation, method, chinese

Idiom translation is a challenging problem in machine translation because the meaning of idioms is non-compositional, and a literal (word-by-word) translation is likely to be wrong. In this paper, we focus on evaluating the quality of idiom translation of MT systems. We introduce a new evaluation method that uses an idiom-specific blacklist of literal translations, based on the insight that the occurrence of any blacklisted words in the translation output indicates a likely translation error. We introduce a dataset, CIBB (Chinese Idioms Blacklists Bank), and perform an evaluation of a state-of-the-art Chinese-English neural MT system. Our evaluation confirms that a sizable share of idioms in our test set are mistranslated (46.1%), that literal translation error is a common error type, and that our blacklist method is effective at identifying literal translation errors.
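The blacklist idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`tokenize`, `blacklist_hits`, `flagged_rate`), the example idiom's blacklist, and the exact whole-word matching rule are all assumptions made for illustration, and none of the entries are taken from CIBB. The sketch flags an MT output as a likely literal translation error when any blacklisted word for the source idiom appears in it.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed rule)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def blacklist_hits(translation, blacklist):
    """Return the blacklisted words that occur as whole tokens in the output."""
    tokens = set(tokenize(translation))
    return sorted(w for w in blacklist if w.lower() in tokens)


def flagged_rate(examples, blacklists):
    """Fraction of (idiom, output) pairs with at least one blacklist hit."""
    if not examples:
        return 0.0
    flagged = sum(
        1 for idiom, output in examples
        if blacklist_hits(output, blacklists.get(idiom, ()))
    )
    return flagged / len(examples)


# Hypothetical blacklist: literal renderings of the idiom's component words.
blacklists = {"画蛇添足": {"snake", "feet", "foot"}}

# Hypothetical MT outputs for sentences containing the idiom.
examples = [
    ("画蛇添足", "He drew a snake and added feet to it."),
    ("画蛇添足", "That would be superfluous."),
]

print(blacklist_hits(examples[0][1], blacklists["画蛇添足"]))
print(flagged_rate(examples, blacklists))
```

Because the sketch matches exact tokens only, an inflected form such as "snakes" would not be flagged unless it is listed explicitly. A hit also signals only a likely error, as the abstract states, so flagged outputs would still need manual inspection.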
