Character-based Neural Machine Translation

Alan W Black; Chris Dyer; Isabel Trancoso; Wang Ling

arxiv: 1511.04586 · v1 · pith:472M3X66new · submitted 2015-11-14 · 💻 cs.CL

Wang Ling, Isabel Trancoso, Chris Dyer, Alan W Black

classification 💻 cs.CL

keywords: model, translation, character, word, words, input, machine, neural


We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words. Since word-level information provides a crucial source of bias, our input model composes representations of character sequences into representations of words (as determined by whitespace boundaries), and then these are translated using a joint attention/translation model. In the target language, the translation is modeled as a sequence of word vectors, but each word is generated one character at a time, conditional on the characters already generated in that word. As the representation and generation of words are performed at the character level, our model is capable of interpreting and generating unseen word forms. A secondary benefit of this approach is that it alleviates many of the challenges associated with preprocessing/tokenization of the source and target languages. We show that our model can achieve translation results that are on par with conventional word-based models.
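
The input side described in the abstract (split on whitespace, then build each word's representation from its characters) can be illustrated with a minimal sketch. Everything named here is hypothetical and is not the paper's implementation: `char_vector` stands in for a learned character embedding table, and `compose_word` uses a position-weighted average as a placeholder for whatever learned composition function the model actually uses. The point is only that a word never seen before still receives a representation, because it is assembled from characters.

```python
import hashlib

DIM = 8  # hypothetical embedding size, chosen only for illustration


def char_vector(ch, dim=DIM):
    """Deterministic pseudo-embedding for one character.

    Assumption: stands in for a learned character lookup table.
    """
    digest = hashlib.sha256(ch.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:dim]]


def compose_word(word, dim=DIM):
    """Compose character vectors into a single word vector.

    Assumption: a position-weighted average is a toy substitute for a
    learned composition function; it is order-sensitive but untrained.
    """
    total = [0.0] * dim
    for pos, ch in enumerate(word, start=1):
        vec = char_vector(ch, dim)
        for i in range(dim):
            total[i] += pos * vec[i]
    norm = sum(range(1, len(word) + 1))
    return [x / norm for x in total]


def encode_sentence(sentence):
    """Split on whitespace and build one vector per word from its characters."""
    return [(word, compose_word(word)) for word in sentence.split()]


if __name__ == "__main__":
    # "translatability" needs no vocabulary entry: it is built from characters.
    for word, vec in encode_sentence("unseen translatability forms"):
        print(word, [round(x, 3) for x in vec[:3]])
```

Because there is no word vocabulary, no out-of-vocabulary token is needed on the input side; the output side of the model, which generates each word character by character, is not covered by this sketch.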


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Language Models as Knowledge Bases?
cs.CL 2019-09 accept novelty 7.0

BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.

PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents
cs.CL 2020-11 unverdicted novelty 6.0

PheMT is a phenomenon-wise dataset created to evaluate NMT robustness against linguistic phenomena in Japanese-English UGC translation, with experiments showing major performance drops on certain phenomena.

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
cs.CL 2019-07 unverdicted novelty 5.0

A single multilingual NMT model for 103 languages trained on 25B examples demonstrates transfer learning benefits for low-resource languages.