Character-based Neural Machine Translation
We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words. Since word-level information provides a crucial source of bias, our input model composes representations of character sequences into representations of words (as determined by whitespace boundaries), and then these are translated using a joint attention/translation model. In the target language, the translation is modeled as a sequence of word vectors, but each word is generated one character at a time, conditional on the previous character generations in each word. As the representation and generation of words are performed at the character level, our model is capable of interpreting and generating unseen word forms. A secondary benefit of this approach is that it alleviates many of the challenges associated with preprocessing/tokenization of the source and target languages. We show that our model can achieve translation results that are on par with conventional word-based models.
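The input side described in the abstract (split on whitespace, then build each word's representation from its characters) can be illustrated with a small sketch. This is not the paper's model: the abstract does not specify the composition function, so the code below substitutes plain mean pooling over randomly initialized character vectors, and the names `char_vectors`, `compose_word`, and `encode_sentence` are hypothetical. It is meant only to show why any word built from known characters, including an unseen word form, still receives a representation.

```python
import random


def char_vectors(alphabet, dim, seed=0):
    """Assign each character a random vector (a stand-in for learned embeddings)."""
    rng = random.Random(seed)
    return {ch: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for ch in alphabet}


def compose_word(word, table, dim):
    """Compose a word vector by mean-pooling its character vectors.

    Assumption: mean pooling replaces the learned composition model,
    which the abstract does not describe. Unknown characters map to zeros.
    """
    total = [0.0] * dim
    for ch in word:
        vec = table.get(ch, [0.0] * dim)
        total = [t + v for t, v in zip(total, vec)]
    return [t / len(word) for t in total]


def encode_sentence(sentence, table, dim):
    """Split on whitespace and build one vector per word from its characters."""
    return [compose_word(word, table, dim) for word in sentence.split()]


table = char_vectors("abcdefghijklmnopqrstuvwxyz", dim=4)
vectors = encode_sentence("the cat sat", table, dim=4)
unseen = compose_word("tacs", table, dim=4)  # a word form never seen as a unit
```

Mean pooling ignores character order, so "cat" and "act" would receive the same vector here; an order-sensitive learned composition would not have that limitation.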
Forward citations
Cited by 3 Pith papers
- Language Models as Knowledge Bases?
  BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.
- PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents
  PheMT is a phenomenon-wise dataset created to evaluate NMT robustness against linguistic phenomena in Japanese-English UGC translation, with experiments showing major performance drops on certain phenomena.
- Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
  A single multilingual NMT model for 103 languages trained on 25B examples demonstrates transfer learning benefits for low-resource languages.