arxiv: 1907.08158 · v1 · pith:F75DNJYD · submitted 2019-07-18 · 💻 cs.CL

Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models

classification 💻 cs.CL
keywords encoder-free, models, embeddings, neural, attention, machine, source, translation

In this paper, we try to understand neural machine translation (NMT) by simplifying NMT architectures and training encoder-free NMT models. In an encoder-free model, the sums of word embeddings and positional embeddings represent the source. The decoder is a standard Transformer or recurrent neural network that attends directly to these embeddings via attention mechanisms. Experimental results show (1) that the attention mechanism in encoder-free models acts as a strong feature extractor, (2) that the word embeddings in encoder-free models are competitive with those in conventional models, (3) that non-contextualized source representations lead to a big performance drop, and (4) that encoder-free models have different effects on alignment quality for German-English and Chinese-English.
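
The source representation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It rests on several assumptions that the abstract does not confirm: sinusoidal positional embeddings, a single attention head with no learned projections, and a randomly initialised embedding table. The names `sinusoidal_positions`, `encoder_free_source`, and `attend` are hypothetical.

```python
import numpy as np

def sinusoidal_positions(length, dim):
    """Fixed sinusoidal positional embeddings, shape (length, dim).

    Assumption: the abstract does not say which positional scheme is used.
    `dim` must be even.
    """
    pos = np.arange(length)[:, None]          # (length, 1)
    idx = np.arange(0, dim, 2)[None, :]       # (1, dim / 2)
    angles = pos / np.power(10000.0, idx / dim)
    pe = np.zeros((length, dim))
    pe[:, 0::2] = np.sin(angles)              # even dimensions
    pe[:, 1::2] = np.cos(angles)              # odd dimensions
    return pe

def encoder_free_source(token_ids, embedding_table):
    """Source representation: word embedding + positional embedding.

    No encoder layers are applied, so each position is non-contextualized.
    """
    word = embedding_table[token_ids]         # (length, dim)
    return word + sinusoidal_positions(len(token_ids), embedding_table.shape[1])

def attend(query, source):
    """Scaled dot-product attention of one decoder query over the source.

    Simplification: single head, no learned query/key/value projections.
    """
    scores = source @ query / np.sqrt(source.shape[1])
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights @ source, weights

rng = np.random.default_rng(0)
table = rng.normal(size=(100, 8))             # hypothetical vocabulary of 100, dim 8
src = encoder_free_source(np.array([5, 17, 42]), table)
context, weights = attend(rng.normal(size=8), src)
```

Here `src` has one row per source token, and `context` is the attention-weighted sum of those rows that a decoder step would consume. In a conventional model, the rows of `src` would first pass through encoder layers before the decoder attends to them.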
