arxiv: 1810.13409 · v1 · pith:TTSJPPPNnew · submitted 2018-10-31 · 💻 cs.CL

You May Not Need Attention

classification 💻 cs.CL
keywords attention · model · decoding · separate · translation · without · answer · attention-based

In NMT, how far can we get without attention and without separate encoding and decoding? To answer that question, we introduce a recurrent neural translation model that does not use attention and does not have a separate encoder and decoder. Our eager translation model is low-latency, writing target tokens as soon as it reads the first source token, and uses constant memory during decoding. It performs on par with the standard attention-based model of Bahdanau et al. (2014), and better on long sentences.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation

    cs.CL · 2019-06 · unverdicted · novelty 6.0

    Reinforce-NAT and FS-decoder retrieve target sequential information for non-autoregressive translation, yielding higher BLEU than baseline NAT while preserving fast decoding and approaching autoregressive quality.