Attention Is All You Need

1 Pith paper cites this work. Polarity classification is still being indexed.

fields

cs.CL 1

years

2019 1

verdicts

UNVERDICTED 1

representative citing papers

Sharing Attention Weights for Fast Transformer

cs.CL · 2019-06-26 · unverdicted · novelty 4.0

Sharing attention weights in adjacent Transformer layers yields 1.3X inference speedup with negligible BLEU loss on ten WMT and NIST tasks.

citing papers explorer

Showing 1 of 1 citing paper.

  • Sharing Attention Weights for Fast Transformer cs.CL · 2019-06-26 · unverdicted · none · ref 18
