pith. machine review for the scientific record.

arxiv: 1705.07393 · v2 · submitted 2017-05-21 · 💻 cs.CL


Recurrent Additive Networks

classification 💻 cs.CL
keywords: additive networks, state, gated, gates, input, LSTMs, problems
read the original abstract

We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates. At every time step, the new state is computed as a gated component-wise sum of the input and the previous state, without any of the non-linearities commonly used in RNN transition dynamics. We formally show that RAN states are weighted sums of the input vectors, and that the gates only contribute to computing the weights of these sums. Despite this relatively simple functional form, experiments demonstrate that RANs perform on par with LSTMs on benchmark language modeling problems. This result shows that many of the non-linear computations in LSTMs and related networks are not essential, at least for the problems we consider, and suggests that the gates are doing more of the computational work than previously understood.
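The update rule described in the abstract — a gated, component-wise sum of a projected input and the previous state, with no nonlinearity in the transition itself — can be sketched as follows. This is a minimal illustration assuming the paper's general gated form; the weight names and the choice of `tanh` as the output function are illustrative assumptions, not the authors' exact parameterization.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ran_step(x, c_prev, h_prev, params, output_fn=np.tanh):
    """One recurrent additive network (RAN) step.

    The cell state c is updated as a purely additive, gated
    combination of a linear projection of the input and the
    previous state; no nonlinearity enters the transition.
    """
    W_cx, W_ix, W_ih, b_i, W_fx, W_fh, b_f = params
    x_tilde = W_cx @ x                            # content: linear projection of input
    i = sigmoid(W_ix @ x + W_ih @ h_prev + b_i)   # input gate
    f = sigmoid(W_fx @ x + W_fh @ h_prev + b_f)   # forget gate
    c = i * x_tilde + f * c_prev                  # additive state update
    h = output_fn(c)                              # output nonlinearity (could be identity)
    return c, h
```

Because the transition is additive, unrolling the recurrence makes each state an element-wise weighted sum of the projected inputs — e.g. with zero initial state, `c2 = i2 * x̃2 + f2 * i1 * x̃1` — which is the abstract's claim that the gates only compute the weights of these sums.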

This paper has not been read by Pith yet.

discussion (0)
