pith. machine review for the scientific record.

arxiv: 1705.07393 · v2 · submitted 2017-05-21 · 💻 cs.CL


Recurrent Additive Networks

classification 💻 cs.CL
keywords: additive networks, state, gated, gates, input, LSTMs, problems
read the original abstract

We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates. At every time step, the new state is computed as a gated component-wise sum of the input and the previous state, without any of the non-linearities commonly used in RNN transition dynamics. We formally show that RAN states are weighted sums of the input vectors, and that the gates only contribute to computing the weights of these sums. Despite this relatively simple functional form, experiments demonstrate that RANs perform on par with LSTMs on benchmark language modeling problems. This result shows that many of the non-linear computations in LSTMs and related networks are not essential, at least for the problems we consider, and suggests that the gates are doing more of the computational work than previously understood.
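The update rule described in the abstract — a gated, component-wise sum of a projected input and the previous state, with no nonlinearity in the transition itself — can be sketched as follows. This is a minimal illustration assuming the paper's general gated form; the weight names and the choice of `tanh` as the output function are illustrative assumptions, not the authors' exact parameterization.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ran_step(x, c_prev, h_prev, params, output_fn=np.tanh):
    """One recurrent additive network (RAN) step.

    The cell state c is updated as a purely additive, gated
    combination of a linear projection of the input and the
    previous state; no nonlinearity enters the transition.
    """
    W_cx, W_ix, W_ih, b_i, W_fx, W_fh, b_f = params
    x_tilde = W_cx @ x                            # content: linear projection of input
    i = sigmoid(W_ix @ x + W_ih @ h_prev + b_i)   # input gate
    f = sigmoid(W_fx @ x + W_fh @ h_prev + b_f)   # forget gate
    c = i * x_tilde + f * c_prev                  # additive state update
    h = output_fn(c)                              # output nonlinearity (could be identity)
    return c, h
```

Because the transition is additive, unrolling the recurrence makes each state an element-wise weighted sum of the projected inputs — e.g. with zero initial state, `c2 = i2 * x̃2 + f2 * i1 * x̃1` — which is the abstract's claim that the gates only compute the weights of these sums.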

This paper has not been read by Pith yet.

discussion (0)
