Parallelizing linear recurrent neural nets over sequence length

Eric Martin, Chris Cundy · 2018

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

On the Importance of Multistability for Horizon Generalization in Reinforcement Learning

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

Multistability is necessary for temporal horizon generalization in POMDPs, sufficient in simple tasks along with transient dynamics in complex ones, while monostable parallelizable RNNs like SSMs and gated linear RNNs fail by construction.

Parallel-in-Time Training of Recurrent Neural Networks for Dynamical Systems Reconstruction

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

GTF-DEER augments the DEER framework with Generalized Teacher Forcing to allow effective parallel training of nonlinear recurrent models on extremely long sequences, improving dynamical systems reconstruction for data with long time scales.

Simplified State Space Layers for Sequence Modeling

cs.LG · 2022-08-09 · accept · novelty 6.0

S5 uses a single MIMO state space model with S4-derived initialization to match S4 efficiency and reach 87.4% average accuracy on the Long Range Arena benchmark.

citing papers explorer

On the Importance of Multistability for Horizon Generalization in Reinforcement Learning

cs.LG · 2026-05-12 · unverdicted · none · ref 7

Parallel-in-Time Training of Recurrent Neural Networks for Dynamical Systems Reconstruction

cs.LG · 2026-05-12 · unverdicted · none · ref 49

Simplified State Space Layers for Sequence Modeling

cs.LG · 2022-08-09 · accept · none · ref 130
