Parallelizing linear recurrent neural nets over sequence length
3 Pith papers cite this work.
fields: cs.LG (3)
citing papers explorer
- On the Importance of Multistability for Horizon Generalization in Reinforcement Learning
  Multistability is necessary for temporal horizon generalization in POMDPs, sufficient in simple tasks along with transient dynamics in complex ones, while monostable parallelizable RNNs like SSMs and gated linear RNNs fail by construction.
- Parallel-in-Time Training of Recurrent Neural Networks for Dynamical Systems Reconstruction
  GTF-DEER augments the DEER framework with Generalized Teacher Forcing to allow effective parallel training of nonlinear recurrent models on extremely long sequences, improving dynamical systems reconstruction for data with long time scales.
- Simplified State Space Layers for Sequence Modeling
  S5 uses a single MIMO state space model with S4-derived initialization to match S4 efficiency and reach 87.4% average accuracy on the Long Range Arena benchmark.