Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79(8):2554–2558
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.LG (4)
years: 2026 (4)
roles: background (2)
polarities: background (2)
representative citing papers
Discrete decentralized learning dynamics on manifolds converge uniformly to an overdamped Langevin SDE whose stationary states produce orthogonally disentangled, linearly separable features.
Cubit replaces Transformer's attention with a closed-form Kernel Ridge Regression token mixer and reports larger gains as training sequence length increases.
citing papers explorer
- Learning reveals invisible structure in low-rank RNNs
  Learning in low-rank RNNs reduces to an exact low-dimensional ODE system in overlap space, where loss-invisible overlaps encode training history without affecting function.
- Continuous Limits of Coupled Flows in Representation Learning
  Discrete decentralized learning dynamics on manifolds converge uniformly to an overdamped Langevin SDE whose stationary states produce orthogonally disentangled, linearly separable features.
- Cubit: Token Mixer with Kernel Ridge Regression
  Cubit replaces Transformer's attention with a closed-form Kernel Ridge Regression token mixer and reports larger gains as training sequence length increases.
- The General Theory of Localization Methods