Derives an exact Frobenius-norm imbalance identity for deep nonlinear networks, classifies activations into four classes, and obtains a critical-depth escape-time law τ★ = Θ(ε^{-(r-2)}) by reduction to a scalar ODE on a permutation-symmetric submanifold.
arXiv preprint arXiv:2012.04728
6 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (6)
verdicts: UNVERDICTED (6)
roles: background (1)
polarities: background (1)
representative citing papers
Data symmetries generically do not induce conserved quantities in NN training for analytic non-polynomial losses, but can for MSE with tensorizable networks.
Dead directions recover Watanabe's RLCT contribution and triple (λ, m, ν) from directional Fisher curvature decay rates in original parameter space for singular models, extended via K-FAC to networks and gauge-equivariant optimizers.
Learning in low-rank RNNs reduces to an exact low-dimensional ODE system in overlap space, where loss-invisible overlaps encode training history without affecting function.
Derives second-order path-kernel interpolation formulas for gradient descent, SGD, and momentum training, adding curvature terms and a concentration estimate around the expected prediction.
SOLAR introduces a self-optimizing agent using meta-learning on model weights and RL-driven strategy discovery for lifelong adaptation in LLMs, claiming superior performance on reasoning tasks across domains.