Gradient-informed placement of LoRA parameters recovers full performance under GRPO while random placement does not, due to differences in gradient rank and stability across training regimes.
In: Advances in Neural Information Proces s- ing Systems (2020)
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4roles
background 1polarities
background 1representative citing papers
A Jacobian sensitivity curve computed at initialization identifies the narrowest U-Net configuration that avoids performance collapse, matching nnU-Net accuracy with 400-1600x fewer parameters on six medical datasets.
A high-level outline is given for a unified theory that reduces learning to a small set of ideas from dynamical systems, geometry, and physics via definitions of solvable problems and parametrized methods.
Arbitrary heterogeneous fan-in profiles in sparse networks match uniform random accuracy at high sparsity, but initializing RigL dynamic sparse training with equilibrium-matched lognormal profiles improves performance by up to 0.49% on classification tasks.
citing papers explorer
-
Not How Many, But Which: Parameter Placement in Low-Rank Adaptation
Gradient-informed placement of LoRA parameters recovers full performance under GRPO while random placement does not, due to differences in gradient rank and stability across training regimes.
-
XTinyU-Net: Training-Free U-Net Scaling via Initialization-Time Sensitivity
A Jacobian sensitivity curve computed at initialization identifies the narrowest U-Net configuration that avoids performance collapse, matching nnU-Net accuracy with 400-1600x fewer parameters on six medical datasets.
-
Man, Machine, and Mathematics
A high-level outline is given for a unified theory that reduces learning to a small set of ideas from dynamical systems, geometry, and physics via definitions of solvable problems and parametrized methods.
-
Heterogeneous Connectivity in Sparse Networks: Fan-in Profiles, Gradient Hierarchy, and Topological Equilibria
Arbitrary heterogeneous fan-in profiles in sparse networks match uniform random accuracy at high sparsity, but initializing RigL dynamic sparse training with equilibrium-matched lognormal profiles improves performance by up to 0.49% on classification tasks.