In: Advances in Neural Information Processing Systems (2020)

Tanaka, H · 2020 · arXiv 2006.05467

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Not How Many, But Which: Parameter Placement in Low-Rank Adaptation

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

Gradient-informed placement of LoRA parameters recovers full performance under GRPO while random placement does not, due to differences in gradient rank and stability across training regimes.

XTinyU-Net: Training-Free U-Net Scaling via Initialization-Time Sensitivity

eess.IV · 2026-05-10 · unverdicted · novelty 6.0 · 2 refs

A Jacobian sensitivity curve computed at initialization identifies the narrowest U-Net configuration that avoids performance collapse, matching nnU-Net accuracy with 400-1600x fewer parameters on six medical datasets.

Man, Machine, and Mathematics

math.OC · 2026-04-29 · unverdicted · novelty 5.0

A high-level outline is given for a unified theory that reduces learning to a small set of ideas from dynamical systems, geometry, and physics via definitions of solvable problems and parametrized methods.

Heterogeneous Connectivity in Sparse Networks: Fan-in Profiles, Gradient Hierarchy, and Topological Equilibria

cs.LG · 2026-04-12 · unverdicted · novelty 5.0

Arbitrary heterogeneous fan-in profiles in sparse networks match uniform random accuracy at high sparsity, but initializing RigL dynamic sparse training with equilibrium-matched lognormal profiles improves performance by up to 0.49% on classification tasks.

citing papers explorer

Showing 4 of 4 citing papers.

Not How Many, But Which: Parameter Placement in Low-Rank Adaptation cs.LG · 2026-05-12 · unverdicted · none · ref 80
XTinyU-Net: Training-Free U-Net Scaling via Initialization-Time Sensitivity eess.IV · 2026-05-10 · unverdicted · none · ref 19 · 2 links
Man, Machine, and Mathematics math.OC · 2026-04-29 · unverdicted · none · ref 93
Heterogeneous Connectivity in Sparse Networks: Fan-in Profiles, Gradient Hierarchy, and Topological Equilibria cs.LG · 2026-04-12 · unverdicted · none · ref 24
