Configuration-to-performance scaling law with neural ansatz

Zhang, H · 2026 · arXiv 2602.10300

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

On the Nonlinearity of Learning Rate Scaling for LLM Training

cs.LG · 2026-06-28 · unverdicted · novelty 6.0

Optimal learning rate for models from 22M to 707M parameters shows nonlinear upward curvature with scale that disappears under effective learning rate and data-scale extrapolation.

citing papers explorer

Showing 1 of 1 citing paper.

On the Nonlinearity of Learning Rate Scaling for LLM Training · cs.LG · 2026-06-28 · unverdicted · none · ref 47
