Title resolution pending

Configuration-to-Performance Scaling Law with Neural Ansatz · 2026 · arXiv 2602.10300

2 Pith papers cite this work. Polarity classification is still indexing.


Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

On the Nonlinearity of Learning Rate Scaling for LLM Training

cs.LG · 2026-06-28 · unverdicted · novelty 6.0

The optimal learning rate for models from 22M to 707M parameters shows a nonlinear upward curvature with scale, which disappears under effective learning rate and data-scale extrapolation.

How to Allocate Your Tokens? Scaling Laws with Training Steps and Batch Size

cs.LG · 2026-07-01 · unverdicted · novelty 5.0

Proposes a three-term scaling law for model size, training steps and batch size that recovers optimal batch size scaling and can be fitted using fewer runs by incorporating suboptimal batch sizes.

citing papers explorer

Showing 2 of 2 citing papers after filters.

On the Nonlinearity of Learning Rate Scaling for LLM Training
cs.LG · 2026-06-28 · unverdicted · none · ref 47

How to Allocate Your Tokens? Scaling Laws with Training Steps and Batch Size
cs.LG · 2026-07-01 · unverdicted · none · ref 27
