Title resolution pending

On the Role of Batch Size in Stochastic Conditional Gradient Methods · arXiv 2603.21191

1 Pith paper cites this work. Polarity classification is still indexing.


Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

How to Allocate Your Tokens? Scaling Laws with Training Steps and Batch Size

cs.LG · 2026-07-01 · unverdicted · novelty 5.0

Proposes a three-term scaling law for model size, training steps and batch size that recovers optimal batch size scaling and can be fitted using fewer runs by incorporating suboptimal batch sizes.

citing papers explorer

Showing 1 of 1 citing paper.

How to Allocate Your Tokens? Scaling Laws with Training Steps and Batch Size

cs.LG · 2026-07-01 · unverdicted · none · ref 11
