Pretraining data determines loss-to-loss scaling laws in LLMs, while model size, optimization, tokenizer, and architecture have limited impact.
Measuring robustness to natural distribution shifts in image classification, 2020
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
LLMs on the Line: Data Determines Loss-to-Loss Scaling Laws
Pretraining data determines loss-to-loss scaling laws in LLMs, while model size, optimization, tokenizer, and architecture have limited impact.