Data mixing induces phase transitions in LLM knowledge acquisition from dense sources, with critical thresholds in model size and mixing ratio that follow power laws.
In other words, the conditional distribution of the next token y given the context x remains consistent across both domains and is unaffected by variations in values of θ₁ and θ₂.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition