Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: 2 unverdicted
citing papers explorer
- Crosscoding Through Time: Tracking Emergence & Consolidation Of Linguistic Representations Throughout LLM Pretraining
  Sparse crosscoders on LLM checkpoint triplets track emergence, maintenance, and discontinuation of linguistic features during pretraining via a new RelIE metric.
- Emergent Capabilities Arise Randomly from Learning Sparse Attention Patterns
  Emergent capabilities arise stochastically from abrupt learning of sparse attention patterns on synthetic linear map and cellular automata tasks, with larger models learning them earlier on average.