Grokked transformers are implicit reasoners: A mechanistic journey to the edge of generalization

URL: https://arxiv · 2024 · arXiv 2405.15071

6 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Tracing Persona Vectors Through LLM Pretraining

cs.CL · 2026-05-13 · unverdicted · novelty 8.0

Persona vectors form within the first 0.22% of LLM pretraining and remain effective for steering post-trained models, with continued refinement and transfer to other models.

The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior

cs.LG · 2026-03-30 · unverdicted · novelty 7.0

The grokking delay in encoder-decoder models on one-step Collatz prediction stems from the decoder's inability to use the encoder's early-learned representations of parity and residue structure; numeral base acts as a strong inductive bias that can raise accuracy from failure to 99.8%.

Detecting overfitting in Neural Networks during long-horizon grokking using Random Matrix Theory

cs.LG · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

Random Matrix Theory detects overfitting via growing Correlation Traps in weight spectra during the anti-grokking phase of neural network training.

The Power of Power Law: Asymmetry Enables Compositional Reasoning

cs.AI · 2026-04-24 · unverdicted · novelty 6.0

Power-law data sampling creates beneficial asymmetry in the loss landscape that lets models acquire high-frequency skill compositions first, enabling more efficient learning of rare long-tail skills than uniform distributions.

How Do Language Models Compose Functions?

cs.CL · 2025-10-02 · conditional · novelty 6.0

LLMs solve compositional factual recall either by computing intermediates or directly, with mechanism choice correlated to translation geometry in embedding spaces.

Model Capacity Determines Grokking through Competing Memorisation and Generalisation Speeds

cs.LG · 2026-05-10 · unverdicted · novelty 5.0

Grokking emerges near the model size where memorization timescale T_mem(P) intersects generalization timescale T_gen(P) on modular arithmetic.

citing papers explorer

Showing 6 of 6 citing papers.

Tracing Persona Vectors Through LLM Pretraining · cs.CL · 2026-05-13 · unverdicted · none · ref 12
The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior · cs.LG · 2026-03-30 · unverdicted · none · ref 29
Detecting overfitting in Neural Networks during long-horizon grokking using Random Matrix Theory · cs.LG · 2026-05-12 · unverdicted · none · ref 24 · 2 links
The Power of Power Law: Asymmetry Enables Compositional Reasoning · cs.AI · 2026-04-24 · unverdicted · none · ref 51
How Do Language Models Compose Functions? · cs.CL · 2025-10-02 · conditional · none · ref 42
Model Capacity Determines Grokking through Competing Memorisation and Generalisation Speeds · cs.LG · 2026-05-10 · unverdicted · none · ref 15
