Persistent homology detects a sharp increase in maximum and total H1 persistence during grokking on modular arithmetic, offering a topological diagnostic that links representation geometry to generalization.
arXiv preprint arXiv:2408.08944 , year =
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4verdicts
UNVERDICTED 4representative citing papers
Epistemic uncertainty collapses sharply at grokking in in-context learning transformers, serving as a diagnostic of delayed generalization and linked via a Bayesian linear model to a shared spectral mechanism.
Grokking emerges near the model size where memorization timescale T_mem(P) intersects generalization timescale T_gen(P) on modular arithmetic.
Proposes a two-gradient-field model with candidate order parameters alpha_dagger and kappa_c to unify phase transitions across learning theory and non-equilibrium chemistry.
citing papers explorer
-
Topological Signatures of Grokking
Persistent homology detects a sharp increase in maximum and total H1 persistence during grokking on modular arithmetic, offering a topological diagnostic that links representation geometry to generalization.
-
A Bayesian Perspective on the Role of Epistemic Uncertainty for Delayed Generalization in In-Context Learning
Epistemic uncertainty collapses sharply at grokking in in-context learning transformers, serving as a diagnostic of delayed generalization and linked via a Bayesian linear model to a shared spectral mechanism.
-
Model Capacity Determines Grokking through Competing Memorisation and Generalisation Speeds
Grokking emerges near the model size where memorization timescale T_mem(P) intersects generalization timescale T_gen(P) on modular arithmetic.
-
Phase Transitions in Driven Informational Systems: A Two-Field Perspective on Learning Theory and Non-Equilibrium Chemistry
Proposes a two-gradient-field model with candidate order parameters alpha_dagger and kappa_c to unify phase transitions across learning theory and non-equilibrium chemistry.