Two-Point Deterministic Equivalence for Stochastic Gradient Dynamics in Linear Models
1 Pith paper cites this work.
abstract
We derive a novel deterministic equivalence for the two-point function of a random matrix resolvent. Using this result, we give a unified derivation of the performance of a wide variety of high-dimensional linear models trained with stochastic gradient descent. This includes high-dimensional linear regression, kernel regression, and linear random feature models. Our results include previously known asymptotics as well as novel ones.
fields: stat.ML (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
An Asymptotic Theory of Chain-of-Thought in In-Context Learning
Exact RMT-derived formula for CoT generalization error in linear ICL reveals phase transition between exponential/polynomial improvement, saturation, and overthinking regimes depending on depth, pretraining, and context length.