Planning in a recurrent neural network that plays Sokoban
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers explorer
- The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models
  LLM residual streams during addition form an Iso-Raw-Sum Trajectory anchored by digit semantics and modulated by continuous carry signals, with errors arising as geometric slippages across quantization thresholds in a noisy model.
- Structure and Scale in Simplicial Sequence Modelling
  Small transformers on HMM prediction tasks exhibit correlated scaling between performance and linear encoding of belief distributions in residual activations.