
arxiv: 2605.04899 · v2 · submitted 2026-05-06 · 💻 cs.LG

A geometric relation of the error introduced by sampling a language model's output distribution to its internal state

Pith reviewed 2026-05-12 01:58 UTC · model grok-4.3

classification 💻 cs.LG
keywords language models · token embeddings · differential forms · model interpretability · chess reasoning · world models · geometric analysis

The pith

A purely geometric 1-form from token embeddings has curvature that tracks a language model's internal world model on chess tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives an so(n)-valued 1-form that depends only on the geometry of token embeddings in GPT-style language models. This object captures the model's sensitivity to single-token changes at points where the output distribution spreads across multiple tokens. Despite its geometric origin, the curvature of the 1-form proves semantically meaningful: on chess reasoning tasks it couples to the model's world model, with transformations clustering by board region and respecting piece importance. The result indicates that token space geometry directly encodes how models internally represent problems.

Core claim

We derive an so(n)-valued 1-form that depends only on the geometry of the token embeddings. Despite this purely geometric origin, we show that its curvature is semantically meaningful: On chess reasoning tasks, the curvature couples to the world model of an off-the-shelf instruction-tuned model, with transformations clustering by board region and respecting piece importance. Our findings suggest that token space geometry directly reflects how models internally represent problems.

What carries the argument

The so(n)-valued 1-form built from token embedding geometry, whose curvature extracts semantic structure from the model's reasoning states.
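
To make "so(n)-valued" concrete: elements of so(n) are skew-symmetric n×n matrices, and exponentiating one yields a rotation in SO(n). The sketch below builds the generator of the planar rotation that carries one unit vector onto another. It is a generic illustration and not the paper's 1-form; the function name `rotation_onto` and the random test vectors are assumptions introduced here.

```python
import numpy as np

def rotation_onto(a, b, eps=1e-12):
    """Return (A, R): a skew-symmetric generator A in so(n) and the
    rotation R = exp(A) in SO(n) that maps a/|a| onto b/|b|."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    n = a.shape[0]
    cos_theta = float(np.clip(a @ b, -1.0, 1.0))
    u = b - cos_theta * a              # component of b orthogonal to a
    norm_u = np.linalg.norm(u)
    if norm_u < eps:
        if cos_theta > 0:              # a and b already coincide
            return np.zeros((n, n)), np.eye(n)
        raise ValueError("antiparallel inputs: rotation plane is not unique")
    u = u / norm_u
    K = np.outer(u, a) - np.outer(a, u)   # unit generator in the (a, u) plane
    theta = np.arccos(cos_theta)
    # Closed form of exp(theta * K) for a single-plane rotation.
    R = np.eye(n) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)
    return theta * K, R

rng = np.random.default_rng(0)
a, b = rng.standard_normal(5), rng.standard_normal(5)
A, R = rotation_onto(a, b)
```

The closed form avoids a general matrix exponential because the rotation acts in a single two-dimensional plane and leaves its orthogonal complement fixed.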

If this is right

  • Sampling sensitivity during generation can be analyzed directly through embedding geometry without access to model weights.
  • Curvature provides a probe for internal world models on structured reasoning tasks.
  • The method applies to off-the-shelf instruction-tuned models without additional training or task-specific fine-tuning.
  • Token geometry encodes domain structure such as piece importance in a way that is visible in curvature patterns.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same geometric construction could be tested on other strategic domains such as math word problems or planning tasks to check whether curvature consistently reveals internal representations.
  • If the 1-form generalizes, curvature might serve as a lightweight diagnostic for whether a model has built an accurate internal model of a problem.
  • The approach raises the possibility of using embedding geometry alone to compare how different models represent the same task without running behavioral probes.

Load-bearing premise

The observed clustering of curvature transformations by board region and piece importance arises because the 1-form reflects the model's internal world model, not because of task-specific artifacts, data selection, or coincidental patterns.

What would settle it

Running the curvature analysis on chess tasks and finding that transformations do not cluster by board region or respect piece importance across multiple models and prompt variations would falsify the claim of semantic coupling.

Figures

Figures reproduced from arXiv: 2605.04899 by Albert F. Modenbach.

Figure 1. Impact of single-token perturbation at position 4 on reasoning and final move recommendation. Shows the completion before branching and then the difference between the continuations. In the difference, black, red, and green colours indicate coinciding, deleted, and new text in the perturbed completions.
Figure 2. Parallelepiped spanned by $z_t$, $p_1 v_1$, and $p_2 v_2$ in three cases (left to right): vectors are not aligned and probability mass is evenly distributed; probability mass is evenly distributed but vectors are aligned; vectors are not aligned but $p_2 \ll p_1$.
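
The three cases in the Figure 2 caption can be reproduced numerically. The sketch below computes the volume of the parallelepiped spanned by z_t, p1·v1, and p2·v2 from the Gram determinant. Reading the volume as the quantity that separates the three cases is an interpretation made here, and the toy vectors and probabilities are invented for illustration.

```python
import numpy as np

def parallelepiped_volume(vectors):
    """k-dimensional volume spanned by the rows of `vectors`,
    computed as sqrt(det(G)) with G the Gram matrix."""
    V = np.asarray(vectors, dtype=float)
    gram = V @ V.T
    # Clamp tiny negative determinants caused by round-off.
    return float(np.sqrt(max(np.linalg.det(gram), 0.0)))

# Hypothetical unit vectors standing in for z_t and two token embeddings.
z_t = np.array([1.0, 0.0, 0.0])
v1 = np.array([0.0, 1.0, 0.0])
v2 = np.array([0.0, 0.0, 1.0])

# Case 1: embeddings not aligned, probability mass split evenly.
even_unaligned = parallelepiped_volume([z_t, 0.5 * v1, 0.5 * v2])
# Case 2: mass split evenly, but both embeddings point the same way.
even_aligned = parallelepiped_volume([z_t, 0.5 * v1, 0.5 * v1])
# Case 3: embeddings not aligned, but p2 is much smaller than p1.
skewed = parallelepiped_volume([z_t, 0.99 * v1, 0.01 * v2])
```

In this toy setup only the first case has a substantial volume (0.25); the aligned case collapses to zero and the skewed case to roughly 0.0099.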
Figure 3. (a) The manifold of the last hidden activations produced by the F process. Each activation lies on the surface of an (ambient) sphere, and is connected via an instantaneous tunnel that it takes no time to travel through. (b) Two holonomy operators at two different points on the last hidden activation sphere.
Figure 5. (a) Log space density and log-log survival plot of bulk and active distributions coupling strengths, showing they are inseparable. (b) Distribution of coupling strengths to random world vector from the bulk (gray), maximum probe from the active set (green), and maximum probe from the bulk set (red). (c) Two dimensional PCA scatter of maximal probe couplings.
Figure 7. a) Scattering of coupling strengths as three dimensional PCA. Shades of blue and green indicate points belonging to left and right ear identified in two dimensional PCA. Red and yellow lines indicate greedy and branch points that form similar but separate lines. b) Coupling of ∆y to probes (by piece type) vs coupling of ∆q to same probes. Shows that the blurring geometry still twists the world model betwee…
Figure 8. (a) Average probe coupling to q vectors for the greedy and branch continuation plotted against each other. We note the systematically stronger coupling in the branch continuation. (b) Average coupling strength of ∆q vector to probes by piece type and opponent / own piece classification.
Figure 9. Collection of result graphs like those in Figures 5, 6, 8, and 7 for the Mistral model.
Figure 10. PCA projections of q = Hy − y for the true holonomy operator, versus a scale-matched random SO(n) matrix, and a holonomy operator rotating $z_t$ onto $v_1$ and one onto $v_2$. The structured V-pattern and higher variance concentration (52.9% vs 34.5% and 31.4%) in the true holonomy confirms that the semantic clustering arises from the specific geometry of token-space blurring, not arbitrary rotations.
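
The random-rotation baseline described in the Figure 10 caption can be sketched as follows: draw a random SO(n) matrix H, form q = Hy − y for a batch of y vectors, and measure how much variance the leading principal components capture. This is a minimal sketch under stated assumptions: the y vectors are Gaussian stand-ins rather than model activations, the "scale-matched" step is not implemented because the source does not specify it, and the choice of k = 2 components is a guess.

```python
import numpy as np

def random_so_n(n, rng):
    """Draw a random rotation matrix via QR of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    Q = Q * np.sign(np.diag(R))      # fix column signs of the QR factor
    if np.linalg.det(Q) < 0:         # force determinant +1
        Q[:, 0] = -Q[:, 0]
    return Q

def top_k_variance_ratio(X, k=2):
    """Fraction of total variance captured by the first k principal components."""
    Xc = X - X.mean(axis=0)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    variances = singular_values ** 2
    return float(variances[:k].sum() / variances.sum())

rng = np.random.default_rng(0)
n = 16
Y = rng.standard_normal((200, n))    # hypothetical stand-in for y vectors
H = random_so_n(n, rng)
Q_vecs = Y @ H.T - Y                 # each row is q = H y - y
ratio = top_k_variance_ratio(Q_vecs, k=2)
```

A comparison in the spirit of the caption would compute the same ratio for the true holonomy operator and check whether it is noticeably higher than for random draws of H.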
Original abstract

GPT-style language models are sensitive to single-token changes at generation points where the predicted probability distribution is spread across multiple tokens. Viewing this sensitivity as a geometric property, we derive an $\mathfrak{so}(n)$-valued 1-form that depends only on the geometry of the token embeddings. Despite this purely geometric origin, we show that its curvature is semantically meaningful: On chess reasoning tasks, the curvature couples to the world model of an off-the-shelf instruction-tuned model, with transformations clustering by board region and respecting piece importance. Our findings suggest that token space geometry directly reflects how models internally represent problems.
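
As a rough illustration of what "spread across multiple tokens" could mean operationally, the sketch below flags generation positions whose top-token probability falls below a threshold. The function names, the 0.6 threshold, and the toy logits are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def spread_positions(logits, max_top1=0.6):
    """Indices of positions where no single token holds more than
    `max_top1` of the probability mass (hypothetical criterion)."""
    probs = softmax(np.asarray(logits, dtype=float))
    top1 = probs.max(axis=-1)
    return np.flatnonzero(top1 < max_top1)

# Toy logits for three generation steps over a three-token vocabulary.
logits = np.array([
    [8.0, 0.0, 0.0],   # confident step
    [2.0, 1.9, 0.0],   # mass split between two tokens
    [0.0, 5.0, 0.0],   # confident step
])
candidates = spread_positions(logits)
```

Here only the middle step is flagged, since its two leading tokens carry nearly equal probability. An entropy threshold would be an equally plausible criterion.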

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper derives an so(n)-valued 1-form on the token embedding space whose value at each token depends only on the geometry (inner products) of the embeddings; this 1-form is claimed to quantify the error introduced by sampling from the model's output distribution. The authors then compute its curvature along generation trajectories on chess prompts and report that the resulting transformations cluster by board region and respect piece importance, interpreting this as evidence that the embedding geometry directly reflects the model's internal world model.

Significance. If the empirical interpretation holds after controls, the work would be significant for LLM interpretability: it supplies a parameter-free geometric object (the so(n) 1-form) whose curvature can be computed from embeddings alone yet appears to track semantic structure. The purely geometric derivation is a clear strength, as it avoids any learned parameters or task-specific fitting and therefore admits direct falsification on other domains.

major comments (1)
  1. [chess reasoning experiments] In the chess reasoning experiments (the section presenting the curvature clustering results), the headline claim that the observed clustering by board region and piece importance demonstrates coupling to an internal world model is under-supported. No ablations are described that randomize board semantics (e.g., permuting piece identities or using syntactically similar but semantically scrambled board encodings) while preserving token co-occurrence statistics; without such controls it is impossible to distinguish the reported patterns from artifacts of the textual board representation and prompting format.
minor comments (2)
  1. [abstract] The abstract states the existence of the derivation but does not display the defining equation for the so(n)-valued 1-form; including the explicit formula (presumably in the methods section) would allow readers to verify the claimed dependence on embedding geometry alone.
  2. [methods] Notation for the curvature 2-form and its evaluation along trajectories should be introduced once and used consistently; occasional shifts between matrix-valued and Lie-algebra-valued descriptions reduce readability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful review and constructive feedback. The primary concern is addressed point-by-point below, and we have revised the manuscript to incorporate additional controls that strengthen the empirical claims.

Point-by-point responses
  1. Referee: In the chess reasoning experiments (the section presenting the curvature clustering results), the headline claim that the observed clustering by board region and piece importance demonstrates coupling to an internal world model is under-supported. No ablations are described that randomize board semantics (e.g., permuting piece identities or using syntactically similar but semantically scrambled board encodings) while preserving token co-occurrence statistics; without such controls it is impossible to distinguish the reported patterns from artifacts of the textual board representation and prompting format.

    Authors: We agree that the interpretation would be more robust with explicit controls that isolate semantic content from syntactic and co-occurrence artifacts. While the 1-form is derived solely from embedding geometry (inner products) and thus independent of any learned task-specific parameters, this does not by itself rule out that the observed curvature patterns arise from the particular textual encoding of chess boards rather than the model's internal world model. In the revised manuscript we have added the suggested ablations: (i) permutations of piece identities that preserve token co-occurrence statistics and syntactic structure, and (ii) syntactically similar but semantically scrambled board encodings. Under these controls the clustering by board region and piece importance is substantially reduced or eliminated, while the geometric properties of the 1-form remain unchanged. These results are now reported in the updated chess experiments section together with quantitative measures of cluster quality before and after randomization. We believe this addresses the concern and supports the claim that the curvature couples to semantic structure. revision: yes
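
One way the piece-identity permutation control could look is sketched below. It assumes boards are given as FEN strings, which the source does not state, and it relabels piece letters with a fixed random permutation while leaving colours, empty-square counts, and the remaining FEN fields untouched. The function name and seed handling are hypothetical. A plain relabeling like this does not by itself guarantee that token co-occurrence statistics are preserved, and the scrambled positions may be illegal chess positions.

```python
import random

PIECES = "pnbrqk"

def permute_piece_identities(fen, seed=0):
    """Relabel piece types in the placement field of a FEN string.

    Returns the scrambled FEN and the letter mapping that was applied.
    """
    rng = random.Random(seed)
    shuffled = list(PIECES)
    rng.shuffle(shuffled)
    mapping = dict(zip(PIECES, shuffled))
    # Apply the same permutation to white (upper-case) pieces.
    mapping.update({k.upper(): v.upper() for k, v in list(mapping.items())})
    placement, *rest = fen.split(" ")
    # Digits (empty squares) and "/" (rank separators) pass through unchanged.
    new_placement = "".join(mapping.get(ch, ch) for ch in placement)
    return " ".join([new_placement, *rest]), mapping

fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
scrambled_fen, mapping = permute_piece_identities(fen, seed=1)
```

Running the curvature analysis on both `fen` and `scrambled_fen` style prompts would then show whether the clustering survives once piece semantics are broken.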

Circularity Check

0 steps flagged

Derivation of so(n)-valued 1-form is geometrically self-contained with no load-bearing circular steps

Full rationale

The paper explicitly derives the so(n)-valued 1-form from the geometry of token embeddings alone (inner products or angles between vectors), as stated in the abstract, without reference to semantic content, world models, or fitted parameters from the target chess data. Curvature is then computed along generation trajectories as a downstream empirical measurement. No equations or steps reduce by construction to the chess-task observations, no self-citations are invoked as uniqueness theorems, and no ansatz or renaming of known results is described that would make the central geometric claim tautological. The semantic-coupling interpretation is an external empirical claim, not part of the derivation chain itself.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, preventing identification of specific free parameters, axioms, or invented entities. The text claims the 1-form depends solely on token embedding geometry with no new entities introduced.



