pith. machine review for the scientific record.

arxiv: 2605.14125 · v1 · submitted 2026-05-13 · 💻 cs.CL

Recognition: 3 theorem links

Polar probe linearly decodes semantic structures from LLMs

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 04:56 UTC · model grok-4.3

classification 💻 cs.CL
keywords semantic structures · large language models · polar probe · linear decoding · embeddings · relation binding · geometric code · activations

The pith

Large language models encode the existence and type of semantic relations as distance and direction between embeddings in a linear subspace of their activations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes that LLMs bind concepts into complex structures using a simple geometric code: the distance between entity embeddings signals whether a relation exists, while the direction signals the relation's type. This code is tested by feeding natural-language descriptions of tasks in arithmetic, visual scenes, family trees, metro maps, and social interactions into multiple LLMs and applying a Polar Probe to recover the underlying structures from layer activations. The probe succeeds in linearly decoding the true structures from a subspace of middle-layer activations, with better recovery in stronger models and degradation as structure size increases. The quality of this representation also tracks the model's ability to answer questions about the structures, and the code generalizes to new entities and relation types.

Core claim

The true semantic structures can be linearly recovered with a Polar Probe targeting a subspace of LLMs' layer activations, where distance between embeddings represents the existence of relations and direction represents their type. This polar code emerges mostly in middle layers, improves with model performance, generalizes to novel entities and relations, but degrades with larger structures, and its quality correlates with the LLM's question-answering accuracy on the structures.

What carries the argument

The Polar Probe, a linear decoder applied to a subspace of layer activations that extracts distance-direction geometry to represent relation existence and type.
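Read literally, the probe is a linear map into a subspace followed by a distance/direction readout over pairwise embedding differences. A minimal numpy sketch with made-up shapes (the projection `W` and the per-type prototypes would be trained in the paper; here they are random placeholders, and all names and dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: n_entities middle-layer entity activations of width
# d_model, a linear probe into a subspace of width d_probe, and one
# prototype direction per relation type.
n_entities, d_model, d_probe, n_types = 5, 64, 8, 3
H = rng.normal(size=(n_entities, d_model))        # entity activations
W = rng.normal(size=(d_model, d_probe))           # linear probe (trained in the paper)
prototypes = rng.normal(size=(n_types, d_probe))  # relation-type directions

Z = H @ W                              # project into the probe subspace
delta = Z[:, None, :] - Z[None, :, :]  # pairwise difference vectors delta_ij

# Existence score: distance between probe-space entity embeddings.
existence = np.linalg.norm(delta, axis=-1)        # (n_entities, n_entities)

# Type score: cosine of delta_ij against each prototype direction.
norms = existence[..., None] * np.linalg.norm(prototypes, axis=-1)
type_scores = (delta @ prototypes.T) / np.clip(norms, 1e-9, None)

assert existence.shape == (n_entities, n_entities)
assert type_scores.shape == (n_entities, n_entities, n_types)
```

The two readouts deliberately factor the geometry: the norm of `delta` carries only existence, the normalized direction carries only type.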

If this is right

  • The polar representation emerges primarily in middle layers of the network.
  • Decoding quality improves as overall LLM performance on the tasks increases.
  • The code generalizes to previously unseen entities and relation types.
  • Representation quality declines as the size of the semantic structure grows.
  • Better polar decoding predicts better ability to answer questions about the structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the geometry is causal, targeted interventions on embedding distances and directions could be used to edit or control the model's internal knowledge of relations.
  • Similar distance-direction codes might appear in other modalities or architectures, suggesting a general principle for binding representations.
  • The degradation with structure size points to a potential limit on how complex relational knowledge LLMs can maintain in this format.

Load-bearing premise

That the distance-direction geometry in embedding space is the actual causal mechanism the model uses rather than a convenient post-hoc linear fit that happens to correlate with task performance.

What would settle it

Perturbing the distances and directions between embeddings in the identified subspace and observing whether the model's answers to questions about the semantic structures change. If the answers track the perturbation, the geometry is causally used; if they are unchanged, it is a post-hoc correlate.
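Such an intervention amounts to activation steering along a prototype direction. A pure-numpy sketch of the arithmetic only (wiring it into a real LLM layer, e.g. via a framework's forward hooks, is omitted; all names are illustrative):

```python
import numpy as np

def steer(activations, prototype, alpha):
    """Shift layer activations along a relation-type prototype direction.

    activations: (seq_len, d_model) residual-stream states at one layer.
    prototype:   (d_model,) probe prototype mapped back to model space.
    alpha:       signed steering strength; the causal test is whether the
                 model's answer probabilities move with the sign of alpha.
    """
    unit = prototype / np.linalg.norm(prototype)
    return activations + alpha * unit

rng = np.random.default_rng(1)
acts = rng.normal(size=(10, 64))
proto = rng.normal(size=64)

steered_up = steer(acts, proto, +2.0)
steered_down = steer(acts, proto, -2.0)

# The intervention changes activations only along the prototype direction.
unit = proto / np.linalg.norm(proto)
shift = (steered_up - acts) @ unit
assert np.allclose(shift, 2.0)
```

A matched pair of positive and negative steers, as above, is what separates a causal readout from an epiphenomenal one.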

Figures

Figures reproduced from arXiv: 2605.14125 by Jean-Rémi King, Pablo J. Diego-Simón, Pierre Orhan, Yair Lakretz.

Figure 1
Figure 1: Polar probes linearly read out semantic structures from LLM activations. A: A natural-language description specifies a set of entities and their typed relations (illustrated here for spatial layout, where entities are objects and relations are spatial predicates: left of/right of, top of/below). B: The description corresponds to a semantic structure, formalized as a relational graph whose nodes are entities… view at source ↗
Figure 2
Figure 2: Polar probe geometry mirrors the gold semantic structure. Top: Expected polar probe geometry for semantic structures from every domain. Bottom: 2D PCA of probe-space entity representations from 10 different descriptions of a semantic structure in the test set; large markers denote entity centroids and lines indicate gold relations. The projections tend to follow the polar code: direction encodes relation type… view at source ↗
Figure 3
Figure 3: Semantic structures are most linearly decodable in the middle layers, and only in pretrained LLMs. Spearman's ρ for relation existence (blue) and type (orange) decoded by a polar probe from Llama3-8B across layers in five domains. In pretrained models (solid), decoding peaks around layers 12–15 and remains high in late layers. In randomly initialized models (dashed), both scores remain close to chance across all layers. view at source ↗
Figure 4
Figure 4: Polar probe performance grows with pretraining, falls with the number of entities in the relational graph, and degrades with out-of-distribution (OOD) entities and relation surface forms. Top: Spearman's ρ for relation existence (blue) and type (orange) vs. pretraining steps at the best layer of OLMo-7B. Middle: Polar probe performance vs. number of entities in the graph at the best layer of Llama3.1-8B. Bottom: … view at source ↗
Figure 5
Figure 5. view at source ↗
Figure 6
Figure 6: Polar probe prototypes steer LLM predictions: probability of a correct answer under steering at layer 11 of Llama3-8B. view at source ↗
Figure 7
Figure 7: Semantic domain subspaces are largely disjoint, with a spatial–ordinal overlap. Cross-domain alignment at the best layer of Llama3-8B, quantified via the principal angles in LLM space (higher = more overlap). view at source ↗
Figure 8
Figure 8: Type probe errors predict the LLM's downstream performance. Layerwise Spearman correlation between existence (blue) and type (orange) probe-space errors and the logit of the correct answer on a question-answering task over semantic structures. view at source ↗
Figure 9
Figure 9: Semantic structures are encoded in linear subspaces of LLM activations. Training prototype vectors with a fixed identity probe yields existence scores close to chance. Type scores are high in the spatial layout and family tree domains but remain near chance elsewhere. The linear baseline, shown as a horizontal line, performs very close to chance across all domains. view at source ↗
Figure 10
Figure 10: Polar probes trained on the controlled Spatial Layout dataset generalize to naturalistic and multilingual sentences. Probes trained on the controlled Spatial Layout dataset and evaluated on an LLM-generated naturalistic dataset within the same semantic domain substantially exceed both chance level and the untrained baseline. view at source ↗
Figure 11
Figure 11: Interventions along polar probe directions causally modulate model predictions, with the strongest effects in middle layers. Probability of the correct token under positive and negative direction steering. In middle layers, positive-prototype interventions reliably increase the probability and negative-prototype interventions decrease it. view at source ↗
Figure 12
Figure 12: Middle layers show maximal response to causal interventions. Layerwise mean difference in probability between positive-signed and negative-signed prototypes. view at source ↗
Figure 13
Figure 13: Semantic domain subspaces and uncontextualized embeddings are disjoint. Cross-domain alignment at the best layer of Llama3-8B, quantified via the principal angles in LLM space (higher = more overlap). view at source ↗
Figure 14
Figure 14: Semantic distance and probe distance used to calculate Spearman's ρ. view at source ↗
read the original abstract

How do artificial neural networks bind concepts to form complex semantic structures? Here, we propose a simple neural code, whereby the existence and the type of relations between entities are represented by the distance and the direction between their embeddings, respectively. We test this hypothesis in a variety of Large Language Models (LLMs), each input with natural-language descriptions of minimalist tasks from five different domains: arithmetic, visual scenes, family trees, metro maps and social interactions. Results show that the true semantic structures can be linearly recovered with a Polar Probe targeting a subspace of LLMs' layer activations. Second, this code emerges mostly in middle layers and improves with LLM performance. Third, these Polar Probes successfully generalize to new entities and relation types, but degrades with the size of the semantic structure. Finally, the quality of the polar representation correlates with the LLM's ability to answer questions about the semantic structure. Together, these findings suggest that LLMs learn to build complex semantic structures by binding representations with a simple geometrical principle.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that LLMs bind concepts into semantic structures via a polar geometry in embedding space, with distance encoding relation existence and direction encoding relation type. A Polar Probe linearly decodes these structures from a subspace of middle-layer activations across five domains (arithmetic, visual scenes, family trees, metro maps, social interactions). The representation emerges in middle layers, scales with model performance, generalizes to novel entities and relations (but degrades with structure size), and correlates with the model's question-answering accuracy on the structures.

Significance. If the geometric code is shown to be more than a post-hoc fit, the work supplies a simple, falsifiable account of compositional representation in LLMs that links representational geometry directly to task behavior. The reported generalization and performance correlation are concrete strengths that could inform interpretability methods and model analysis.

major comments (3)
  1. [Methods] Methods section: the abstract states successful linear recovery and generalization, yet provides no detail on whether probe training used held-out structures or whether accuracy metrics were compared against chance-level or random linear baselines; this leaves open whether the polar geometry is required or any linear decoder would suffice.
  2. [Results] Results: no ablation is reported that compares Polar Probes against arbitrary linear decoders trained on the identical activation subspace. Without this comparison, the distance-direction geometry cannot be distinguished from a convenient post-hoc mapping that happens to correlate with the annotated structures.
  3. [Discussion] Discussion: the claim that the geometry constitutes the model's internal binding mechanism requires interventional evidence (e.g., editing activations along polar axes and measuring downstream task degradation), which is absent; correlational decoding alone does not establish causality.
minor comments (2)
  1. [Abstract] Abstract: define the Polar Probe and the precise subspace selection criterion more explicitly.
  2. [Figures] Figures: ensure all panels include chance-level baselines and error bars for the reported generalization and correlation results.
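The chance level the minor comment asks for can be made explicit with a label-permutation null. A toy sketch (all data here is synthetic; Spearman's ρ is the correlation metric used throughout the figures):

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman's rho as Pearson correlation of ranks (sketch: argsort-of-
    argsort ranks, no tie correction; adequate for continuous scores)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

def permutation_chance(pred, gold, n_perm=1000, seed=0):
    """Chance level from correlating predictions with shuffled gold labels."""
    rng = np.random.default_rng(seed)
    null = [spearman_rho(pred, rng.permutation(gold)) for _ in range(n_perm)]
    return float(np.mean(null)), float(np.std(null))

rng = np.random.default_rng(1)
gold = np.arange(50, dtype=float)             # gold graph distances (toy)
pred = gold + rng.normal(scale=5.0, size=50)  # correlated probe distances

rho = spearman_rho(pred, gold)
chance_mean, chance_std = permutation_chance(pred, gold)
assert rho > chance_mean + 3 * chance_std     # clearly above chance
```

Reporting the null mean and spread alongside each ρ is one direct way to satisfy the "chance-level baselines and error bars" request.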

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and insightful comments. We have carefully considered each point and provide point-by-point responses below, along with planned revisions to the manuscript.

read point-by-point responses
  1. Referee: [Methods] Methods section: the abstract states successful linear recovery and generalization, yet provides no detail on whether probe training used held-out structures or whether accuracy metrics were compared against chance-level or random linear baselines; this leaves open whether the polar geometry is required or any linear decoder would suffice.

    Authors: We appreciate this observation. The Methods section describes the use of held-out structures for training the Polar Probes to test generalization to new entities and relations. Accuracy is evaluated against chance-level baselines appropriate for each task. To make this clearer, we will expand the Methods section with explicit details on the data splits, training procedure, and baseline comparisons including random linear probes. revision: yes

  2. Referee: [Results] Results: no ablation is reported that compares Polar Probes against arbitrary linear decoders trained on the identical activation subspace. Without this comparison, the distance-direction geometry cannot be distinguished from a convenient post-hoc mapping that happens to correlate with the annotated structures.

    Authors: We acknowledge the value of such an ablation. While the Polar Probe is tailored to the polar geometry hypothesis, we did not directly compare it to generic linear decoders in the original submission. In the revision, we will include an ablation where we train standard linear probes (e.g., logistic or ridge classifiers) on the same middle-layer activations and report their performance in recovering the semantic structures. This will help demonstrate whether the specific distance-direction encoding provides additional benefit. revision: yes

  3. Referee: [Discussion] Discussion: the claim that the geometry constitutes the model's internal binding mechanism requires interventional evidence (e.g., editing activations along polar axes and measuring downstream task degradation), which is absent; correlational decoding alone does not establish causality.

    Authors: We concur that interventional evidence would be necessary to claim causality. Our manuscript presents correlational findings: the polar geometry is decodable and correlates with model performance. We will revise the Discussion to emphasize that these results are consistent with the geometry serving as a binding mechanism but do not prove it is the causal internal representation. We will add a section on limitations and future work including potential activation editing experiments. revision: partial
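The generic-decoder comparator promised in response 2 could be as simple as an unconstrained linear readout over concatenated pair embeddings. A toy sketch (synthetic data; ridge regression stands in for a proper logistic probe):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pair dataset: features are concatenated embeddings of an
# entity pair; labels (+1/-1) mark whether a gold relation exists.
n_pairs, d = 400, 16
X = rng.normal(size=(n_pairs, 2 * d))
w_true = rng.normal(size=2 * d)   # hidden linear rule generating toy labels
y = np.sign(X @ w_true)

# Unconstrained linear baseline: ridge regression, sign-thresholded.
lam = 1e-2
w = np.linalg.solve(X.T @ X + lam * np.eye(2 * d), X.T @ y)
acc = float((np.sign(X @ w) == y).mean())
assert acc > 0.8  # a generic linear decoder fits the toy relation labels
```

The ablation's question is whether such an unconstrained decoder matches the Polar Probe on held-out structures; if it does, the distance-direction factorization adds nothing beyond generic linear decodability.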

Circularity Check

0 steps flagged

No circularity: empirical probe validation independent of fitted inputs

full rationale

The paper proposes a geometric hypothesis (distance for existence, direction for type of relations) and validates it via linear probing on LLM activations across multiple domains. Probe performance, generalization to new entities, layer-wise emergence, and correlation with task accuracy are reported as empirical measurements against ground-truth structures. No derivation reduces predictions to inputs by construction, no self-citations bear the central claim, and no uniqueness theorems or ansatzes are smuggled in. The analysis remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on the domain assumption that relational semantics are linearly separable in activation space and on the empirical observation that middle-layer subspaces contain the relevant geometry. No new entities are postulated.

axioms (1)
  • domain assumption: Relational semantics between entities are represented geometrically in embedding space
    Stated as the central hypothesis tested by the polar probe

pith-pipeline@v0.9.0 · 5480 in / 1027 out tokens · 25948 ms · 2026-05-15T04:56:34.852465+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    the existence and the type of relations between entities are represented by the distance and the direction between their embeddings, respectively... Polar Probe... (M̂ρ_G)_ij = ‖δ_ij‖₂ ... (M̂φ_G)_ijr = δ_ij · p_r / (‖δ_ij‖₂ ‖p_r‖₂)

  • IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    Polar probe performance saturates at low ranks... semantic structures are represented in a compact subspace... interventions along polar probe directions causally modulate model predictions

  • IndisputableMonolith/Foundation/ArithmeticFromLogic.lean embed_strictMono_of_one_lt echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    this code emerges mostly in middle layers and improves with LLM performance... generalizes to new entities and relation types
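Written out, the estimators excerpted in the first echo above take the following form, where δ_ij is the difference between the probe-space embeddings of entities i and j, and p_r is the prototype direction for relation type r:

```latex
(\hat{M}^{\rho}_{G})_{ij} = \lVert \delta_{ij} \rVert_2,
\qquad
(\hat{M}^{\phi}_{G})_{ijr} = \frac{\delta_{ij} \cdot p_r}{\lVert \delta_{ij} \rVert_2 \,\lVert p_r \rVert_2}
```

The norm carries relation existence and the normalized direction carries relation type, which is the factorization the "polar" name refers to.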

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

112 extracted references · 112 canonical work pages · 2 internal anchors

  1. [1]

    Understanding intermediate layers using linear classifier probes, 2017

    Guillaume Alain and Yoshua Bengio. Understanding intermediate layers using linear classifier probes, 2017. URL https://openreview.net/forum?id=ryF7rTqgl

  2. [2]

    Pythia: a suite for analyzing large language models across training and scaling

    Stella Biderman, Hailey Schoelkopf, Quentin Anthony, Herbie Bradley, Kyle O'Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, and Oskar Van Der Wal. Pythia: a suite for analyzing large language models across training and scaling. In Proceedings of the 40th International Confere...

  3. [3]

    Fast differentiable sorting and ranking

    Mathieu Blondel, Olivier Teboul, Quentin Berthet, and Josip Djolonga. Fast differentiable sorting and ranking. In Proceedings of the 37th International Conference on Machine Learning, ICML'20. JMLR.org, 2020

  4. [4]

    Language models are few-shot learners

    Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gr...

  5. [8]

    A polar coordinate system represents syntax in large language models

    Pablo J. Diego-Simon, Stéphane d'Ascoli, Emmanuel Chemla, Yair Lakretz, and Jean-Remi King. A polar coordinate system represents syntax in large language models. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. URL https://openreview.net/forum?id=x2780VcMOI

  6. [10]

    Toy models of superposition

    Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, Roger Grosse, Sam McCandlish, Jared Kaplan, Dario Amodei, Martin Wattenberg, and Christopher Olah. Toy models of superposition. Transformer Circuits Thread, 2022

  7. [11]

    How do language models bind entities in context? In The Twelfth International Conference on Learning Representations, 2024

    Jiahai Feng and Jacob Steinhardt. How do language models bind entities in context? In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=zb3b6oKO77

  8. [15]

    The Llama 3 Herd of Models

    Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, et al. The Llama 3 Herd of Models, 2024

  9. [16]

    OLMo: Accelerating the Science of Language Models

    Dirk Groeneveld, Iz Beltagy, Pete Walsh, et al. OLMo: Accelerating the Science of Language Models. Preprint

  10. [17]

    Language models represent space and time

    Wes Gurnee and Max Tegmark. Language models represent space and time. In The Twelfth International Conference on Learning Representations, 2024. URL https://openreview.net/forum?id=jE8xbmvFin

  11. [21]

    Linear representations of political perspective emerge in large language models

    Junsol Kim, James Evans, and Aaron Schein. Linear representations of political perspective emerge in large language models. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=rwqShzb9li

  12. [25]

    The algebraic mind: Integrating connectionism and cognitive science

    Gary F Marcus. The algebraic mind: Integrating connectionism and cognitive science. MIT press, 2003

  13. [26]

    The geometry of truth: Emergent linear structure in large language model representations of true/false datasets

    Samuel Marks and Max Tegmark. The geometry of truth: Emergent linear structure in large language model representations of true/false datasets. In First Conference on Language Modeling, 2024. URL https://openreview.net/forum?id=aajyHYjjsk

  14. [30]

    ICLR : In-context learning of representations

    Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, and Hidenori Tanaka. ICLR : In-context learning of representations. In The Thirteenth International Conference on Learning Representations, 2025 a . URL https://openreview.net/forum?id=pXlmOmlHJZ

  15. [31]

    The geometry of categorical and hierarchical concepts in large language models

    Kiho Park, Yo Joong Choe, Yibo Jiang, and Victor Veitch. The geometry of categorical and hierarchical concepts in large language models. In The Thirteenth International Conference on Learning Representations, 2025 b . URL https://openreview.net/forum?id=bVTM2QKYuA

  16. [32]

    Probing syntax in large language models: Successes and remaining challenges

    Pablo J. Diego Simon, Emmanuel Chemla, Jean-Remi King, and Yair Lakretz. Probing syntax in large language models: Successes and remaining challenges. In Second Conference on Language Modeling, 2025. URL https://openreview.net/forum?id=nrZysNmJ0n

  17. [35]

    Steering language models with activation engineering, 2025

    Alexander Matt Turner, Lisa Thiergart, Gavin Leech, David Udell, Juan J Vazquez, Ulisse Mini, and Monte MacDiarmid. Steering language models with activation engineering, 2025. URL https://openreview.net/forum?id=2XBPdPIcFK

  18. [36]

    Attention is all you need

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (eds.), Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL https:...

  19. [37]

    Åke Björck and Gene H. Golub. Numerical methods for computing angles between linear subspaces. Mathematics of Computation, 27 0 (123): 0 579--594, 1973. ISSN 00255718, 10886842. URL http://www.jstor.org/stable/2005662

  20. [38]

    , year =

    Tolman, Edward C. , year =. Cognitive maps in rats and men. , volume =. Psychological Review , publisher =. doi:10.1037/h0061626 , number =

  21. [39]

    Proceedings of the 37th International Conference on Machine Learning , articleno =

    Blondel, Mathieu and Teboul, Olivier and Berthet, Quentin and Djolonga, Josip , title =. Proceedings of the 37th International Conference on Machine Learning , articleno =. 2020 , publisher =

  22. [40]

    2003 , publisher=

    The algebraic mind: Integrating connectionism and cognitive science , author=. 2003 , publisher=

  23. [41]

    arXiv preprint cs/0412059 , year=

    Vector symbolic architectures answer Jackendoff's challenges for cognitive neuroscience , author=. arXiv preprint cs/0412059 , year=

  24. [42]

    2017 , eprint=

    Adam: A Method for Stochastic Optimization , author=. 2017 , eprint=

  25. [43]

    Preprint , year=

    OLMo: Accelerating the Science of Language Models , author=. Preprint , year=

  26. [44]

    2024 , eprint=

    The Llama 3 Herd of Models , author=. 2024 , eprint=

  27. [45]

    and O’Reilly, Jill X

    Constantinescu, Alexandra O. and O’Reilly, Jill X. and Behrens, Timothy E. J. , year =. Organizing conceptual knowledge in humans with a gridlike code , volume =. Science , publisher =. doi:10.1126/science.aaf0941 , number =

  28. [46]

    , year =

    Theves, Stephanie and Fernandez, Guillén and Doeller, Christian F. , year =. The Hippocampus Encodes Distances in Multidimensional Feature Space , volume =. Current Biology , publisher =. doi:10.1016/j.cub.2019.02.035 , number =

  29. [47]

    , year =

    Aronov, Dmitriy and Nevers, Rhino and Tank, David W. , year =. Mapping of a non-spatial dimension by the hippocampal–entorhinal circuit , volume =. Nature , publisher =. doi:10.1038/nature21692 , number =

  30. [48]

    , year =

    Theves, Stephanie and Fernández, Guillén and Doeller, Christian F. , year =. The Hippocampus Maps Concept Space, Not Feature Space , volume =. The Journal of Neuroscience , publisher =. doi:10.1523/jneurosci.0494-20.2020 , number =

  31. [49]

    A Map for Social Navigation in the Human Brain , volume =

    Tavares, Rita Morais and Mendelsohn, Avi and Grossman, Yael and Williams, Christian Hamilton and Shapiro, Matthew and Trope, Yaacov and Schiller, Daniela , year =. A Map for Social Navigation in the Human Brain , volume =. Neuron , publisher =. doi:10.1016/j.neuron.2015.06.011 , number =

  32. [50]

    Place units in the hippocampus of the freely moving rat , volume =

    O’Keefe, John , year =. Place units in the hippocampus of the freely moving rat , volume =. Experimental Neurology , publisher =. doi:10.1016/0014-4886(76)90055-8 , number =

  33. [51]

    Précis of O’Keefe &; Nadel’sThe hippocampus as a cognitive map , volume =

    O’Keefe, John and Nadel, Lynn , year =. Précis of O’Keefe &; Nadel’sThe hippocampus as a cognitive map , volume =. Behavioral and Brain Sciences , publisher =. doi:10.1017/s0140525x00063949 , number =

  34. [52]

    and Clark, Kevin and Hewitt, John and Khandelwal, Urvashi and Levy, Omer , year =

    Manning, Christopher D. and Clark, Kevin and Hewitt, John and Khandelwal, Urvashi and Levy, Omer , year =. Emergent linguistic structure in artificial neural networks trained by self-supervision , volume =. Proceedings of the National Academy of Sciences , publisher =. doi:10.1073/pnas.1907367117 , number =

  35. [53]

    A Structural Probe for Finding Syntax in Word Representations

    Hewitt, John and Manning, Christopher D. A Structural Probe for Finding Syntax in Word Representations. Proceedings of the 2019 Conference of the North A merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019. doi:10.18653/v1/N19-1419

  36. [54]

    arXiv preprint arXiv:2312.16257 , year=

    More than correlation: Do large language models learn causal representations of space? , author=. arXiv preprint arXiv:2312.16257 , year=

  37. [55]

    Forty-second International Conference on Machine Learning , year=

    How Do Transformers Learn Variable Binding in Symbolic Programs? , author=. Forty-second International Conference on Machine Learning , year=

  38. [56]

    First Conference on Language Modeling , year=

    The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets , author=. First Conference on Language Modeling , year=

  39. [57]

    2025 , url=

    Steering Language Models with Activation Engineering , author=. 2025 , url=

  40. [58]

    Language Models Encode Numbers Using Digit Representations in Base 10

    Levy, Amit Arnold and Geva, Mor. Language Models Encode Numbers Using Digit Representations in Base 10. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers). 2025. doi:10.18653/v1/2025.naacl-short.33

  41. [59]

    Probing for Incremental Parse States in Autoregressive Language Models

    Eisape, Tiwalayo and Gangireddy, Vineet and Levy, Roger and Kim, Yoon. Probing for Incremental Parse States in Autoregressive Language Models. Findings of the Association for Computational Linguistics: EMNLP 2022. 2022. doi:10.18653/v1/2022.findings-emnlp.203

  42. [60]

    Probing for Labeled Dependency Trees

    M. Probing for Labeled Dependency Trees. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022. doi:10.18653/v1/2022.acl-long.532

  43. [61]

    Language Models are Few-Shot Learners , url =

    Brown, Tom and Mann, Benjamin and Ryder, Nick and Subbiah, Melanie and Kaplan, Jared D and Dhariwal, Prafulla and Neelakantan, Arvind and Shyam, Pranav and Sastry, Girish and Askell, Amanda and Agarwal, Sandhini and Herbert-Voss, Ariel and Krueger, Gretchen and Henighan, Tom and Child, Rewon and Ramesh, Aditya and Ziegler, Daniel and Wu, Jeffrey and Winte...

  44. [62]

    Bronstein, Michael M. and Bruna, Joan and LeCun, Yann and Szlam, Arthur and Vandergheynst, Pierre. Geometric Deep Learning: Going beyond Euclidean Data.

  45. [63]

    Nickel, Maximillian and Kiela, Douwe. Poincaré Embeddings for Learning Hierarchical Representations.

  46. [64]

    Ganea, Octavian-Eugen and Bécigneul, Gary and Hofmann, Thomas. Hyperbolic Neural Networks. Proceedings of the 32nd International Conference on Neural Information Processing Systems.

  47. [65]

    Boli Chen and Yao Fu and Guangwei Xu and Pengjun Xie and Chuanqi Tan and Mosha Chen and Liping Jing. Probing BERT in Hyperbolic Spaces. 2021.

  48. [66]

    Dai, Qin and Heinzerling, Benjamin and Inui, Kentaro. Representational Analysis of Binding in Language Models. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024. doi:10.18653/v1/2024.emnlp-main.967

  49. [67]

    Transformers Represent Belief State Geometry in their Residual Stream. The Thirty-eighth Annual Conference on Neural Information Processing Systems.

  50. [68]

    How do Language Models Bind Entities in Context? The Twelfth International Conference on Learning Representations.

  51. [69]

    Liu, Nelson F. and Gardner, Matt and Belinkov, Yonatan and Peters, Matthew E. and Smith, Noah A. Linguistic Knowledge and Transferability of Contextual Representations. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

  52. [70]

    Jawahar, Ganesh and Sagot, Benoît and Seddah, Djamé. What Does BERT Learn about the Structure of Language? Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. doi:10.18653/v1/P19-1356

  53. [71]

    Zhu, Fangwei and Dai, Damai and Sui, Zhifang. Language Models Encode the Value of Numbers Linearly. Proceedings of the 31st International Conference on Computational Linguistics. 2025.

  54. [72]

    Linear Representations of Political Perspective Emerge in Large Language Models. The Thirteenth International Conference on Learning Representations.

  55. [73]

    The Geometry of Categorical and Hierarchical Concepts in Large Language Models. The Thirteenth International Conference on Learning Representations.

  56. [74]

    Conneau, Alexis and Kruszewski, German and Lample, Guillaume and Barrault, Loïc. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018. doi:10.18653/v1/P18-1198

  57. [75]

    Understanding intermediate layers using linear classifier probes. 2017.

  58. [76]

    A Polar coordinate system represents syntax in large language models. The Thirty-eighth Annual Conference on Neural Information Processing Systems.

  59. [77]

    Mitchell, Jeff and Lapata, Mirella. Composition in Distributional Models of Semantics. Cognitive Science. doi:10.1111/j.1551-6709.2010.01106.x

  60. [78]

    Collins, Allan M. and Quillian, M. Ross. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior. doi:10.1016/s0022-5371(69)80069-1

  61. [79]

    Collins, Allan M. and Loftus, Elizabeth F. A spreading-activation theory of semantic processing. Psychological Review. doi:10.1037/0033-295x.82.6.407

  62. [80]

    Simmons, R. and Slocum, J. Generating English discourse from semantic networks. Communications of the ACM. doi:10.1145/355604.361595

  63. [81]

    Lakoff, George. On generative semantics. Semantics. 1971.

  64. [82]

    The case for case. 1967.

  65. [83]

    Frege, Gottlob, and others. 1892.

  66. [84]

    Plate, T.A. Holographic reduced representations.

  67. [85]

    The proper treatment of quantification in ordinary English. Approaches to natural language: Proceedings of the 1970 Stanford workshop on grammar and semantics. 1973.

  68. [86]

    McRae, Ken and Ferretti, Todd R. and Amyote, Liane. Thematic Roles as Verb-specific Concepts. Language and Cognitive Processes. doi:10.1080/016909697386835

  69. [87]

    Sowa, John F. Semantic Networks. doi:10.1002/0470018860.s00065

  70. [88]

    Partee, Barbara Hall. Compositionality in Formal Semantics: Selected Papers of Barbara H. Partee.

  71. [89]

    Jon Barwise and John Perry. Situations and Attitudes.

  72. [90]

    Hans Kamp and Uwe Reyle. From Discourse to Logic: Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory.

  73. [91]

    Pylkkänen, Liina. The neural basis of combinatory syntax and semantics. Science. 2019. doi:10.1126/science.aax0050

  74. [92]

    Frankland, Steven M. and Greene, Joshua D. Two Ways to Build a Thought: Distinct Forms of Compositional Semantic Representation across Brain Regions. Cerebral Cortex. doi:10.1093/cercor/bhaa001

  75. [93]

    Copestake, Ann and Flickinger, Dan and Pollard, Carl and Sag, Ivan. Minimal Recursion Semantics: An Introduction. Research on Language and Computation.

  76. [94]

    Banarescu, Laura and Bonial, Claire and Cai, Shu and Georgescu, Madalina and Griffitt, Kira and Hermjakob, Ulf and Knight, Kevin and Koehn, Philipp and Palmer, Martha and Schneider, Nathan. Abstract Meaning Representation for Sembanking. Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse. 2013.

  77. [95]

    Rayner, Keith. Eye movements in reading and information processing: 20 years of research. Psychological Bulletin. doi:10.1037/0033-2909.124.3.372

  78. [96]

    Tanenhaus, Michael K. and Spivey-Knowlton, Michael J. and Eberhard, Kathleen M. and Sedivy, Julie C. Integration of Visual and Linguistic Information in Spoken Language Comprehension. Science. doi:10.1126/science.7777863

  79. [97]

    Desbordes, Théo and Lakretz, Yair and Chanoine, Valérie and Oquab, Maxime and Badier, Jean-Michel and Trébuchon, Agnès and Carron, Romain and Bénar, Christian-G. and Dehaene, Stanislas and King, Jean-Rémi. Dimensionality and Ramping: Signatures of Sentence Integration in the Dynamics of Brains and Deep Language Models. The Journal of Neuroscience.

  80. [98]

    Li, Jixing and Pylkkänen, Liina. Disentangling Semantic Composition and Semantic Association in the Left Temporal Lobe. The Journal of Neuroscience. doi:10.1523/jneurosci.2317-20.2021

Showing first 80 references.