Polar probe linearly decodes semantic structures from LLMs

2026 · cs.CL · arXiv 2605.14125

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

How do artificial neural networks bind concepts to form complex semantic structures? Here, we propose a simple neural code, whereby the existence and the type of relations between entities are represented by the distance and the direction between their embeddings, respectively. We test this hypothesis in a variety of Large Language Models (LLMs), each given natural-language descriptions of minimalist tasks from five different domains: arithmetic, visual scenes, family trees, metro maps and social interactions. First, results show that the true semantic structures can be linearly recovered with a Polar Probe targeting a subspace of LLMs' layer activations. Second, this code emerges mostly in middle layers and improves with LLM performance. Third, these Polar Probes successfully generalize to new entities and relation types, but degrade with the size of the semantic structure. Finally, the quality of the polar representation correlates with the LLM's ability to answer questions about the semantic structure. Together, these findings suggest that LLMs learn to build complex semantic structures by binding representations with a simple geometrical principle.
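
The distance-and-direction code described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `project` and `polar_decode`, the fixed `radius` threshold, the convention that related entities lie close together in the probed subspace, and the use of cosine similarity against hypothetical per-type directions in `type_dirs`. In the paper the linear map is learned from layer activations; here `W` is simply passed in.

```python
import math

def project(W, h):
    """Apply a linear map W (k rows of length d) to one embedding h (length d)."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def polar_decode(H, W, type_dirs, radius):
    """Hypothetical polar read-out over ordered entity pairs (i, j).

    Returns {(i, j): type_index} for every pair whose projected distance is
    below `radius`. type_index is the entry of `type_dirs` with the highest
    cosine similarity to the projected difference z_j - z_i.
    """
    Z = [project(W, h) for h in H]  # entity embeddings in the probe subspace
    edges = {}
    for i, zi in enumerate(Z):
        for j, zj in enumerate(Z):
            if i == j:
                continue  # no self-relations
            diff = [b - a for a, b in zip(zi, zj)]
            dist = norm(diff)
            # Assumption: a relation exists only between nearby entities.
            if dist == 0.0 or dist >= radius:
                continue
            # Direction of the difference vector selects the relation type.
            cosines = [
                sum(d * t for d, t in zip(diff, td)) / (dist * norm(td))
                for td in type_dirs
            ]
            edges[(i, j)] = max(range(len(type_dirs)), key=cosines.__getitem__)
    return edges
```

With three toy entities at (0, 0), (1, 0) and (0, 5), an identity `W`, and a radius of 2, only the first two entities are decoded as related, and the relation type for each ordered pair depends on which way the difference vector points.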

representative citing papers

Tool-Call Dependency Structure is Linearly Decodable in LLM Agent Residual Streams

cs.CL · 2026-05-25 · unverdicted · novelty 7.0

A linear edge probe on Qwen3-32B residual streams decodes tool-call dependency graphs in agent trajectories above random and positional baselines, with evidence the signal tracks abstract topology.
