Large language models reorganize representational geometry during in-context learning

Hua-Dong Xiong; Kwonjoon Lee; Li Ji-An; Robert C. Wilson; Xue-Xin Wei

arxiv: 2605.28854 · v1 · pith:ZSWHGW2Snew · submitted 2026-05-16 · 💻 cs.CL · cs.LG · q-bio.NC

Pith reviewed 2026-06-30 18:48 UTC · model grok-4.3

keywords: in-context learning, representational geometry, large language models, classification, prototype algorithm, separability, neural representations

The pith

Large language models reorganize their internal representations during in-context learning to increase separability of task features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how large language models perform in-context learning by looking at changes in their high-dimensional representation space. It finds that successful ICL correlates with the initial structure of representations for the classification task and involves geometric reorganization that makes relevant features more separable online. LLM behavior aligns with a prototype-like algorithm that integrates evidence from examples while reshaping representations. This offers a geometric explanation for ICL without parameter updates, showing how pretrained representations constrain what in-context learning can achieve.

Core claim

Successful in-context learning in LLMs is accompanied by geometric reorganization of representations that increases online separability, and this behavior is well described by a prototype-like algorithm that integrates evidence while reshaping representations to support classification. ICL performance correlates systematically with the representational structure of the underlying classification task.

What carries the argument

Geometric reorganization of representations that increases online separability, carried out by a prototype-like algorithm integrating evidence from examples.
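A prototype-like learner of the kind named here can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the class name `OnlinePrototypeClassifier`, the Euclidean distance rule, and the synthetic Gaussian data are all assumptions made for the example, and the sketch covers only the evidence-integration part, not any reshaping of representations.

```python
import numpy as np


class OnlinePrototypeClassifier:
    """Hypothetical sketch: running-mean prototype per class, nearest-prototype rule."""

    def __init__(self, dim, n_classes=2):
        self.sums = np.zeros((n_classes, dim))
        self.counts = np.zeros(n_classes, dtype=int)

    def update(self, x, y):
        # Integrate one labeled example into the running sum for class y.
        self.sums[y] += x
        self.counts[y] += 1

    def predict(self, x):
        seen = self.counts > 0
        if not seen.any():
            return 0  # arbitrary default before any evidence has arrived
        # Prototypes are the means of the classes observed so far.
        protos = self.sums[seen] / self.counts[seen][:, None]
        dists = np.linalg.norm(protos - x, axis=1)
        return int(np.flatnonzero(seen)[np.argmin(dists)])


# Synthetic stand-in for representations: two Gaussian classes (assumed data).
rng = np.random.default_rng(0)
dim = 16
mu = np.zeros(dim)
mu[0] = 3.0

clf = OnlinePrototypeClassifier(dim)
correct = []
for t in range(200):
    y = int(rng.integers(2))
    x = rng.normal(size=dim) + (mu if y == 1 else -mu)
    correct.append(clf.predict(x) == y)  # predict first, then learn from the label
    clf.update(x, y)

late_accuracy = float(np.mean(correct[100:]))
print(f"accuracy on examples 100-199: {late_accuracy:.2f}")
```

Predicting before updating mirrors the in-context setting, where each query is answered from the examples seen so far.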

If this is right

ICL performance varies systematically with the structure of the classification task in the model's representations.
Successful ICL produces geometric changes that increase online separability of task features.
LLM in-context behavior matches a prototype-like algorithm that integrates evidence while reshaping representations.
Representational geometry serves as a mechanistic constraint on what ICL can exploit from pretrained models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Pretraining methods that produce more separable initial representations for common tasks could improve ICL across a wider range of examples.
The same reorganization mechanism may limit or enable other forms of adaptation without weight updates in transformer models.
Comparing reorganization patterns across model scales or architectures could test whether this geometric account generalizes beyond the studied LLMs.

Load-bearing premise

Defining classification labels from the model's own internal representations with known structure provides a valid test of whether ICL depends on successful online untangling of task-relevant representations.
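The label construction this premise refers to (extract a representation, project it onto a task-defining axis, binarize) can be sketched as follows. This is a rough sketch under stated assumptions: the helper names `pc_axis` and `labels_from_axis` are hypothetical, the median threshold is an assumption (the source does not state the binarization rule), and random anisotropic Gaussian vectors stand in for real hidden states.

```python
import numpy as np


def pc_axis(H, k):
    """Return the k-th principal axis (0-indexed) of representations H (n x d)."""
    Hc = H - H.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Hc, full_matrices=False)
    return Vt[k]


def labels_from_axis(H, w):
    """Project centered representations onto w and binarize at the median.

    The median threshold is an assumption made for this sketch.
    """
    proj = (H - H.mean(axis=0)) @ w
    return (proj > np.median(proj)).astype(int)


rng = np.random.default_rng(0)
# Stand-in for sentence-level hidden states (assumed data, not model outputs).
scales = np.array([5.0, 3.0, 2.0, 1.0, 1.0, 0.5, 0.5, 0.1])
H = rng.normal(size=(200, 8)) * scales

w_lead = pc_axis(H, 0)   # leading PC: a high-variance, "easy" axis
w_trail = pc_axis(H, 7)  # trailing PC: a low-variance, "difficult" axis
y_easy = labels_from_axis(H, w_lead)
y_hard = labels_from_axis(H, w_trail)
print(y_easy.mean(), y_hard.mean())
```

Thresholding at the median keeps the two classes balanced, so differences in difficulty between axes are not confounded with class imbalance.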

What would settle it

Experiments showing no systematic correlation between ICL performance and the structure of the classification task in representations, or no increase in separability during successful ICL, would falsify the claim.
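To make the "increase in separability" test concrete, one could compare a separability score on representations early and late in the context. The sketch below uses simplified proxies, which are assumptions rather than the paper's metrics: `centroid_snr` is a plain centroid-distance-to-spread ratio and `participation_ratio` is a standard effective-dimension estimate. The `before` and `after` arrays are synthetic placeholders, not measurements.

```python
import numpy as np


def centroid_snr(X, y):
    """Squared distance between class centroids over mean within-class variance."""
    X0, X1 = X[y == 0], X[y == 1]
    signal = np.sum((X1.mean(axis=0) - X0.mean(axis=0)) ** 2)
    noise = 0.5 * (X0.var(axis=0).sum() + X1.var(axis=0).sum())
    return float(signal / noise)


def participation_ratio(X):
    """Effective dimension: (sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    eig = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return float(eig.sum() ** 2 / np.sum(eig ** 2))


rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
offset = np.zeros(10)
offset[0] = 1.0

# Synthetic placeholders: weakly vs strongly separated class clouds.
before = rng.normal(size=(200, 10)) + np.outer(y, 1.0 * offset)
after = rng.normal(size=(200, 10)) + np.outer(y, 4.0 * offset)

snr_before = centroid_snr(before, y)
snr_after = centroid_snr(after, y)
dim_before = participation_ratio(before)
print(f"SNR before: {snr_before:.2f}, after: {snr_after:.2f}")
```

Under the claim, the score should rise with the number of in-context examples on tasks where ICL succeeds; a flat or falling score on those tasks would count against it.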

Figures

Figures reproduced from arXiv: 2605.28854 by Hua-Dong Xiong, Kwonjoon Lee, Li Ji-An, Robert C. Wilson, Xue-Xin Wei.

**Figure 1.** Classification as untangling neural representations. Raw sensory inputs (e.g., at the retina) initially give rise to highly entangled representations, in which category manifolds are not yet linearly separable. Through successive stages of processing, both visual cortex and deep networks progressively untangle these representations so that category manifolds become increasingly linearly separable in later …

**Figure 2.** Representation-defined in-context classification task. (a) For each sentence $x_i$, we extract a sentence-level representation $h_i^{(\ell)}$ from a selected target layer $\ell$, project it onto a task-defining axis $w$, and binarize it into a label $y_i$. Background shading shows the decision regions of the logistic-regression axis, aligned with the model’s pretrained semantics for the given dataset. Point colors indicate …

**Figure 3.** In-context classification accuracy varies across task-defining axes. In-context classification accuracy as a function of the number of in-context examples, showing graded separation between easy task-defining axes (the LR axis and leading PCs) and difficult task-defining axes (trailing PCs). (a) Accuracy averaged over five target layers sampled at evenly spaced depth quantiles and over eight models of vary…

**Figure 4.** Schematic of how neural manifold geometry shapes classification difficulty. Each column illustrates one geometric property of class manifolds before (top) and after (bottom) successful in-context learning. Capacity: when class manifolds overlap (low capacity), linear classification is difficult; ICL separates the manifolds, yielding high capacity. Radius: large within-class spread hinders separation; ICL c…

**Figure 5.** Geometric reorganization of neural representations during ICL. Capacity, radius, dimension, and SNR of the final-layer representations across in-context examples, averaged over five target layers and eight models. Each curve denotes a different task-defining axis. The LR axis and leading PCs (PC1/2/4/8) exhibit strong geometric reorganization with in-context examples, with increasing capacity and SNR and d…

**Figure 6.** Comparison of different online algorithms in predicting the response of LLMs. Accuracy of six different online learners in predicting the LLM’s responses (descriptive fit), computed from final-layer representations and averaged over five target layers. The prototype model matches LLM behavior best. (a) Model fitting accuracy averaged over eight LLMs on ETHICS-commonsense. Shaded regions denote ±1 SEM acros…

**Figure 7.** Geometric reorganization across datasets (Qwen3-4B-Instruct-2507). Dataset-level counterpart of …

Original abstract

Large language models (LLMs) exhibit remarkable flexibility: they can adapt to novel tasks from in-context examples without any parameter updates, a capability known as in-context learning (ICL). Prior work on synthetic tasks has shown that ICL can implement specific algorithms, demonstrating architectural competence, and mechanistic analyses have identified key circuits that support this behavior. However, because in-context computation -- regardless of its algorithmic form -- relies on transformations in high-dimensional representation space, it remains unclear how the geometry of that space shapes ICL effectiveness. Motivated by the neuroscience view of classification as the untangling of neural representations, we hypothesize that ICL depends on the successful online untangling of task-relevant representations. To test this idea, we study how LLMs classify in-context examples whose labels are defined by the model's own internal representations with known structure. We show that ICL performance correlates systematically with the representational structure of the underlying classification task and that successful ICL is accompanied by geometric reorganization that increases online separability. We further find that LLM behavior is well described by a prototype-like algorithm that integrates evidence while reshaping representations to support classification. These findings offer a geometric account of ICL in pretrained LLMs, establish representational geometry as a mechanistic constraint on ICL, and quantify the gap between what pretrained representations afford and what in-context learning can exploit.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper links ICL success to geometric reorganization in LLM representations, but the task design risks circularity by drawing labels from the same internal structure it claims to untangle.

The main point is that ICL performance tracks the separability already present in the model's representations, and successful cases show further geometric shifts that increase that separability. LLM behavior here looks like a prototype classifier that updates on the fly.

What is new is the direct measurement of representational change during ICL on tasks built from the model's own hidden states, plus the correlation between initial geometry and final accuracy. This moves beyond the circuit-level accounts in prior synthetic work by adding a geometry constraint.

The paper does a reasonable job grounding the hypothesis in the untangling idea from neuroscience and in showing that pretrained representations leave a measurable gap that ICL can partially close.

The soft spot is exactly the one in the stress-test note. Because labels are defined from the model's existing representations, the selected tasks already have detectable structure; any reorganization or performance correlation could be an artifact of that selection rather than evidence that ICL generally requires online untangling. Without controls that assign labels independently of the same geometry, the central claim stays hard to separate from the method.

The methods and results sections would need to show that ICL fails cleanly on low-separability tasks even under different label assignments. If they do, the geometric account strengthens; if not, the finding is narrower.

This is for people working on mechanistic interpretability of ICL. A reader already following representation geometry in transformers would find the measurements useful to check against their own setups.

It deserves peer review so the methods can be examined in detail and the circularity concern can be tested directly.

Referee Report

1 major / 2 minor

Summary. The paper claims that in-context learning (ICL) in large language models depends on the successful online untangling of task-relevant representations. By defining classification labels directly from the model's own internal representations (with known structure), the authors report that ICL performance correlates systematically with the representational structure of the task, that successful ICL is accompanied by geometric reorganization increasing online separability, and that LLM behavior is well described by a prototype-like algorithm integrating evidence while reshaping representations to support classification. These results are presented as establishing representational geometry as a mechanistic constraint on ICL.

Significance. If the central claims hold after addressing methodological concerns, the work provides a geometric account of ICL that bridges neuroscience-inspired views of representation untangling with empirical LLM behavior. The identification of a prototype-like algorithm is a concrete strength, offering a falsifiable description of the mechanism. The use of model-derived tasks is a direct way to probe internal geometry, though its validity for testing the untangling hypothesis requires additional validation. This could inform mechanistic interpretability research if the correlation and reorganization findings are shown to be independent of task-selection artifacts.

major comments (1)

Task construction procedure (abstract and methods): defining classification labels from the LLM's pre-existing internal representations with known structure selects for tasks where some geometry is already detectable. This risks making the reported correlation between representational structure and ICL performance, as well as the reorganization that increases separability, an artifact of pre-alignment rather than evidence that ICL requires successful online untangling. The manuscript should include controls with labels defined independently of the same representations (e.g., random or external labels) to test whether ICL fails specifically when initial separability is low, independent of the selection procedure.

minor comments (2)

Clarify the exact metrics used to quantify 'representational structure' and 'online separability' (e.g., which distance or clustering measures) and report effect sizes or statistical tests for the claimed systematic correlations.
The description of the prototype-like algorithm would benefit from an explicit comparison to alternative models (e.g., linear classifiers) to substantiate that it 'well describes' the behavior.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting this important methodological consideration regarding task construction. We agree that additional controls are warranted to rule out selection artifacts and will incorporate them in the revision. Below we respond to the major comment.

Referee: Task construction procedure (abstract and methods): defining classification labels from the LLM's pre-existing internal representations with known structure selects for tasks where some geometry is already detectable. This risks making the reported correlation between representational structure and ICL performance, as well as the reorganization that increases separability, an artifact of pre-alignment rather than evidence that ICL requires successful online untangling. The manuscript should include controls with labels defined independently of the same representations (e.g., random or external labels) to test whether ICL fails specifically when initial separability is low, independent of the selection procedure.

Authors: We agree that the task-construction procedure merits additional controls to strengthen the claim that ICL depends on online untangling rather than pre-existing alignment. Our design intentionally uses model-derived labels to probe the geometry that the model itself has learned, allowing a direct test of whether ICL performance tracks the structure already present in the representations. The observed reorganization during ICL and the match to a prototype-like algorithm provide evidence that the model is actively reshaping representations rather than merely exploiting static structure. Nevertheless, to address the referee's concern directly, we will add two sets of control experiments in the revised manuscript: (1) tasks with randomly assigned labels (zero initial separability) and (2) tasks whose labels are defined by an external, non-representational criterion. These controls will test whether ICL performance collapses when initial separability is low, independent of the selection procedure used in the main experiments.

revision: yes
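The logic of the random-label control can be illustrated on toy data. The sketch below is an assumption-laden stand-in, not the proposed experiment: it uses synthetic vectors instead of LLM representations, a hypothetical `nearest_centroid_accuracy` readout instead of in-context prompting, and a label permutation as the random assignment.

```python
import numpy as np


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Fit class centroids on the training split, score on the test split."""
    mu0 = X_train[y_train == 0].mean(axis=0)
    mu1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - mu0, axis=1)
    d1 = np.linalg.norm(X_test - mu1, axis=1)
    pred = (d1 < d0).astype(int)
    return float(np.mean(pred == y_test))


rng = np.random.default_rng(0)
# Synthetic anisotropic "representations" (assumed data).
X = rng.normal(size=(400, 12)) * np.linspace(4.0, 0.5, 12)

# Geometry-defined labels: sign along the highest-variance coordinate.
y_axis = (X[:, 0] > 0).astype(int)
# Control: permuting the labels removes any link to the geometry
# while keeping the class balance identical.
y_rand = rng.permutation(y_axis)

train, test = np.arange(200), np.arange(200, 400)
acc_axis = nearest_centroid_accuracy(X[train], y_axis[train], X[test], y_axis[test])
acc_rand = nearest_centroid_accuracy(X[train], y_rand[train], X[test], y_rand[test])
print(f"axis-defined: {acc_axis:.2f}, random labels: {acc_rand:.2f}")
```

Random labels give a floor near chance. The external-criterion control is the more informative of the two, because its labels can carry real structure that was not selected using the representations themselves.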

Circularity Check

0 steps flagged

No significant circularity detected

The paper's abstract and described methods present an empirical study correlating ICL performance with representational structure on tasks whose labels are drawn from the model's own internal representations. No equations, fitted parameters renamed as predictions, or self-citation chains are provided that reduce any claimed result to its inputs by construction. The methodological choice to define labels from internal representations is a design decision for testing the untangling hypothesis rather than a self-definitional loop or load-bearing reduction in the derivation. The findings remain externally falsifiable via the reported correlations and geometric measures.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities; all elements of the central claim rest on unstated experimental details and the neuroscience-motivated hypothesis.
