Representation in large language models

Cameron Yetman

arXiv: 2501.00885 · v2 · submitted 2025-01-01 · cs.CL · cs.AI · cs.LG

Pith reviewed 2026-05-23 05:48 UTC · model grok-4.3

keywords: large language models, representations, information processing, memorization, interpretability, explanations, cognitive modeling

The pith

Large language models engage in representation-based information processing rather than relying solely on memorization and stochastic lookup.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper asks whether LLM behavior stems partly from representation-based information processing of the kind seen in biological cognition, or entirely from memorization and table lookup. It claims the former holds and supplies a set of practical techniques for locating and using those representations to build explanations. This distinction matters because it shapes answers to whether models possess beliefs, intentions, concepts, knowledge, or understanding. Agreement on the underlying algorithm can move stalled debates between optimists and pessimists forward. The techniques are presented as a concrete starting point for future work on language models and their successors.

Core claim

LLM behavior is partially driven by representation-based information processing of the sort implicated in biological cognition, and not driven entirely by processes of memorization and stochastic table look-up. This is a question about what kind of algorithm LLMs implement. The answer carries serious implications for higher level questions about whether these systems have beliefs, intentions, concepts, knowledge, and understanding. Practical techniques are described and defended for investigating these representations and developing explanations on their basis, providing groundwork for future theorizing.

What carries the argument

The operational distinction between representation-based information processing and memorization/stochastic table look-up, together with the practical techniques offered for probing internal representations to separate evidence for each.

If this is right

Explanations of specific LLM behaviors can be constructed by identifying and tracking the relevant internal representations.
Questions about whether LLMs possess beliefs, intentions, or understanding can be addressed by examining the representations that drive their outputs.
The described techniques supply a shared method that can narrow the gap between camps that currently disagree on how LLMs work.
The same approach supplies groundwork for theorizing about future language models and their successors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The techniques could be tested on other neural network architectures to see whether representation-based processing appears outside transformer-based LLMs.
If the distinction holds, it suggests that interpretability work should prioritize locating stable internal representations rather than treating all behavior as surface-level pattern matching.
The framework may connect to existing methods in cognitive science for distinguishing genuine conceptual processing from rote recall in humans and animals.
Practical use of the techniques might allow targeted interventions that edit or strengthen specific representations inside a deployed model.

Load-bearing premise

The distinction between representation-based information processing and memorization can be made operational and investigated with the described techniques so that evidence for one can be separated from evidence for the other.

What would settle it

A demonstration that every LLM output on a wide range of tasks can be fully accounted for by stochastic table lookup with no residual explanatory work left for internal representations.
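The logic of this test can be illustrated with a toy sketch. It is not taken from the paper, it ignores the stochastic part of "stochastic table look-up", and every name in it (`make_lookup_table`, `representation_predict`, the arithmetic task) is a hypothetical chosen for illustration. It contrasts a system that only stores input-output pairs with one that computes over a structured encoding of its input, and compares them on inputs neither has seen.

```python
# Toy illustration (hypothetical, not the paper's method): a pure lookup
# table versus a system that computes over a parsed form of its input.

def make_lookup_table(train_pairs):
    """Memorize exact (input -> output) pairs; nothing else is stored."""
    return dict(train_pairs)

def lookup_predict(table, x):
    """Return the memorized answer, or None for an unseen input."""
    return table.get(x)

def representation_predict(x):
    """Parse 'a+b' into two integers (a crude internal representation)
    and compute the answer from that structure."""
    left, right = x.split("+")
    return str(int(left) + int(right))

def accuracy(predict, pairs):
    """Fraction of pairs for which predict(input) equals the target."""
    return sum(predict(x) == y for x, y in pairs) / len(pairs)

train = [("1+1", "2"), ("2+3", "5"), ("4+4", "8")]
held_out = [("7+5", "12"), ("10+20", "30")]  # never seen during "training"

table = make_lookup_table(train)

lookup_train_acc = accuracy(lambda x: lookup_predict(table, x), train)
lookup_test_acc = accuracy(lambda x: lookup_predict(table, x), held_out)
repr_test_acc = accuracy(representation_predict, held_out)

print(lookup_train_acc, lookup_test_acc, repr_test_acc)
```

In this toy, the table is perfect on memorized inputs and fails on held-out ones, while the structured system succeeds on both. Real LLMs are far harder to assess, since their training data is vast and novelty is difficult to establish, so this only shows the shape of the inference, not evidence for either side.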

Original abstract

The extraordinary success of recent Large Language Models (LLMs) on a diverse array of tasks has led to an explosion of scientific and philosophical theorizing aimed at explaining how they do what they do. Unfortunately, disagreement over fundamental theoretical issues has led to stalemate, with entrenched camps of LLM optimists and pessimists often committed to very different views of how these systems work. Overcoming stalemate requires agreement on fundamental questions, and the goal of this paper is to address one such question, namely: is LLM behavior driven partly by representation-based information processing of the sort implicated in biological cognition, or is it driven entirely by processes of memorization and stochastic table look-up? This is a question about what kind of algorithm LLMs implement, and the answer carries serious implications for higher level questions about whether these systems have beliefs, intentions, concepts, knowledge, and understanding. I argue that LLM behavior is partially driven by representation-based information processing, and then I describe and defend a series of practical techniques for investigating these representations and developing explanations on their basis. The resulting account provides a groundwork for future theorizing about language models and their successors.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper claims LLMs do some representation-based processing rather than pure memorization and sketches techniques to study it, but supplies no data, examples, or worked methods.

The main point is that this paper stakes out a middle position in the ongoing debate: LLM behavior is partly driven by representation-based information processing, not just memorization or stochastic lookup, and it offers practical techniques to investigate those representations. That framing is the clearest contribution here. It does a solid job stating the stalemate between optimists and pessimists and narrowing it to one operational question about the algorithm being implemented. The author is direct about the implications for beliefs, concepts, and understanding. What is actually new is the proposal of a series of techniques for probing internal states and building explanations from them. The paper treats this as groundwork rather than a finished empirical separation. The soft spot is that the abstract (and the limited material available) gives no concrete examples, derivations, or results showing the techniques in action. Without that, it is hard to judge whether the distinction between representation and lookup can be made non-circular or falsifiable in practice. The central claim rests on the promise that such methods exist and can be defended, but nothing is demonstrated yet. This is for readers working in AI philosophy or interpretability who care about conceptual groundwork. It is not for someone seeking new measurements or formal results. The paper deserves a serious referee because it engages the literature plainly and identifies a usable question, even if the techniques section needs substantial expansion to carry the argument.

Referee Report

1 major / 0 minor

Summary. The paper argues that LLM behavior is partially driven by representation-based information processing, as opposed to being driven entirely by memorization and stochastic table look-up. It addresses the resulting theoretical stalemate between optimists and pessimists, and it describes and defends practical techniques for investigating these representations in order to ground future explanations about beliefs, intentions, and understanding in LLMs.

Significance. If the promised techniques can be shown to operationalize a non-circular distinction between representation-based processing and memorization, the work could supply shared methodological ground for higher-level debates on LLM cognition. The manuscript is explicitly positioned as groundwork rather than a completed empirical demonstration or formal proof.

major comments (1)

[Abstract] The central claim that LLM behavior is 'partially driven by representation-based information processing' and that this can be investigated via 'a series of practical techniques' is load-bearing, yet the text supplies no description, derivation, example, or data for any such technique, leaving the operationalizability of the representation/memorization distinction unaddressed.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their review and for highlighting the importance of operationalizing the representation/memorization distinction. The manuscript is explicitly framed as groundwork rather than a full empirical demonstration, and we address the specific concern about the abstract below.

Referee: [Abstract] The central claim that LLM behavior is 'partially driven by representation-based information processing' and that this can be investigated via 'a series of practical techniques' is load-bearing, yet the text supplies no description, derivation, example, or data for any such technique, leaving the operationalizability of the representation/memorization distinction unaddressed.

Authors: The abstract is a high-level summary and therefore does not contain detailed descriptions, derivations, or examples; these appear in the body of the paper. That said, the referee is correct that the abstract itself provides no concrete illustration of the techniques or of their ability to draw a non-circular distinction. We will revise the abstract to indicate briefly the kinds of techniques discussed (e.g., activation patching, representation similarity analyses, and causal intervention methods) and to note that the paper defends their utility as investigative tools rather than presenting new empirical results. Because the work is positioned as conceptual groundwork, it does not include original datasets or large-scale experiments; the techniques are described and motivated at the level appropriate to that framing.

Revision: yes
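For readers unfamiliar with the techniques named in the rebuttal, the following is a minimal sketch of the general idea behind activation patching, written against a hand-built toy function rather than a real model. It is an assumption-laden illustration, not the paper's procedure: the `forward` function, its two hidden units, and the inputs are all hypothetical. The idea is to cache an intermediate activation from a "clean" run, overwrite the same site during a "corrupted" run, and check whether the clean output is restored.

```python
# Toy activation patching sketch (hypothetical model, not from the paper).

def forward(x, patch=None):
    """Toy two-unit 'network'.

    x     : pair of numbers (a, b)
    patch : optional dict mapping hidden-unit index -> value to overwrite
    Returns (output, hidden activations).
    """
    a, b = x
    hidden = [a + b, a - b]          # two intermediate "activations"
    if patch:
        for index, value in patch.items():
            hidden[index] = value    # causal intervention at this site
    output = 2 * hidden[0]           # the output reads only unit 0
    return output, hidden

clean_x = (3, 1)
corrupt_x = (0, 5)

clean_out, clean_hidden = forward(clean_x)
corrupt_out, _ = forward(corrupt_x)

# Patch each hidden unit in turn with its value from the clean run.
patched_unit0, _ = forward(corrupt_x, patch={0: clean_hidden[0]})
patched_unit1, _ = forward(corrupt_x, patch={1: clean_hidden[1]})

for name, value in [("clean", clean_out), ("corrupt", corrupt_out),
                    ("patch unit 0", patched_unit0),
                    ("patch unit 1", patched_unit1)]:
    print(name, value)
```

Patching unit 0 restores the clean output while patching unit 1 leaves the corrupted output unchanged, which localizes the causally relevant information to unit 0. In a real transformer the same logic is applied to layer or head activations, typically through a hooking mechanism, and the results are far less clean than in this toy.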

Circularity Check

0 steps flagged

No significant circularity identified

The paper is framed explicitly as groundwork for investigating a distinction between representation-based processing and memorization/stochastic lookup, without presenting equations, fitted parameters, or any derivation chain. The central claim is that LLM behavior is partially driven by the former and that practical techniques can operationalize the distinction; these techniques are positioned as the means to separate the two rather than presupposing one in the definition of the other. No self-citations, ansatzes, or uniqueness theorems are invoked in the provided text to bear the load of the argument. The manuscript does not reduce any prediction or result to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper is a philosophical argument that relies on background assumptions about cognition, representation, and what counts as evidence for internal processing in artificial systems. No free parameters or invented physical entities are introduced in the abstract.

axioms (2)

[domain assumption] Biological cognition involves representation-based information processing that can be meaningfully contrasted with pure memorization.
Invoked when the paper poses the question about whether LLMs implement the same kind of processing.

[ad hoc to paper] It is possible to develop practical techniques that distinguish representation-based processing from memorization in LLMs.
This is the load-bearing premise that allows the author to move from the claim to actionable investigation methods.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

When Behavioral Safety Evaluation Fails: A Representation-Level Perspective
cs.LG · 2026-06 · unverdicted · novelty 6.0

Behavioral safety metrics for LLMs are insufficient because models can maintain safe outputs while remaining vulnerable to latent-space interventions, as shown via dissociated models and the new Latent Vulnerability Score.