From Chains to DAGs: Probing the Graph Structure of Reasoning in LLMs

Linyang He; Nima Mesgarani; Tianjun Zhong

arxiv: 2601.17593 · v2 · submitted 2026-01-24 · 💻 cs.CL

Tianjun Zhong, Linyang He, Nima Mesgarani

Pith reviewed 2026-05-16 10:58 UTC · model grok-4.3

classification 💻 cs.CL

keywords LLM reasoning · directed acyclic graphs · hidden state probing · graph structure · multi-step reasoning · intermediate layers · dependency graph recovery

The pith

LLM hidden states encode directed acyclic graph structure for reasoning tasks, peaking in intermediate layers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper asks whether large language models represent multi-step reasoning as directed acyclic graphs rather than simple linear chains, where intermediate steps can branch, merge, and be reused. It introduces Reasoning DAG Probing, which pairs textual descriptions of reasoning nodes with model hidden states and trains linear probes to recover node depth, pairwise distances, and adjacency relations. If the structure is present, these probes should succeed above controls that preserve text but break reasoning dependencies, with performance varying by layer, node position, and model size. A sympathetic reader cares because this would mean reasoning is not purely sequential inside the model and could be inspected or steered at the graph level. Experiments across benchmarks show nontrivial recovery of the underlying DAGs, strongest in middle layers.

Core claim

We associate each node in a reasoning DAG with a textual realization and train lightweight linear probes on LLM hidden states to predict node depth, pairwise distance, and adjacency. Across reasoning benchmarks the probes recover these properties above control baselines, with accuracy peaking in intermediate layers and varying systematically by node depth, edge span, and model scale, enabling approximate reconstruction of the dependency graph.

What carries the argument

Reasoning DAG Probing, a framework that trains linear probes on hidden states to recover depth, distance, and adjacency properties of an underlying reasoning DAG.
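
To make the probing idea concrete, here is a minimal sketch of a linear depth probe, assuming NumPy. It is not the authors' code: the hidden states are synthetic with a planted depth direction, the helper names (`fit_ridge_probe`, `probe_vs_control`) are hypothetical, and a shuffled-label control stands in for the paper's text-preserving controls, whose construction the abstract does not specify.

```python
import numpy as np

def fit_ridge_probe(H, y, lam=1.0):
    """Fit a linear probe y ~ H @ w + b with an L2 penalty on w (bias unpenalized)."""
    mu_h, mu_y = H.mean(axis=0), y.mean()
    Hc = H - mu_h
    w = np.linalg.solve(Hc.T @ Hc + lam * np.eye(H.shape[1]), Hc.T @ (y - mu_y))
    return w, mu_y - mu_h @ w

def r2(y_true, y_pred):
    """Coefficient of determination on held-out examples."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def probe_vs_control(H, depth, seed=0, train_frac=0.75):
    """Held-out R^2 of a depth probe and of a probe trained on shuffled labels."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(depth))
    cut = int(train_frac * len(depth))
    tr, te = idx[:cut], idx[cut:]
    w, b = fit_ridge_probe(H[tr], depth[tr])
    real = r2(depth[te], H[te] @ w + b)
    # Assumed stand-in control: break the state-label link by shuffling labels.
    w_c, b_c = fit_ridge_probe(H[tr], rng.permutation(depth[tr]))
    control = r2(depth[te], H[te] @ w_c + b_c)
    return real, control

# Synthetic stand-in for one layer's hidden states (not real model activations):
# Gaussian noise plus a single direction whose magnitude scales with node depth.
rng = np.random.default_rng(0)
n, d = 400, 32
depth = rng.integers(0, 5, size=n).astype(float)
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
H = rng.normal(size=(n, d)) + 2.0 * depth[:, None] * direction

real_r2, control_r2 = probe_vs_control(H, depth)
print(f"probe R^2 = {real_r2:.2f}, shuffled-label control R^2 = {control_r2:.2f}")
```

Running the same comparison once per layer, with real activations in place of `H`, would give the layer-wise recoverability curve the paper describes.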

If this is right

DAG recoverability peaks in intermediate layers rather than early or final layers.
Probe performance varies systematically with node depth and the span of edges connecting nodes.
Larger models exhibit stronger encoding of the graph structure than smaller ones.
Approximate reasoning dependency graphs can be reconstructed from hidden-state probes.
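
One way the last point could work in practice is sketched below. The decoding rule is an assumption for illustration, not the paper's procedure: keep an edge when its adjacency score clears a threshold and it runs from a shallower to a deeper predicted node. The function names and toy scores are hypothetical.

```python
def reconstruct_dag(edge_scores, pred_depth, threshold=0.5):
    """Keep edges u -> v with score >= threshold and pred_depth[u] < pred_depth[v].

    Every kept edge strictly increases predicted depth, so the result
    cannot contain a directed cycle.
    """
    return sorted(
        (u, v)
        for (u, v), score in edge_scores.items()
        if score >= threshold and pred_depth[u] < pred_depth[v]
    )

def edge_f1(pred_edges, gold_edges):
    """F1 over directed edges between a predicted and a reference graph."""
    pred, gold = set(pred_edges), set(gold_edges)
    tp = len(pred & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Hypothetical probe outputs for a four-node problem (illustrative values only).
gold = [("a", "c"), ("b", "c"), ("c", "d")]
pred_depth = {"a": 0.1, "b": 0.2, "c": 1.1, "d": 1.9}
edge_scores = {
    ("a", "c"): 0.9,
    ("b", "c"): 0.7,
    ("c", "d"): 0.8,
    ("b", "d"): 0.6,  # spurious but depth-consistent: kept as a false positive
    ("d", "a"): 0.6,  # points from deeper to shallower: dropped
    ("a", "d"): 0.3,  # below threshold: dropped
}

recovered = reconstruct_dag(edge_scores, pred_depth)
f1 = edge_f1(recovered, gold)
```

The depth filter trades recall for a guarantee of acyclicity; a decoder that trusted adjacency scores alone would need a separate cycle-breaking step.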

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Interventions at intermediate layers might improve complex reasoning by strengthening graph-like reuse of intermediate conclusions.
Models could be trained or evaluated explicitly on their ability to maintain consistent DAG structure during generation.
This internal graph view may explain why chain-of-thought prompting helps: it externalizes an already graph-shaped computation.

Load-bearing premise

The chosen textual realizations of nodes match the intended reasoning DAG nodes, and linear probes on hidden states can detect the graph structure if it exists.

What would settle it

On held-out reasoning problems with known DAGs, if linear probes trained on hidden states predict adjacency or depth no better than a text-only baseline that ignores model internals, the claim that DAG structure is encoded would be falsified.
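
A test of this kind needs a decision rule for "no better than". The sketch below uses a paired bootstrap over held-out examples, which is one common choice and an assumption here; the paper's abstract does not say how such a comparison would be scored. The correctness vectors are hypothetical placeholders, not results.

```python
import random

def paired_bootstrap_ci(correct_a, correct_b, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrap CI for the accuracy gap between two probes on the same examples.

    correct_a / correct_b are 0/1 lists aligned by held-out example.
    Returns (mean_gap, ci_low, ci_high).
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("inputs must be paired example by example")
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = [a - b for a, b in zip(correct_a, correct_b)]
    means = []
    for _ in range(n_boot):
        # Resample examples with replacement, keeping each pair together.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return sum(diffs) / n, lo, hi

# Hypothetical per-example outcomes (1 = adjacency predicted correctly).
hidden_probe_correct = [1] * 80 + [0] * 20
text_baseline_correct = [1] * 55 + [0] * 45

gap, ci_low, ci_high = paired_bootstrap_ci(hidden_probe_correct, text_baseline_correct)
```

If the interval for the gap included zero on real held-out data, the hidden-state probe would not be distinguishable from the text-only baseline.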

Original abstract

Recent progress in large language models has renewed interest in how multi-step reasoning is represented internally. While prior work often treats reasoning as a linear chain, many reasoning problems are more naturally modeled as directed acyclic graphs (DAGs), where intermediate conclusions branch, merge, and are reused. Whether such graph structure is reflected in model internals remains unclear. We introduce Reasoning DAG Probing, a framework for testing whether LLM hidden states linearly encode properties of an underlying reasoning DAG and where this structure emerges across layers. We associate each reasoning node with a textual realization and train lightweight probes to predict node depth, pairwise distance, and adjacency from hidden states. Using these probes, we analyze the emergence of DAG structure across layers, reconstruct approximate reasoning graphs, and evaluate controls that disrupt reasoning-relevant structure while preserving surface text. Across reasoning benchmarks, we find that DAG structure is meaningfully encoded in LLM representations, with recoverability peaking in intermediate layers, varying systematically by node depth, edge span, and model scale, and enabling nontrivial recovery of dependency graphs. These findings suggest that LLM reasoning is not purely sequential, but exhibits measurable internal graph structure.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The new DAG probing framework is a useful extension beyond chain analyses, but the results may still be driven by surface text cues rather than genuine internal graph encoding.

The paper's main contribution is a probing method that checks whether LLM hidden states linearly encode properties of a reasoning DAG (node depth, pairwise distances, and adjacency) rather than treating reasoning as a simple chain. They report that recoverability peaks in intermediate layers and varies with node depth, edge span, and model scale, which lets them reconstruct approximate dependency graphs from activations alone. That is a clear step past the linear-chain focus in most prior interpretability work on reasoning. The controls that disrupt logical structure while keeping surface text are a reasonable attempt to isolate the effect, and the layer-wise and scale patterns give readers something concrete to test in follow-up experiments. The citation pattern looks standard for probing papers and does not rely on self-citation for the core claim.

The central argument holds up in outline: the setup is straightforward, the probes are lightweight, and the question about graph versus chain structure is worth asking. The soft spot is exactly the one flagged in the stress-test note. Each node is tied to a textual realization, so probes could succeed by learning correlations between wording and graph position, for example if certain phrases reliably mark deeper nodes or longer edges, without the model itself maintaining an abstract DAG. Because the controls preserve surface text, they do not rule this out. The abstract does not describe an explicit check for lexical predictors or independent verification that the ground-truth DAGs are defined without reference to the chosen wording. That leaves the strongest claim, that DAG structure is meaningfully encoded internally, on weaker footing than the framework itself.

This is for readers working on mechanistic interpretability of reasoning or on ways to make multi-step inference more graph-aware. A serious referee should press on the text confound and ask for the exact control details and any lexical baseline results. I would send it to review rather than desk-reject.

Referee Report

2 major / 2 minor

Summary. The paper introduces Reasoning DAG Probing, a framework that associates reasoning nodes with textual realizations and trains linear probes on LLM hidden states to predict node depth, pairwise distance, and adjacency. It reports that DAG structure is encoded in representations, with recoverability peaking in intermediate layers, varying systematically by node depth, edge span, and model scale, and enabling nontrivial reconstruction of dependency graphs from hidden states across benchmarks.

Significance. If the central claims hold after addressing controls, this would provide concrete evidence that LLM multi-step reasoning is represented internally as directed acyclic graphs rather than linear chains, with layer-specific emergence patterns. The probing approach and controls that preserve surface text while disrupting reasoning structure represent a useful methodological contribution for interpretability research, potentially informing future work on graph-based reasoning models and layer-wise analysis.

major comments (2)

[Framework / Probing Setup] The description of ground-truth DAG construction from benchmarks and node-text associations (likely in the framework section) does not explicitly demonstrate independence from surface textual cues (e.g., logical connectives correlating with depth). Without this, the linear probes may recover graph properties via text correlations rather than model-internal encoding, directly undermining the claim of meaningful DAG structure in hidden states.
[Controls and Results] In the controls evaluation (likely §4), the paper states that controls disrupt reasoning-relevant structure while preserving surface text and reports nontrivial recovery, but provides no quantitative before/after probe accuracy deltas or ablation on whether text-graph correlations are broken. This leaves open whether the reported layer-wise peaks and scale variations reflect internal structure or dataset artifacts.

minor comments (2)

[Abstract / Experiments] The abstract and results sections should list the specific reasoning benchmarks used, as generalizability claims depend on their diversity and how DAGs were extracted.
[Methods] Notation for pairwise distance and adjacency predictions could be clarified with a small example table showing input hidden states and target labels.
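
The example table requested in the last comment could be generated from a toy graph along the following lines. This is a sketch under stated assumptions: depth is taken as the longest path from any premise, and pairwise distance as the hop count on the undirected skeleton. The paper's exact definitions are not given in the abstract, and the graph and function names are hypothetical.

```python
from collections import deque

# Hypothetical toy reasoning DAG; edges point from premise to conclusion.
NODES = ["p1", "p2", "c1", "c2", "c3"]
EDGES = [("p1", "c1"), ("p2", "c1"), ("p2", "c2"), ("c1", "c3"), ("c2", "c3")]

def node_depths(nodes, edges):
    """Depth = longest path from any source node (assumed definition)."""
    children = {n: [] for n in nodes}
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        children[u].append(v)
        indeg[v] += 1
    depth = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indeg[n] == 0)
    while queue:  # Kahn-style topological sweep
        u = queue.popleft()
        for v in children[u]:
            depth[v] = max(depth[v], depth[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return depth

def pairwise_distances(nodes, edges):
    """Shortest hop count between nodes, ignoring edge direction (assumed definition)."""
    nbrs = {n: set() for n in nodes}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    dist = {}
    for s in nodes:
        seen = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in nbrs[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        for t, hops in seen.items():
            dist[(s, t)] = hops
    return dist

def adjacency_labels(nodes, edges):
    """Binary target for every ordered pair: 1 if u -> v is an edge."""
    edge_set = set(edges)
    return {(u, v): int((u, v) in edge_set) for u in nodes for v in nodes if u != v}

depth = node_depths(NODES, EDGES)
dist = pairwise_distances(NODES, EDGES)
adj = adjacency_labels(NODES, EDGES)
```

Each probe target then pairs with a hidden state: `depth[n]` with the state for node `n`, and `dist[(u, v)]` or `adj[(u, v)]` with the states for the pair `(u, v)`.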

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important aspects of our framework and evaluation that require clarification. We address each major point below and will revise the manuscript to strengthen the evidence that our probes capture internal DAG structure rather than surface textual artifacts.

Referee: [Framework / Probing Setup] The description of ground-truth DAG construction from benchmarks and node-text associations (likely in the framework section) does not explicitly demonstrate independence from surface textual cues (e.g., logical connectives correlating with depth). Without this, the linear probes may recover graph properties via text correlations rather than model-internal encoding, directly undermining the claim of meaningful DAG structure in hidden states.

Authors: We agree that the current description does not sufficiently demonstrate independence from surface textual cues. In the revised manuscript, we will expand the framework section with a detailed account of the node-text association process, explicitly showing how textual realizations are tied to reasoning steps extracted from the benchmarks rather than surface features such as connectives. We will also add an ablation comparing probe performance on original hidden states against a text-only baseline (e.g., probes trained on bag-of-words or connective features) and on paraphrased node texts that preserve surface form but alter structural cues, to quantify that the reported accuracies reflect model-internal encodings.
Revision: yes
Referee: [Controls and Results] In the controls evaluation (likely §4), the paper states that controls disrupt reasoning-relevant structure while preserving surface text and reports nontrivial recovery, but provides no quantitative before/after probe accuracy deltas or ablation on whether text-graph correlations are broken. This leaves open whether the reported layer-wise peaks and scale variations reflect internal structure or dataset artifacts.

Authors: We acknowledge that the controls section lacks the requested quantitative details. In the revision, we will add explicit tables reporting probe accuracy before and after each control condition across layers, benchmarks, and model scales, including the deltas. We will further include an analysis of text-graph correlations (e.g., Pearson correlations between textual features such as connective frequency and graph properties such as depth or adjacency) computed on the original and controlled data to confirm that these correlations are broken while surface text statistics remain comparable.
Revision: yes
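
The text-graph correlation check promised in this response might look like the sketch below. The connective list, the tokenizer, and the toy node texts are all assumptions made for illustration; none of them come from the paper.

```python
import math
import re

# Assumed connective inventory; a real analysis would need a vetted list.
CONNECTIVES = {"therefore", "thus", "hence", "because", "so", "since"}

def connective_count(text):
    """Number of tokens in the node text that are listed connectives."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(tok in CONNECTIVES for tok in tokens)

def pearson(xs, ys):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)

# Hypothetical node texts paired with their depth in a toy DAG.
toy_nodes = [
    ("All birds have wings.", 0),
    ("Tweety is a bird.", 0),
    ("Therefore Tweety has wings.", 1),
    ("Thus, because Tweety has wings, Tweety can fly.", 2),
]

counts = [connective_count(text) for text, _ in toy_nodes]
depths = [d for _, d in toy_nodes]
r = pearson(counts, depths)
```

In this toy case connective count tracks depth perfectly, which is exactly the confound the referee describes: a probe could score well on depth from wording alone.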

Circularity Check

0 steps flagged

No significant circularity in the probing framework

The paper trains linear probes on LLM hidden states to recover externally defined DAG properties (node depth, pairwise distance, adjacency) taken directly from benchmark reasoning graphs. These ground-truth properties are independent inputs derived from the benchmarks rather than from the model's representations or any fitted parameters within the paper. Probe accuracy, layer-wise emergence, and controls that preserve surface text while disrupting structure are measured outcomes, not reductions by construction. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the derivation chain. The framework is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that linear probes can extract graph structure from hidden states and that the constructed textual reasoning problems possess true underlying DAGs that models might encode. The only free parameters are the fitted probe weights, and the abstract describes no invented entities.

free parameters (1)

probe weights
Lightweight linear probes are trained on hidden states, introducing fitted parameters.

axioms (1)

domain assumption: Linear probes are sufficient to detect encoded graph properties in hidden states
The method relies on training linear probes to predict node depth, pairwise distance, and adjacency.

