arxiv: 2605.20784 · v1 · pith:GV3OLV5Xnew · submitted 2026-05-20 · 💻 cs.AI · cs.LG

Interaction Locality in Hierarchical Recursive Reasoning

Pith reviewed 2026-05-21 05:00 UTC · model grok-4.3

classification 💻 cs.AI cs.LG
keywords interaction locality · hierarchical recursive reasoning · activation patching · spatial reasoning · information flow · recurrent states · grid tasks

The pith

High-level recurrent states in recursive reasoning models write information locally, accumulating into global solutions through repeated updates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes interaction locality as a framework to measure whether information flow in spatial reasoning stays local to nearby cells or semantic segments. It tests this on hierarchical recursive models HRM and TRM using activation patching on tasks like Maze-Hard, Sudoku Extreme, and ARC-AGI. The results show that recurrent states handle local writes while recursion builds these into broader structures. A similar but boundary-focused pattern appears in an embodied 3D model. This offers a concrete way to track how models balance local computation with global planning in complex spatial problems.

Core claim

Across these models, activation patching gives the clearest architectural fingerprint: high-level recurrent states tend to write information within nearby cells or same-segment units, while repeated recursive updates accumulate these local writes into broader solution structure. This pattern holds across maze paths, Sudoku constraints, and ARC-AGI object neighborhoods, with the strongest concentration in TRM.

What carries the argument

Interaction locality: a task-geometry-aware framework for measuring whether information flow stays within nearby cells or semantic segments or crosses them.
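To make the idea concrete, one possible score asks what share of an intervention's total effect lands close to where the intervention was applied. The sketch below is an illustration only: it assumes a grid task, Manhattan distance as the task geometry, and a hypothetical `effect[source][target]` table of non-negative effect sizes. The names `locality_score` and `radius` are invented here and are not taken from the paper, which may define its score differently.

```python
def manhattan(a, b):
    """Grid distance between two (row, col) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def locality_score(effect, cells, radius=1):
    """Share of total effect mass that lands within `radius` of its source.

    effect[i][j] is an assumed non-negative effect of intervening at
    cells[i] on the output at cells[j]. Returns a value in [0, 1];
    1.0 means every measured effect stayed local.
    """
    local = 0.0
    total = 0.0
    for i, src in enumerate(cells):
        for j, dst in enumerate(cells):
            total += effect[i][j]
            if manhattan(src, dst) <= radius:
                local += effect[i][j]
    return local / total if total > 0 else 0.0


# Toy 1x4 corridor: each source affects itself strongly, neighbours weakly.
cells = [(0, 0), (0, 1), (0, 2), (0, 3)]
effect = [
    [4.0, 1.0, 0.0, 0.0],
    [1.0, 4.0, 1.0, 0.0],
    [0.0, 1.0, 4.0, 1.0],
    [0.0, 0.0, 1.0, 4.0],
]
score = locality_score(effect, cells, radius=1)
```

For semantic segments rather than cells, the distance test could be swapped for a same-segment check (for example, same Sudoku row or same maze corridor); that variant is likewise an assumption about how the framework might be instantiated.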

If this is right

  • High-level recurrent states concentrate their information writes within nearby cells or same-segment units.
  • Repeated recursive updates accumulate local writes to form broader solution structures.
  • This local accumulation pattern is observed consistently across maze, Sudoku, and ARC-AGI tasks.
  • The strongest concentration of locality occurs in the TRM model.
  • In embodied 3D models, spatial locality concentrates at the handoff between visual features and the grounding module.
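The second bullet, that repeated local writes can add up to global structure, can be illustrated with a toy that has nothing to do with HRM or TRM internals. In the sketch below, each cell of a one-dimensional corridor may only read itself and its immediate neighbours per update, yet a signal placed in one cell reaches the whole corridor after enough updates. The functions `local_update` and `reach_after` are hypothetical names for this illustration.

```python
def local_update(state):
    """One update: each cell reads only itself and its immediate neighbours."""
    n = len(state)
    return [
        any(state[j] for j in (i - 1, i, i + 1) if 0 <= j < n)
        for i in range(n)
    ]


def reach_after(steps, n=8):
    """Number of cells that carry the signal after `steps` local updates."""
    state = [i == 0 for i in range(n)]  # signal starts in cell 0 only
    for _ in range(steps):
        state = local_update(state)
    return sum(state)


# Reach grows by one cell per update until the corridor is covered.
reach = [reach_after(k) for k in range(9)]
```

Each single update is strictly local, so any one-step measurement would look local; only the composition over steps has global reach. Whether the real models behave this way is what the paper's measurements are meant to test.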

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The framework could be applied to additional reasoning models to identify which architectures naturally support local interaction patterns.
  • Designing models to explicitly encourage local writes might lead to more efficient training or inference on spatial tasks.
  • The contrast between recursive and embodied models suggests that explicit recursion may be required for distributed local processing throughout the network.
  • Future experiments could test if disrupting local writes specifically impairs performance on tasks requiring hierarchical planning.

Load-bearing premise

The chosen intervention techniques of sparse-autoencoder feature ablations and finite-noise activation patching provide faithful causal measurements of information flow without substantial artifacts.
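For readers unfamiliar with the second technique, the following is a minimal sketch of what a finite-noise patching measurement can look like, under stated assumptions. The `readout` function is a toy stand-in for whatever computation follows the patched state, and the noise scale, sample count, and effect measure are all choices made for this illustration; the paper's actual protocol is not specified on this page.

```python
import random


def readout(hidden):
    """Toy stand-in for the downstream computation: a fixed local mixing."""
    n = len(hidden)
    out = []
    for i in range(n):
        left = hidden[i - 1] if i > 0 else 0.0
        right = hidden[i + 1] if i < n - 1 else 0.0
        out.append(0.5 * hidden[i] + 0.25 * left + 0.25 * right)
    return out


def noise_patch_effect(hidden, position, sigma=1.0, samples=200, seed=0):
    """Mean absolute output change per position when `position` is noised."""
    rng = random.Random(seed)
    clean = readout(hidden)
    effect = [0.0] * len(hidden)
    for _ in range(samples):
        patched = list(hidden)
        patched[position] += rng.gauss(0.0, sigma)  # finite-noise intervention
        noisy = readout(patched)
        for i, (a, b) in enumerate(zip(clean, noisy)):
            effect[i] += abs(a - b) / samples
    return effect


# Noise the middle of a 5-cell state and see where the output moves.
effect = noise_patch_effect([1.0, 2.0, 3.0, 4.0, 5.0], position=2)
```

In this toy the effect is largest at the patched position, smaller at its neighbours, and zero elsewhere, because the stand-in readout is local by construction. The premise above amounts to assuming that a similar measurement on a real model reflects the model's own information flow rather than side effects of the noise.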

What would settle it

The core claim would be weakened if models with disrupted local write patterns maintained high performance on Maze-Hard, Sudoku Extreme, and ARC-AGI, or if non-recursive models exhibited the same accumulation effect under activation patching.

Figures

Figures reproduced from arXiv: 2605.20784 by Tetsuro Morimura, Yosuke Miyanishi.

Figure 1: Interaction-locality framework. Panel (a) shows the task geometries that define local ...
Figure 2: H/L hidden-state change across reasoning steps for HRM (top) and TRM (bottom). Blue ...
Figure 3: Initial, step-1, and final decoded outputs for all model–task pairs (HRM top row, TRM ...
Figure 4: Representative SAE semantic locality examples. Left: a Maze HRM feature whose ablation ...
Figure 5: Finite-noise activation patching analogs for Maze and Sudoku. Left column: normalized ...
Figure 6: Interaction locality on MTU3D/ScanNet. Bars show locality scores with 95% CIs; the ...
Figure 7: Q-value evolution across reasoning steps for HRM and TRM on all three tasks. The scalar ...
Figure 9: Selected first-step cycle decodes for Sudoku HRM. The two columns show the H-cycle ...
Figure 11: Selected first-step cycle decodes for ARC-AGI HRM. The two columns show H-cycle ...
Figure 12: Selected first-step cycle decodes for ARC-AGI TRM. As in Maze and Sudoku, these ...
Figure 13: Full top-segment qualitative analysis. The strongest human-readable positive cases are the ...
Figure 14: Maze-TRM segment-level peak comparison. The left panel shows corridor segments on ...
Figure 15: Additional quantitative Jacobian segment analyses. Panel (a) breaks Sudoku Jacobian ...
Figure 16: Per-position Jacobian qualitative analysis for the selected high-variance Maze-TRM ...
Figure 17: Per-position Jacobian qualitative analysis for the selected high-variance Sudoku-TRM ...
Figure 19: Raw accuracy-drop curves and self-drop fractions for the finite-noise activation patching.
Figure 20: Activation-difference heatmaps for HRM/Maze patching analogs. Rows are source ...
Figure 21: Activation-difference heatmaps for TRM/Maze patching analogs. The within-L channel is ...
Figure 22: Activation-difference heatmaps for HRM/Sudoku patching analogs. Sudoku shows ...
Figure 23: Activation-difference heatmaps for TRM/Sudoku patching analogs. Several cross channels ...
Figure 24: Cross-dataset finite-noise patching locality from the experiment summarized numerically ...
Figure 25: Sudoku constraint-type breakdown for all three probe methods, HRM (top) and TRM ...
Figure 26: Qualitative illustration of zero-ablation patching on two ScanNet scenes (top: ...
Figure 27: MTU3D locality across checkpoints. Stage 1–2 input-patching locality is stable across ...
Figure 28: MTU3D attention weight versus object-pair distance. Attention contains spatial distance ...
Figure 29: Position-level L→L cross-cycle Jacobian heatmaps for early and late cycle pairs. The heatmaps visualize the same transition captured by same-position diagonal concentration: early pairs are more self-focused and later pairs show broader spatial attribution
Original abstract

Spatial reasoning requires both location-bound computation and location-invariant structure: agents must make local moves while preserving route, object, or constraint-level plans. We propose interaction locality, a task-geometry-aware framework for measuring whether information flow stays within nearby cells or semantic segments, or crosses them. We instantiate the framework with sparse-autoencoder feature ablations and finite-noise activation patching, with structural Jacobian and attention checks reported in the appendix, and apply it to HRM and TRM, two compact hierarchical and recursive reasoning models, on Maze-Hard, Sudoku Extreme, and ARC-AGI. Across these models, activation patching gives the clearest architectural fingerprint: high-level recurrent states tend to write information within nearby cells or same-segment units, while repeated recursive updates accumulate these local writes into broader solution structure. This pattern holds across maze paths, Sudoku constraints, and ARC-AGI object neighborhoods, with the strongest concentration in TRM. To test whether interaction locality extends beyond toy-yet-challenging grid benchmarks, we also apply it to MTU3D, a large-scale embodied 3D scene-grounding model. In this MTU3D setting, causal spatial locality appears primarily at the transition where visual scene features are handed to the downstream grounding module, rather than uniformly throughout the visual encoder. This contrast suggests that the local-to-global handoff observed in HRM and TRM is tied to explicit recursive reasoning dynamics, while embodied 3D models may concentrate causal spatial structure at module boundaries. Interaction locality turns the intuitive local-execution/global-planning story into a reproducible measurement framework for recursive and embodied spatial reasoning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces 'interaction locality' as a task-geometry-aware framework for quantifying whether information flow in spatial reasoning models remains confined to nearby cells or semantic segments. It instantiates the framework via sparse-autoencoder feature ablations and finite-noise activation patching (with Jacobian and attention checks in the appendix) and applies it to the hierarchical recursive models HRM and TRM on Maze-Hard, Sudoku Extreme, and ARC-AGI, as well as to the embodied model MTU3D. The central empirical claim is that high-level recurrent states produce local writes that repeated recursive updates then accumulate into broader solution structure, with the pattern strongest in TRM and appearing at module boundaries in MTU3D.

Significance. If the causal interventions prove faithful, the framework supplies a reproducible measurement tool that turns the local-execution/global-planning intuition into falsifiable, cross-model comparisons. The extension from grid-based recursive reasoning to 3D embodied grounding is a useful contrast that could inform architectural choices in spatial AI systems.

major comments (2)
  1. [§4] §4 (Activation Patching Results): The headline claim that 'activation patching gives the clearest architectural fingerprint' of local writes in high-level recurrent states rests on finite-noise interventions. In recursive models such as HRM and TRM, such interventions can break the consistency required for information to propagate across segments, potentially inducing the observed locality as an artifact rather than revealing an intrinsic property. A direct comparison to zero-noise or alternative causal probes is needed to establish that the locality is not measurement-induced.
  2. [Appendix] Appendix (Jacobian and Attention Checks): The structural Jacobian and attention analyses are presented as supporting evidence, yet their independence from the finite-noise patching family is not demonstrated. If these checks rely on the same intervention style or post-hoc interpretation, they do not constitute an independent validation of the local-to-global accumulation pattern.
minor comments (2)
  1. [Abstract and Results] The abstract and results sections would benefit from explicit quantitative metrics (e.g., locality scores with error bars or statistical tests) rather than qualitative descriptions of 'strongest concentration' or 'clearest fingerprint'.
  2. [§3] Notation for 'interaction locality' is introduced descriptively; an explicit mathematical definition or distance metric in the main text would improve reproducibility.
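As an illustration of the kind of reporting the first minor comment asks for, a locality score averaged over examples could be accompanied by a percentile bootstrap interval. The sketch below assumes one locality score per evaluated example; the scores listed are made up for the example and are not results from the paper, and `bootstrap_ci` is a hypothetical helper.

```python
import random


def bootstrap_ci(values, level=0.95, resamples=2000, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(resamples):
        # Resample the examples with replacement and record the mean.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo_idx = int((1 - level) / 2 * resamples)
    hi_idx = int((1 + level) / 2 * resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical per-example locality scores (made up for illustration).
scores = [0.62, 0.71, 0.58, 0.66, 0.74, 0.69, 0.60, 0.73]
mean = sum(scores) / len(scores)
low, high = bootstrap_ci(scores)
```

Reporting `mean` together with `(low, high)` would let a reader judge whether a phrase such as "strongest concentration in TRM" reflects a difference larger than the sampling variation.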

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address the major comments point by point below, providing clarifications and indicating revisions to the manuscript.

Point-by-point responses
  1. Referee: [§4] §4 (Activation Patching Results): The headline claim that 'activation patching gives the clearest architectural fingerprint' of local writes in high-level recurrent states rests on finite-noise interventions. In recursive models such as HRM and TRM, such interventions can break the consistency required for information to propagate across segments, potentially inducing the observed locality as an artifact rather than revealing an intrinsic property. A direct comparison to zero-noise or alternative causal probes is needed to establish that the locality is not measurement-induced.

    Authors: We appreciate the referee's concern that finite-noise interventions could disrupt recursive propagation in HRM and TRM and thereby induce locality as an artifact. Finite noise was chosen deliberately: it preserves enough model functionality for the task to continue, so the causal effect of a specific state can be isolated without disrupting the computation entirely. To address the concern directly, the revised Section 4 now includes a side-by-side comparison with zero-noise activation patching. These additional results show that the local-write pattern in high-level recurrent states remains consistent under zero-noise conditions, which supports the view that the observed locality is an intrinsic property rather than a measurement artifact. We have also added an explicit discussion of the rationale for the intervention strength. revision: yes

  2. Referee: [Appendix] Appendix (Jacobian and Attention Checks): The structural Jacobian and attention analyses are presented as supporting evidence, yet their independence from the finite-noise patching family is not demonstrated. If these checks rely on the same intervention style or post-hoc interpretation, they do not constitute an independent validation of the local-to-global accumulation pattern.

    Authors: The structural Jacobian computes output sensitivity to inputs via direct differentiation without any activation replacement or noise. The attention checks examine the model's native attention weights during standard forward passes. Neither involves patching-style interventions. The revised appendix now includes a dedicated subsection explicitly contrasting the computational procedures of these analyses with finite-noise patching, demonstrating their methodological independence and showing how they provide convergent support for the local-to-global accumulation without shared assumptions or post-hoc interpretations. revision: yes
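To show what a noise-free Jacobian check can look like, here is a minimal sketch on a toy recurrent update. Two substitutions are made for the sake of a dependency-free example and should not be read as the authors' procedure: the update rule is invented, and the Jacobian is approximated with central finite differences rather than the direct differentiation the authors describe (an autodiff library would be the usual choice). The `diagonal_concentration` measure is a hypothetical reading of the "same-position diagonal concentration" mentioned in the Figure 29 caption.

```python
import math


def update(state):
    """Toy recurrent update: each cell mixes itself with its neighbours."""
    n = len(state)
    new = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0.0
        right = state[i + 1] if i < n - 1 else 0.0
        new.append(math.tanh(0.8 * state[i] + 0.1 * left + 0.1 * right))
    return new


def jacobian(fn, state, eps=1e-6):
    """Central finite-difference Jacobian J[i][j] = d fn(state)[i] / d state[j]."""
    n = len(state)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus = list(state)
        minus = list(state)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = fn(plus), fn(minus)
        for i in range(n):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def diagonal_concentration(jac):
    """Share of absolute Jacobian mass on the same-position diagonal."""
    total = sum(abs(v) for row in jac for v in row)
    diag = sum(abs(jac[i][i]) for i in range(len(jac)))
    return diag / total if total > 0 else 0.0


state = [0.1, -0.2, 0.3, 0.0, 0.2]
one_step = diagonal_concentration(jacobian(update, state))
two_step = diagonal_concentration(jacobian(lambda s: update(update(s)), state))
```

No activation is replaced and no noise is injected here, which is the sense in which such a check is procedurally distinct from patching. In this toy, composing two updates moves Jacobian mass off the diagonal, so `two_step` comes out lower than `one_step`.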

Circularity Check

0 steps flagged

No significant circularity: framework defined then applied empirically

The paper defines interaction locality as a new task-geometry-aware measurement framework for information flow in recursive models, then instantiates it using standard causal interventions (SAE ablations, finite-noise patching) plus appendix checks. These are applied to HRM/TRM on grid tasks and to MTU3D, yielding observed patterns such as local writes by high-level states and accumulation across recursive updates. No equations or derivations are shown that reduce a claimed result to the framework definition by construction, nor are predictions fitted from subsets and re-labeled as outputs. Self-citations are not load-bearing for the central empirical fingerprint, and the measurements are presented as independent probes rather than tautological restatements of the inputs. The derivation chain is therefore self-contained as an application of defined methods to model behavior.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Abstract provides limited detail on underlying assumptions; framework relies on validity of causal interventions and task geometry definitions.

axioms (1)
  • Domain assumption: Activation patching and feature ablations faithfully capture causal information flow in the tested models.
    Central to interpreting the architectural fingerprints reported.
invented entities (1)
  • Interaction locality (no independent evidence)
    Purpose: Framework for measuring local vs. cross-segment information flow in spatial reasoning.
    New concept introduced to turn the intuitive local-execution story into a measurable framework.

pith-pipeline@v0.9.0 · 5812 in / 1218 out tokens · 25865 ms · 2026-05-21T05:00:16.641274+00:00 · methodology
