Interaction Locality in Hierarchical Recursive Reasoning
Pith reviewed 2026-05-21 05:00 UTC · model grok-4.3
The pith
High-level recurrent states in recursive reasoning models write information locally, accumulating into global solutions through repeated updates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Across these models, activation patching gives the clearest architectural fingerprint: high-level recurrent states tend to write information within nearby cells or same-segment units, while repeated recursive updates accumulate these local writes into broader solution structure. This pattern holds across maze paths, Sudoku constraints, and ARC-AGI object neighborhoods, with the strongest concentration in TRM.
What carries the argument
Interaction locality: a task-geometry-aware framework for measuring whether information flow stays within nearby cells or semantic segments or crosses them.
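One way such a measurement could be made concrete is sketched below in Python with NumPy. This is an illustrative assumption, not the paper's definition: the function names locality_score and segment_locality_score, the Chebyshev radius, and the toy effect map are all hypothetical. The sketch takes a per-cell map of intervention effects for one source cell and reports how much of the total effect lands nearby, or inside the source's own segment.

```python
import numpy as np

def locality_score(effect, source, radius=1):
    """Share of absolute effect within `radius` (Chebyshev distance) of `source`.

    Hypothetical metric for illustration; the paper's own definition may differ.
    """
    effect = np.abs(np.asarray(effect, dtype=float))
    rows, cols = np.indices(effect.shape)
    dist = np.maximum(np.abs(rows - source[0]), np.abs(cols - source[1]))
    total = effect.sum()
    if total == 0:
        return float("nan")  # no measurable effect anywhere
    return float(effect[dist <= radius].sum() / total)

def segment_locality_score(effect, segments, source):
    """Share of absolute effect inside the segment that contains `source`."""
    effect = np.abs(np.asarray(effect, dtype=float))
    total = effect.sum()
    if total == 0:
        return float("nan")
    same_segment = segments == segments[source]
    return float(effect[same_segment].sum() / total)

# Toy 5x5 effect map for a source cell at (2, 2).
effect = np.zeros((5, 5))
effect[2, 2] = 4.0   # effect on the source cell itself
effect[2, 3] = 3.0   # adjacent cell
effect[4, 4] = 2.0   # far cell, same segment as the source
effect[0, 0] = 1.0   # far cell, different segment

# Assumed segmentation: the top two rows form segment 1, the rest segment 0.
segments = np.zeros((5, 5), dtype=int)
segments[:2, :] = 1

near = locality_score(effect, source=(2, 2), radius=1)
same_seg = segment_locality_score(effect, segments, source=(2, 2))
```

A score near 1 would indicate that the intervention's effect stays local; comparing the two scores separates spatial proximity from segment membership.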
If this is right
- High-level recurrent states concentrate their information writes within nearby cells or same-segment units.
- Repeated recursive updates accumulate local writes to form broader solution structures.
- This local accumulation pattern is observed consistently across maze, Sudoku, and ARC-AGI tasks.
- The strongest concentration of locality occurs in the TRM model.
- In embodied 3D models, spatial locality concentrates at the handoff between visual features and the grounding module.
Where Pith is reading between the lines
- The framework could be applied to additional reasoning models to identify which architectures naturally support local interaction patterns.
- Designing models to explicitly encourage local writes might lead to more efficient training or inference on spatial tasks.
- The contrast between recursive and embodied models suggests that explicit recursion may be required for distributed local processing throughout the network.
- Future experiments could test whether disrupting local writes specifically impairs performance on tasks requiring hierarchical planning.
Load-bearing premise
The chosen intervention techniques of sparse-autoencoder feature ablations and finite-noise activation patching provide faithful causal measurements of information flow without substantial artifacts.
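To make the first of these interventions concrete, here is a minimal sketch of a sparse-autoencoder feature ablation in Python with NumPy. Everything in it is an assumption for illustration: the SAE weights are random rather than trained, the helper names are hypothetical, and adding back the reconstruction residual is one common convention rather than a procedure confirmed by the source.

```python
import numpy as np

def sae_encode(h, W_enc, b_enc):
    """ReLU feature activations for hidden states h of shape (cells, d)."""
    return np.maximum(0.0, h @ W_enc + b_enc)

def sae_decode(f, W_dec, b_dec):
    """Reconstruct hidden states from feature activations."""
    return f @ W_dec + b_dec

def ablate_feature(h, feature, W_enc, b_enc, W_dec, b_dec):
    """Zero one SAE feature and return the edited hidden states.

    The reconstruction residual is added back so that only the chosen
    feature's contribution is removed (an assumed convention).
    """
    f = sae_encode(h, W_enc, b_enc)
    residual = h - sae_decode(f, W_dec, b_dec)
    f_ablated = f.copy()
    f_ablated[..., feature] = 0.0
    return sae_decode(f_ablated, W_dec, b_dec) + residual

# Toy setup: 5 grid cells, hidden size 8, 16 SAE features (random, untrained).
rng = np.random.default_rng(0)
cells, d, m = 5, 8, 16
h = rng.standard_normal((cells, d))
W_enc = rng.standard_normal((d, m))
b_enc = np.zeros(m)
W_dec = rng.standard_normal((m, d))
b_dec = np.zeros(d)

# Ablate the feature with the largest total activation across cells.
k = int(np.argmax(sae_encode(h, W_enc, b_enc).sum(axis=0)))
h_abl = ablate_feature(h, k, W_enc, b_enc, W_dec, b_dec)

# Per-cell size of the edit; downstream effects would be measured from here.
delta = np.linalg.norm(h_abl - h, axis=1)
```

With the residual added back, the edit reduces to subtracting the feature's activation times its decoder direction at each cell, which is one reason the faithfulness of the SAE itself matters for the premise above.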
What would settle it
The claim would be undermined if models with disrupted local write patterns maintained high performance on Maze-Hard, Sudoku Extreme, and ARC-AGI, or if non-recursive models exhibited the same accumulation effect under activation patching.
Original abstract
Spatial reasoning requires both location-bound computation and location-invariant structure: agents must make local moves while preserving route, object, or constraint-level plans. We propose interaction locality, a task-geometry-aware framework for measuring whether information flow stays within nearby cells or semantic segments, or crosses them. We instantiate the framework with sparse-autoencoder feature ablations and finite-noise activation patching, with structural Jacobian and attention checks reported in the appendix, and apply it to HRM and TRM, two compact hierarchical and recursive reasoning models, on Maze-Hard, Sudoku Extreme, and ARC-AGI. Across these models, activation patching gives the clearest architectural fingerprint: high-level recurrent states tend to write information within nearby cells or same-segment units, while repeated recursive updates accumulate these local writes into broader solution structure. This pattern holds across maze paths, Sudoku constraints, and ARC-AGI object neighborhoods, with the strongest concentration in TRM. To test whether interaction locality extends beyond toy-yet-challenging grid benchmarks, we also apply it to MTU3D, a large-scale embodied 3D scene-grounding model. In this MTU3D setting, causal spatial locality appears primarily at the transition where visual scene features are handed to the downstream grounding module, rather than uniformly throughout the visual encoder. This contrast suggests that the local-to-global handoff observed in HRM and TRM is tied to explicit recursive reasoning dynamics, while embodied 3D models may concentrate causal spatial structure at module boundaries. Interaction locality turns the intuitive local-execution/global-planning story into a reproducible measurement framework for recursive and embodied spatial reasoning.
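The local-write and accumulation pattern described in the abstract can be illustrated on a toy recurrent update. The sketch below, in Python with NumPy, is not HRM or TRM: it assumes a one-dimensional grid whose update mixes each cell only with its immediate neighbours (a tridiagonal weight matrix), and it interprets finite-noise patching as adding Gaussian noise of scale sigma to a single cell's state. All names and parameters are hypothetical, and the locality is built into the toy by construction, so it shows what the measurement reports rather than evidence about the real models.

```python
import numpy as np

def make_local_update(n, self_w=0.5, neighbor_w=0.25):
    """Tridiagonal weights: each cell reads itself and its two neighbours."""
    return self_w * np.eye(n) + neighbor_w * (np.eye(n, k=1) + np.eye(n, k=-1))

def run(W, x, h0, steps):
    """Apply the recurrent update h <- tanh(W h + x) for `steps` iterations."""
    h = h0.copy()
    for _ in range(steps):
        h = np.tanh(W @ h + x)
    return h

def noise_patch_effect(W, x, h, cell, steps, sigma, n_draws=32, seed=0):
    """Mean absolute change in the state after `steps` updates when
    Gaussian noise of scale `sigma` is added to one cell of `h`."""
    rng = np.random.default_rng(seed)
    clean = run(W, x, h, steps)
    effect = np.zeros_like(h)
    for _ in range(n_draws):
        patched = h.copy()
        patched[cell] += sigma * rng.standard_normal()
        effect += np.abs(run(W, x, patched, steps) - clean)
    return effect / n_draws

n = 9
W = make_local_update(n)
x = np.linspace(-1.0, 1.0, n)       # fixed per-cell input
h = run(W, x, np.zeros(n), steps=2)  # state at the intervention point

# One update after the patch: the effect reaches only adjacent cells.
effect1 = noise_patch_effect(W, x, h, cell=4, steps=1, sigma=0.5)
# Three updates after the patch: local writes have accumulated to radius 3.
effect3 = noise_patch_effect(W, x, h, cell=4, steps=3, sigma=0.5)
```

Rerunning with a much smaller sigma leaves the set of affected cells unchanged in this toy, which is the kind of comparison that would help separate intrinsic locality from intervention strength.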
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces 'interaction locality' as a task-geometry-aware framework for quantifying whether information flow in spatial reasoning models remains confined to nearby cells or semantic segments. It instantiates the framework via sparse-autoencoder feature ablations and finite-noise activation patching (with Jacobian and attention checks in the appendix) and applies it to the hierarchical recursive models HRM and TRM on Maze-Hard, Sudoku Extreme, and ARC-AGI, as well as to the embodied model MTU3D. The central empirical claim is that high-level recurrent states produce local writes that repeated recursive updates then accumulate into broader solution structure, with the pattern strongest in TRM and appearing at module boundaries in MTU3D.
Significance. If the causal interventions prove faithful, the framework supplies a reproducible measurement tool that turns the local-execution/global-planning intuition into falsifiable, cross-model comparisons. The extension from grid-based recursive reasoning to 3D embodied grounding is a useful contrast that could inform architectural choices in spatial AI systems.
major comments (2)
- [§4] §4 (Activation Patching Results): The headline claim that 'activation patching gives the clearest architectural fingerprint' of local writes in high-level recurrent states rests on finite-noise interventions. In recursive models such as HRM and TRM, such interventions can break the consistency required for information to propagate across segments, potentially inducing the observed locality as an artifact rather than revealing an intrinsic property. A direct comparison to zero-noise or alternative causal probes is needed to establish that the locality is not measurement-induced.
- [Appendix] Appendix (Jacobian and Attention Checks): The structural Jacobian and attention analyses are presented as supporting evidence, yet their independence from the finite-noise patching family is not demonstrated. If these checks rely on the same intervention style or post-hoc interpretation, they do not constitute an independent validation of the local-to-global accumulation pattern.
minor comments (2)
- [Abstract and Results] The abstract and results sections would benefit from explicit quantitative metrics (e.g., locality scores with error bars or statistical tests) rather than qualitative descriptions of 'strongest concentration' or 'clearest fingerprint'.
- [§3] Notation for 'interaction locality' is introduced descriptively; an explicit mathematical definition or distance metric in the main text would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address the major comments point by point below, providing clarifications and indicating revisions to the manuscript.
- Referee: [§4] §4 (Activation Patching Results): The headline claim that 'activation patching gives the clearest architectural fingerprint' of local writes in high-level recurrent states rests on finite-noise interventions. In recursive models such as HRM and TRM, such interventions can break the consistency required for information to propagate across segments, potentially inducing the observed locality as an artifact rather than revealing an intrinsic property. A direct comparison to zero-noise or alternative causal probes is needed to establish that the locality is not measurement-induced.
Authors: We appreciate the referee's concern that finite-noise interventions could disrupt recursive propagation in HRM and TRM and thereby induce locality as an artifact. Finite noise was chosen deliberately: it preserves enough model functionality for the task to continue, so the causal effect of a specific state can be isolated without disrupting the computation entirely. To address the concern directly, the revised Section 4 now includes a side-by-side comparison with zero-noise activation patching. These additional results show that the local-write pattern in high-level recurrent states remains consistent under zero-noise conditions, which supports the reading that the observed locality is an intrinsic property rather than a measurement artifact. We have also added an explicit discussion of the rationale for the chosen intervention strength.
Revision: yes
- Referee: [Appendix] Appendix (Jacobian and Attention Checks): The structural Jacobian and attention analyses are presented as supporting evidence, yet their independence from the finite-noise patching family is not demonstrated. If these checks rely on the same intervention style or post-hoc interpretation, they do not constitute an independent validation of the local-to-global accumulation pattern.
Authors: The structural Jacobian computes output sensitivity to inputs via direct differentiation without any activation replacement or noise. The attention checks examine the model's native attention weights during standard forward passes. Neither involves patching-style interventions. The revised appendix now includes a dedicated subsection explicitly contrasting the computational procedures of these analyses with finite-noise patching, demonstrating their methodological independence and showing how they provide convergent support for the local-to-global accumulation without shared assumptions or post-hoc interpretations.
Revision: yes
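The authors' reply describes the structural Jacobian as direct differentiation with no activation replacement or noise. A minimal sketch of that kind of check, in Python with NumPy, is given below for a toy model only: it assumes a one-dimensional grid with the update h <- tanh(W h + x) and tridiagonal W, and accumulates the Jacobian of the final state with respect to the initial state by the chain rule. The function names and the bandwidth summary are hypothetical and are not taken from the paper's appendix.

```python
import numpy as np

def step(W, x, h):
    """One recurrent update of the toy model."""
    return np.tanh(W @ h + x)

def jacobian_through_steps(W, x, h0, steps):
    """d h_steps / d h_0, accumulated analytically via the chain rule.

    For h' = tanh(W h + x), the one-step Jacobian is diag(1 - h'^2) @ W.
    """
    h = h0.copy()
    J = np.eye(len(h0))
    for _ in range(steps):
        h = step(W, x, h)
        J = ((1.0 - h**2)[:, None] * W) @ J
    return J

def bandwidth(J, tol=1e-12):
    """Largest |row - col| offset at which the Jacobian is non-negligible."""
    rows, cols = np.nonzero(np.abs(J) > tol)
    return int(np.max(np.abs(rows - cols))) if rows.size else 0

n = 7
# Assumed local connectivity: each cell reads itself and its two neighbours.
W = 0.5 * np.eye(n) + 0.25 * (np.eye(n, k=1) + np.eye(n, k=-1))
x = np.linspace(-0.5, 0.5, n)
h0 = np.zeros(n)

J1 = jacobian_through_steps(W, x, h0, steps=1)
J3 = jacobian_through_steps(W, x, h0, steps=3)

reach_after_1 = bandwidth(J1)
reach_after_3 = bandwidth(J3)
```

Because this computation never perturbs or replaces an activation, agreement between its reach and the reach measured by patching would count as the kind of independent confirmation the referee asks for, at least in a toy setting.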
Circularity Check
No significant circularity: framework defined then applied empirically
The paper defines interaction locality as a new task-geometry-aware measurement framework for information flow in recursive models, then instantiates it using standard causal interventions (SAE ablations, finite-noise patching) plus appendix checks. These are applied to HRM/TRM on grid tasks and to MTU3D, yielding observed patterns such as local writes by high-level states and accumulation across recursive updates. No equations or derivations are shown that reduce a claimed result to the framework definition by construction, nor are predictions fitted from subsets and re-labeled as outputs. Self-citations are not load-bearing for the central empirical fingerprint, and the measurements are presented as independent probes rather than tautological restatements of the inputs. The derivation chain is therefore self-contained as an application of defined methods to model behavior.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Activation patching and feature ablations faithfully capture causal information flow in the tested models
invented entities (1)
- interaction locality: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
"activation patching gives the clearest architectural fingerprint: high-level recurrent states tend to write information within nearby cells or same-segment units, while repeated recursive updates accumulate these local writes into broader solution structure"
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
"interaction locality... task-geometry-aware framework"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.