Loom: A Scalable Analytical Neural Computer Architecture
Pith reviewed 2026-05-10 16:56 UTC · model grok-4.3
The pith
A transformer with analytically derived weights implements a full 22-opcode computer that runs any compiled C program when looped on a state tensor.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the complete semantics of a 22-opcode instruction set can be realized exactly by the weight matrices of an 8-layer transformer, so that iterative application of the model to a state tensor X in R^{d x n} produces the same sequence of state updates that a conventional CPU would perform on the same program.
What carries the argument
The analytically derived weight matrices of the fixed 8-layer transformer, which encode the 22-opcode instruction set and are applied iteratively to advance the program counter and update the state tensor.
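To make the mechanism concrete, here is a minimal sketch of the implied execution loop, assuming a hypothetical `loom_step` forward pass and a designated program-counter slot in X (the names and layout are illustrative, not taken from the paper or code):

```python
def run_loom(loom_step, X, pc_index, max_steps=100_000):
    """Loop a fixed-weight forward pass until the program counter reaches zero.

    loom_step : callable mapping a state tensor of shape (d, n) to (d, n);
                one call executes exactly one instruction.
    X         : initial state tensor holding the program, registers, and memory.
    pc_index  : (row, col) of the slot assumed to hold the program counter.
    """
    for step in range(max_steps):
        if X[pc_index] == 0:      # halting convention: PC reaches zero
            return X, step
        X = loom_step(X)          # same fixed weights every step
    raise RuntimeError("program did not halt within max_steps")
```

The fixed per-step cost claim follows directly from this shape: every iteration is one pass through the same 8 layers, regardless of how many steps have already run.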
If this is right
- Execution cost per instruction is constant and independent of program length or history.
- The same fixed weights can run any program that fits in the state tensor, because programs live in the data rather than in the model.
- Increasing the state-tensor dimensions d and n scales the architecture while leaving the weight matrices unchanged.
- Compact configurations (smaller d and n) remain sufficient for non-trivial tasks such as a 9x9 Sudoku solver using only 284 instructions.
Where Pith is reading between the lines
- If the approach extends to larger instruction sets, transformers could serve as exact simulators for general-purpose computation without requiring learned control logic.
- Fixed analytical weights open the possibility of hybrid systems in which a neural model performs both learned inference and deterministic algorithmic steps in the same forward pass.
- Because cost is independent of program length, the architecture could be tested for long-running computations where conventional neural execution would become impractical.
Load-bearing premise
The analytically derived weights correctly realize every opcode's semantics for arbitrary programs without overflow, precision loss, or unhandled edge cases inside the fixed-size state tensor.
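For concreteness, here is a toy illustration (not Loom's actual construction, which lives in the released code) of what realizing one opcode's semantics with analytically set weights means: an ADD over a value-encoded register file is exactly a fixed linear map, and the premise is that all 22 opcodes admit equally exact constructions inside the 8 layers.

```python
import numpy as np

def add_opcode_matrix(num_regs, dst, a, b):
    """Analytically built linear map W with (W @ regs)[dst] == regs[a] + regs[b]
    and every other register passed through unchanged."""
    W = np.eye(num_regs)
    W[dst] = 0.0          # overwrite the destination row
    W[dst, a] += 1.0      # ...with the sum of the two source registers
    W[dst, b] += 1.0
    return W

regs = np.array([3.0, 4.0, 0.0, 7.0])                   # r0..r3
W = add_opcode_matrix(4, dst=2, a=0, b=1)
assert np.array_equal(W @ regs, [3.0, 4.0, 7.0, 7.0])   # r2 = r0 + r1, others untouched
```

The load-bearing part is that memory addressing, comparisons, and control flow must come out equally exact under finite-precision arithmetic, which is what the derivation has to establish.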
What would settle it
Compile a C program containing nested loops, function calls, and array operations to Loom's instruction set, run it through the model until the program counter reaches zero, and check whether the final state tensor matches the output of the same program on a standard C compiler.
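A sketch of that experiment as a differential test; `compile_to_loom`, `run_loom`, and `decode_output` stand in for a hypothetical Loom toolchain and are passed in as hooks rather than assumed to exist under these names:

```python
import subprocess

def differential_test(c_source, compile_to_loom, run_loom, decode_output,
                      loom_model, reference_binary):
    """Run one C program both ways and compare observable outputs.

    compile_to_loom : C source -> initial state tensor with the program loaded
    run_loom        : (model, state tensor) -> (final state tensor, step count),
                      looping until the program counter reaches zero
    decode_output   : final state tensor -> printable result
    """
    X0 = compile_to_loom(c_source)
    X_final, steps = run_loom(loom_model, X0)
    loom_out = decode_output(X_final)

    # Reference side: the same program built with a standard C compiler.
    ref = subprocess.run([reference_binary], capture_output=True, text=True)

    assert loom_out == ref.stdout, f"divergence after {steps} Loom steps"
    return steps
```

To address the referee's concerns below, the test suite would also need programs that exercise overflow, negative values, and out-of-bounds accesses, not only well-behaved inputs.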
Original abstract
We present Loom, a computer architecture that executes programs compiled from C inside a looped transformer whose weights are derived analytically. The architecture implements a 22-opcode instruction set in 8 transformer layers. Each forward pass executes one instruction; the model is applied iteratively until the program counter reaches zero. The full machine state resides in a single tensor $X \in \mathbb{R}^{d \times n}$ of fixed size, and every step has fixed cost for fixed $d$ and $n$, independent of program length or execution history. The default configuration uses $d = 155$ and $n = 1024$, yielding 4.7 million parameters and 928 instruction slots. A compact configuration at $d = 146$ and $n = 512$ suffices for a 9$\times$9 Sudoku solver (284 instructions). The weights are program-independent: programs live in the state tensor, and the same fixed-weight model executes any compiled program. We make Loom source code publicly available at https://github.com/mkturkcan/Loom.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Loom, a computer architecture that executes programs compiled from C inside a looped transformer whose weights are derived analytically. It implements a 22-opcode instruction set across 8 transformer layers, with each forward pass executing one instruction on a fixed-size state tensor X ∈ ℝ^{d×n} (default d=155, n=1024). Execution iterates until the program counter reaches zero, with fixed per-step cost independent of program length; weights are program-independent and the same model runs any compiled program. A compact configuration (d=146, n=512) is shown for a 9×9 Sudoku solver using 284 instructions. Source code is released publicly.
Significance. If the analytical weight construction is correct and complete, the result would be significant: it offers a fixed-parameter, fixed-cost neural architecture that exactly emulates a symbolic ISA without training or approximation, with programs residing entirely in the state tensor. Public code availability supports reproducibility and allows direct verification of the claimed analytical derivations.
major comments (2)
- [Weight derivation and ISA implementation sections] The central claim that the 8-layer transformer with analytically derived weights exactly implements the full 22-opcode ISA (arithmetic, memory, control flow, PC updates) on the fixed-size state tensor X is load-bearing, yet the manuscript provides no explicit derivation, matrix constructions, attention patterns, or error analysis showing how each opcode is realized without precision loss, overflow, or unhandled cases (e.g., negative values, zero-division). This must be supplied, as any deviation would break exact semantics for arbitrary C programs despite the fixed-cost guarantee.
- [Sudoku example and experimental validation] The Sudoku solver demonstration (compact d=146, n=512 configuration, 284 instructions) is presented as validation, but lacks a full execution trace, comparison against a reference interpreter, or coverage of edge cases in the state tensor slots; this is insufficient to confirm correctness of the analytical construction for the complete ISA.
minor comments (2)
- [State tensor definition] Clarify the exact allocation of the d-dimensional state slots to registers, memory, flags, and program counter, including how overflow or out-of-bounds accesses are handled analytically (a purely hypothetical layout is sketched after this list for concreteness).
- [Abstract and architecture overview] The abstract states 'the weights are program-independent' and 'programs live in the state tensor'; add a short table or diagram mapping the 22 opcodes to the 8 layers to improve readability.
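To make the first minor comment concrete, the kind of slot-allocation table being requested could look like the following purely hypothetical layout (invented here for illustration; the actual allocation is in the released code, and only the 928-instruction-slot count is taken from the abstract):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StateLayout:
    """Hypothetical assignment of the n = 1024 columns of X to architectural roles.
    Each column is one d-dimensional slot; none of these ranges come from the paper
    except the 928 instruction slots quoted in the abstract."""
    pc_col: int = 0                            # program counter
    register_cols: range = range(1, 17)        # 16 general-purpose registers
    flag_cols: range = range(17, 20)           # comparison / status flags
    memory_cols: range = range(20, 96)         # data memory for loads and stores
    instruction_cols: range = range(96, 1024)  # 928 instruction slots
```

Whatever the real layout, the referee's point stands: out-of-range addresses and overflowing values need a defined, analytically handled behavior within these fixed ranges.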
Simulated Author's Rebuttal
We thank the referee for their careful reading and for recognizing the potential significance of Loom if the analytical claims hold. We address each major comment below and will revise the manuscript to strengthen the presentation of the derivations and validation.
Point-by-point responses
Referee: [Weight derivation and ISA implementation sections] The central claim that the 8-layer transformer with analytically derived weights exactly implements the full 22-opcode ISA (arithmetic, memory, control flow, PC updates) on the fixed-size state tensor X is load-bearing, yet the manuscript provides no explicit derivation, matrix constructions, attention patterns, or error analysis showing how each opcode is realized without precision loss, overflow, or unhandled cases (e.g., negative values, zero-division). This must be supplied, as any deviation would break exact semantics for arbitrary C programs despite the fixed-cost guarantee.
Authors: We agree that explicit derivations are necessary to fully substantiate the exact ISA implementation. The manuscript describes the high-level analytical construction of the 8-layer transformer and the fixed state tensor X, with the complete weight matrices and opcode mappings implemented in the publicly released code. To address the concern directly, we will add a dedicated appendix in the revised manuscript containing the explicit matrix constructions, attention patterns, and per-opcode derivations for all 22 instructions, along with an error analysis addressing floating-point precision, potential overflow, and edge cases including negative values and division by zero. This will make the exact semantics verifiable from the paper itself. revision: yes
Referee: [Sudoku example and experimental validation] The Sudoku solver demonstration (compact d=146, n=512 configuration, 284 instructions) is presented as validation, but lacks a full execution trace, comparison against a reference interpreter, or coverage of edge cases in the state tensor slots; this is insufficient to confirm correctness of the analytical construction for the complete ISA.
Authors: We acknowledge that the Sudoku demonstration would benefit from more detailed validation to confirm the analytical construction. The example illustrates that a compact configuration suffices for a non-trivial program, and the released code permits direct execution and inspection. In the revision we will augment the experimental section with a full execution trace (for the Sudoku solver or a reduced test case), side-by-side comparisons against a reference interpreter for selected instructions, and explicit discussion of state-tensor slot allocation and edge-case handling. These additions will provide stronger empirical support without altering the core claims. revision: yes
Circularity Check
No significant circularity; analytical derivation is self-contained
Full rationale
The paper derives the 8-layer transformer weights analytically from the semantics of the 22-opcode instruction set, with programs residing in the fixed-size state tensor X and the same fixed weights executing any compiled C program. No equations or claims reduce by construction to their own inputs (e.g., no fitted parameters renamed as predictions, no self-definitional loops where the result defines the premise, and no load-bearing self-citations for uniqueness theorems or ansatzes). The central claim is an external assertion of exact emulation correctness for arbitrary programs, which stands or falls on verification outside the derivation itself rather than tautological reduction. This is the expected non-circular outcome for an analytical construction.
Axiom & Free-Parameter Ledger
free parameters (2)
- state dimension d
- state width n
axioms (1)
- domain assumption: An 8-layer transformer with analytically set weights can implement the full semantics of a 22-opcode instruction set without training.
invented entities (1)
- Fixed-size state tensor X in R^{d x n} (no independent evidence)
Reference graph
Works this paper leans on
- [1] Artur Back de Luca and Kimon Fountoulakis. Simulation of graph algorithms with looped transformers. In Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine Learning Research, pages 2319–2363. PMLR, 2024.
- [2] Artur Back de Luca, George Giapitzakis, and Kimon Fountoulakis. Learning to add, multiply, and execute algorithmic instructions exactly with neural networks. In Advances in Neural Information Processing Systems, 2025. NeurIPS 2025 poster.
- [3] Angeliki Giannou, Shashank Rajput, Jy-Yong Sohn, Kangwook Lee, Jason D. Lee, and Dimitris Papailiopoulos. Looped transformers as programmable computers. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 11398–11442. PMLR, 2023.
- [4] Kaiying Hou, David Brandfonbrener, Sham M. Kakade, Samy Jelassi, and Eran Malach. Universal length generalization with Turing programs. In Proceedings of the 42nd International Conference on Machine Learning, volume 267 of Proceedings of Machine Learning Research, pages 23873–23893. PMLR, 2025.
- [5] Zekai Huang, Yingyu Liang, Zhenmei Shi, Zhao Song, and Zhen Zhuang. Neural algorithmic reasoning for hypergraphs with looped transformers. arXiv preprint arXiv:2501.10688, 2025.
- [6] Hongjian Jiang, Michael Hahn, Georg Zetzsche, and Anthony Widjaja Lin. Softmax transformers are Turing-complete. In International Conference on Learning Representations, 2026. ICLR 2026 oral.
- [7] Emanuele La Malfa, Christoph Weinhuber, Orazio Torre, Fangru Lin, X. Angelo Huang, Samuele Marro, Anthony Cohn, Nigel Shadbolt, and Michael Wooldridge. Code simulation as a proxy for high-order tasks in large language models. arXiv preprint arXiv:2502.03568, 2025.
- [8] Qian Li and Yuyi Wang. Constant bit-size transformers are Turing complete. In Advances in Neural Information Processing Systems, 2025. NeurIPS 2025 poster.
- [9] Qian Li and Yuyi Wang. Efficient Turing machine simulation with transformers. In International Conference on Learning Representations, 2026. ICLR 2026 poster.
- [10] Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song, and Yufa Zhou. Looped ReLU MLPs may be all you need as practical programmable computers. In Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, volume 258 of Proceedings of Machine Learning Research, pages 2647–2655. PMLR, 2025.
- [11] David Lindner, János Kramár, Sebastian Farquhar, Matthew Rahtz, Thomas McGrath, and Vladimir Mikulik. Tracr: Compiled transformers as a laboratory for interpretability. In Advances in Neural Information Processing Systems, 2023. NeurIPS 2023.
- [12] Lucas Saldyt and Subbarao Kambhampati. Algorithmic language models with neurally compiled libraries. arXiv preprint arXiv:2407.04899, 2024.
- [13] Dale Schuurmans, Hanjun Dai, and Francesco Zanini. Autoregressive large language models are computationally universal. arXiv preprint arXiv:2410.03170, 2024.
- [14] Peter Shaw, James Cohan, Jacob Eisenstein, Kenton Lee, Jonathan Berant, and Kristina N. Toutanova. ALTA: Compiler-based analysis of transformers. Transactions on Machine Learning Research, 2025.
- [15] Christos Tzamos. Can LLMs be computers? Percepta blog post, 2026. Published March 11, 2026.
- [16] Gail Weiss, Yoav Goldberg, and Eran Yahav. Thinking like transformers. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 11080–11090. PMLR, 2021.
- [17] Kevin Xu and Issei Sato. On expressive power of looped transformers: Theoretical analysis and enhancement via timestep encoding. In Proceedings of the 42nd International Conference on Machine Learning, volume 267 of Proceedings of Machine Learning Research, pages 69613–69646. PMLR, 2025.
- [18] Xiyu Zhai, Runlong Zhou, Liao Zhang, and Simon Shaolei Du. Transformers are efficient compilers, provably. In Conference on Language Modeling, 2025. COLM 2025.
- [19] Yifan Zhang, Wei Bi, Kechi Zhang, Dongming Jin, Jie Fu, and Zhi Jin. Weights to code: Extracting interpretable algorithms from the discrete transformer. arXiv preprint arXiv:2601.05770, 2026.