Training-Free Looped Transformers
1 Pith paper cites this work. Polarity classification is still being indexed.
abstract
We introduce training-free looped transformers, in which a lightweight inference-time wrapper loops a contiguous mid-stack block of layers of a frozen checkpoint without additional fine-tuning, continued training, or architectural changes. Unlike prior looped transformer methods that train with the looped structure end-to-end, we retrofit recurrence onto pretrained models at test time. We show that naive block reapplication usually degrades performance, highlighting the importance of the loop application strategy. Motivated by viewing a pre-norm transformer block as a forward Euler step on an ODE, we instead treat looping as a refinement of the same approximation, replacing one large update with smaller damped sub-steps. Across seven dense, sparse MoE, and MLA+MoE model families, our method improves Qwen3-4B-Instruct by +2.64 pp on MMLU-Pro, Qwen3-30B-A3B-Instruct by +1.14 pp on CommonsenseQA, and Moonlight-16B-A3B-Instruct by +1.20 pp on OpenBookQA.
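The contrast the abstract draws between naive block reapplication and damped sub-steps can be illustrated with a toy sketch. The abstract does not give the damping schedule, so the code assumes the simplest one: K equal sub-steps of size 1/K. The names `residual_step`, `naive_loop`, `damped_loop`, and `toy_block` are hypothetical, and `toy_block` is a scalar stand-in for a block's residual branch, not a real transformer layer.

```python
from typing import Callable, List

Vector = List[float]
Branch = Callable[[Vector], Vector]


def residual_step(x: Vector, f: Branch, step: float = 1.0) -> Vector:
    """One forward Euler step x + step * f(x); step=1.0 is the usual residual update."""
    fx = f(x)
    return [xi + step * fi for xi, fi in zip(x, fx)]


def naive_loop(x: Vector, f: Branch, num_loops: int) -> Vector:
    """Reapply the block at full step size num_loops times."""
    for _ in range(num_loops):
        x = residual_step(x, f, 1.0)
    return x


def damped_loop(x: Vector, f: Branch, num_loops: int) -> Vector:
    """Replace one full step with num_loops sub-steps of size 1/num_loops.

    The equal-size schedule is an assumption made for illustration only.
    """
    step = 1.0 / num_loops
    for _ in range(num_loops):
        x = residual_step(x, f, step)
    return x


def toy_block(x: Vector) -> Vector:
    """Hypothetical residual branch f(x) = -0.5 * x (not a real layer)."""
    return [-0.5 * xi for xi in x]


x0 = [1.0, 2.0]
single = residual_step(x0, toy_block)  # one ordinary block application
naive = naive_loop(x0, toy_block, 4)   # integrates four full steps
damped = damped_loop(x0, toy_block, 4) # integrates one unit of "time" in four sub-steps
```

In this toy setting the naive loop advances the underlying ODE four times as far as the original block did, so its output drifts away from what downstream layers expect, whereas the damped loop covers the same interval as the single step with a finer discretization.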
fields: cs.AI (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
AGI Maze as a Benchmark Framework for World-Modeling Agents
AGI Maze supplies a family of grid maze environments with a clean API to benchmark agents on learning and using world state representations rather than local pattern matching, with preliminary tests showing vanilla LLMs fail even on small instances.