No Plan, Yet Human: A Reactive Robotics Model Predicts Human Planning Failures on a Clinical Task

Antonia Köngeter; Michael Migacev; Oliver Brock; Vito Mengers

arxiv: 2605.16514 · v1 · pith:M65TCL3Dnew · submitted 2026-05-15 · 💻 cs.RO · cs.AI


Pith reviewed 2026-05-20 17:37 UTC · model grok-4.3

classification 💻 cs.RO cs.AI

keywords: Tower of London, reactive model, planning capacity, Parkinson's disease, cognitive test, robotics, sequential planning, gradient descent


The pith

A reactive robotics model reproduces the pattern of human planning failures on the Tower of London test better than a planning baseline when planning capacity is reduced.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies a reactive gradient-descent framework originally built for robot manipulation to the Tower of London test, a clinical measure of planning ability. Without lookahead or any built-in knowledge of cognition, the model reproduces the fine-grained ordering of which of the 24 problems humans find hard, and does so better than structural task parameters. It also generalizes to held-out problems in a leave-two-out evaluation. The central result is a dissociation: the reactive model fits groups with reduced planning capacity, such as certain clinical populations, better than a planning baseline, while the planning baseline better captures healthy controls. The authors take this pattern to suggest that reduced planning capacity shifts behavior toward reactive strategies.

Core claim

Without any lookahead planning or knowledge of human cognition, AICON reproduces the fine-grained human difficulty ordering across 24 problems better than structural task parameters and generalizes to held-out problems in a leave-two-out evaluation. Crucially, AICON outperforms a planning baseline for groups with reduced planning capacity while the planning baseline better captures healthy controls. This dissociation was predicted by the original AICON paper, which noted that the model's failure modes resemble those of Parkinson's patients who struggle with goal hierarchies but not move counts. This suggests that as planning capacity is reduced, human behavior shifts toward the reactive mode AICON models.

What carries the argument

AICON, a reactive gradient-descent framework that solves sequential manipulation tasks through continuous real-time cost minimization without explicit search or lookahead.
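To make the contrast with search concrete, here is a minimal sketch of a lookahead-free solver on a Tower of London board. It is not AICON: AICON descends continuous gradients through its estimators and interconnections, whereas this toy picks, at each step, the single legal move that most lowers a hand-written mismatch cost. The peg capacities `(3, 2, 1)`, the bead labels, the `cost` function, and all function names are assumptions made for illustration.

```python
# Toy one-step greedy descent on a Tower of London board (NOT the AICON model).
CAPACITY = (3, 2, 1)  # assumed peg heights; the paper's board may differ


def legal_moves(state):
    """Yield every state reachable by moving one top bead to another peg."""
    for src, peg in enumerate(state):
        if not peg:
            continue
        for dst in range(len(state)):
            if dst != src and len(state[dst]) < CAPACITY[dst]:
                nxt = [list(p) for p in state]
                nxt[dst].append(nxt[src].pop())
                yield tuple(tuple(p) for p in nxt)


def positions(state):
    """Map each bead to its (peg index, height) position."""
    return {b: (p, h) for p, peg in enumerate(state) for h, b in enumerate(peg)}


def cost(state, goal):
    """Hypothetical cost: number of beads not at their goal position."""
    cur, tgt = positions(state), positions(goal)
    return sum(cur[b] != tgt[b] for b in tgt)


def reactive_solve(start, goal, max_steps=20):
    """Follow the steepest one-step cost decrease; stop when none exists."""
    state, path = start, [start]
    for _ in range(max_steps):
        if state == goal:
            return path, True
        best = min(legal_moves(state), key=lambda s: cost(s, goal))
        if cost(best, goal) >= cost(state, goal):
            return path, False  # no move lowers the cost: stuck
        state = best
        path.append(state)
    return path, state == goal


# One-move problem: the greedy step reaches the goal directly.
print(reactive_solve((("R", "G", "B"), (), ()), (("R", "G"), ("B",), ())))
# Problem that needs an intermediate move: the greedy descent stalls.
print(reactive_solve((("R", "G"), ("B",), ()), (("R", "B"), ("G",), ())))
```

On the second problem every legal move leaves the cost unchanged, so the greedy descent stalls even though a three-move solution exists via an intermediate placement. That is the kind of failure a purely reactive strategy can produce when a problem requires a detour.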

If this is right

The reactive model can predict the specific problems and error types that arise for clinical groups on planning assessments.
Human performance on sequential tasks transitions from planning-based to reactive as capacity decreases.
The same model abstraction accounts for both robotic control and human behavior on planning tests.
Leave-two-out generalization shows the model identifies general sources of task difficulty rather than fitting only the training set.
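The leave-two-out evaluation mentioned above can be pictured as holding out every pair of problems in turn and fitting on the rest. The sketch below only enumerates such splits; whether the paper evaluates all pairs, and how it fits the model on the remaining problems, is not stated here, so the exhaustive enumeration and the integer problem IDs are assumptions.

```python
from itertools import combinations


def leave_two_out_splits(problem_ids):
    """Yield (train, held_out) for every unordered pair of held-out problems."""
    ids = list(problem_ids)
    for held_out in combinations(ids, 2):
        train = [p for p in ids if p not in held_out]
        yield train, list(held_out)


# 24 problems, labeled 1..24 purely for illustration.
splits = list(leave_two_out_splits(range(1, 25)))
print(len(splits))  # 276 = 24 * 23 / 2
```

Each split leaves 22 problems for fitting and 2 for testing, which matches the pairwise easy/hard comparisons described in the figure captions.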

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar reactive models might predict performance drops in other sequential decisions under fatigue, stress, or impairment.
Interventions that support planning capacity could be evaluated by whether they increase the relative fit of planning models over reactive ones.
The pattern may appear in additional biological systems where reactive control serves as a default under resource limits.

Load-bearing premise

The assumption that superior fit by the reactive model to reduced-capacity groups means humans are actually using reactive behavior rather than some other form of impaired planning.

What would settle it

Collect error patterns from a new group of participants with reduced planning capacity on additional Tower of London problems and check whether those patterns align more closely with the reactive model's predictions than with the planning baseline's predictions.

Figures

Figures reproduced from arXiv: 2605.16514 by Antonia Köngeter, Michael Migacev, Oliver Brock, Vito Mengers.

**Figure 1.** AICON model for the Tower of London test. Board state and action are sensors encoding the current world state (left). Movable and Free are recursive estimators tracking which beads can be picked up and which fields can receive a bead, respectively. Exposed Beads and Supported Empty Fields are active interconnections computing these quantities from the current state. Legal Move Constraint is an active inter…

**Figure 2.** AICON reproduces the fine-grained human difficulty ordering and outperforms planning baselines for groups with reduced planning capacity. Kendall’s τ between model-predicted and human difficulty ordering (success rate: a; additional moves: b) is shown for AICON on fitting problems (Train) and held-out problems (Test), alongside bidirectional BFS, unidirectional BFS, and optimal moves as baselines on the T…

**Figure 3.** Within-difficulty generalization reveals where AICON’s reactive mechanism has an advantage. Held-out τ is shown separately for easy vs. easy (left), easy vs. hard (middle), and hard vs. hard (right) splits, for success rate (top row) and additional moves (bottom row). Easy vs. hard generalization (b, e) is near-perfect for all models: distinguishing easy from hard problems is trivial. For easy vs. easy pai…
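The captions above score each model with Kendall's τ between predicted and observed difficulty orderings. As a reminder of what that statistic measures, the sketch below implements the tau-a variant, which counts concordant minus discordant pairs and divides by the total number of pairs. Which variant the paper uses, and how it handles ties, is not stated here, so tau-a is an assumption; the score lists are made-up placeholders, not data from the paper.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # tied pairs (sign == 0) count toward neither total
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical difficulty scores for four problems (not from the paper).
human = [0.9, 0.7, 0.4, 0.2]
model = [0.8, 0.5, 0.6, 0.1]
print(kendall_tau_a(human, model))
```

In this placeholder example one of the six pairs is discordant, so τ = 4/6, or about 0.67. A value of 1 would mean the model ranks every pair of problems the same way the human data does.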

Original abstract

Understanding why some sequential planning problems are harder than others requires models that go beyond average performance. They should capture the specific pattern of which problems are hard, and ideally fail in the same way people do when planning capacity is reduced. We apply AICON, a reactive gradient-descent framework developed for robotic manipulation, to the Tower of London test, a cognitive test used to assess planning in Parkinson's disease, mild cognitive impairment, and stroke. Without any lookahead planning or knowledge of human cognition, AICON reproduces the fine-grained human difficulty ordering across 24 problems better than structural task parameters and generalizes to held-out problems in a leave-two-out evaluation. Crucially, AICON outperforms a planning baseline for groups with reduced planning capacity while the planning baseline better captures healthy controls. This dissociation was predicted by the original AICON paper, which noted that the model's failure modes resemble those of Parkinson's patients who struggle with goal hierarchies but not move counts. This suggests that as planning capacity is reduced, human behavior shifts toward the reactive mode AICON models. The finding extends a broader pattern: AICON, originally built for robotics, now captures aspects of biological behavior across perception, eye movements, and sequential planning, suggesting its core abstraction reflects something real about how biological systems are organized.
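The planning baseline referred to in the abstract can be illustrated with a plain breadth-first search over board states. This is a sketch under assumptions, not the authors' baseline: the figure captions mention unidirectional and bidirectional BFS, but their implementation details are not given here. The peg capacities `(3, 2, 1)`, the tuple-based state encoding, and the function names are hypothetical.

```python
from collections import deque

CAPACITY = (3, 2, 1)  # assumed peg heights; the paper's board may differ


def legal_moves(state):
    """Yield every state reachable by moving one top bead to another peg."""
    for src, peg in enumerate(state):
        if not peg:
            continue
        for dst in range(len(state)):
            if dst != src and len(state[dst]) < CAPACITY[dst]:
                nxt = [list(p) for p in state]
                nxt[dst].append(nxt[src].pop())
                yield tuple(tuple(p) for p in nxt)


def optimal_moves(start, goal):
    """Return the minimum number of moves from start to goal via BFS."""
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        state, depth = frontier.popleft()
        if state == goal:
            return depth
        for nxt in legal_moves(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None  # goal not reachable from start


start = (("R", "G"), ("B",), ())
goal = (("R", "B"), ("G",), ())
print(optimal_moves(start, goal))  # 3
```

Because BFS expands states in order of distance from the start, the first time it reaches the goal it has found a shortest solution, including any intermediate placements that a purely reactive strategy would not consider.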

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

AICON matches difficulty orderings on Tower of London for reduced-capacity groups better than a planning baseline, but the claimed shift to reactive control rests on prior resemblance rather than fresh error-type matches.


The main thing to know is that this paper applies the existing AICON reactive gradient-descent model to the Tower of London test. It reports that the model reproduces human difficulty orderings across 24 problems better than structural parameters, generalizes in leave-two-out checks, and outperforms a planning baseline specifically for groups with reduced planning capacity, while the baseline fits healthy controls better.

Circularity Check

1 step flagged

Dissociation interpretation anchored in original AICON paper's resemblance note to Parkinson's

specific steps

Self-citation, load-bearing [Abstract]
"Crucially, AICON outperforms a planning baseline for groups with reduced planning capacity while the planning baseline better captures healthy controls. This dissociation was predicted by the original AICON paper, which noted that the model's failure modes resemble those of Parkinson's patients who struggle with goal hierarchies but not move counts. This suggests that as planning capacity is reduced, human behavior shifts toward the reactive mode AICON models."

The suggestion that the observed dissociation demonstrates a human shift to the reactive mode is justified by citing the prior AICON paper's resemblance observation rather than by direct, quantitative alignment of AICON-generated error types (e.g., subgoal commitment failures) against the human error distributions collected in the reduced-capacity cohorts.

full rationale

The manuscript supplies independent empirical content via its application of AICON to the Tower of London task, superior reproduction of fine-grained difficulty ordering across 24 problems, and leave-two-out generalization. However, the central interpretive step—that the performance dissociation between AICON and the planning baseline for reduced-capacity groups indicates a shift toward reactive control—rests on the original AICON paper's qualitative note about resemblance to Parkinson's goal-hierarchy deficits rather than new quantitative matching of specific error patterns in the current data. This produces moderate self-citation dependence on the interpretive claim while leaving the raw performance results grounded in fresh experiments.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Review based on abstract only; no explicit free parameters, new axioms, or invented entities are introduced in the provided text. The work relies on the pre-existing AICON framework and standard assumptions about the Tower of London test as a planning measure.

axioms (2)

Domain assumption: AICON operates without lookahead planning or knowledge of human cognition.
Explicitly stated in abstract as the basis for applying the model to human data.
Domain assumption: Tower of London performance differences reflect planning capacity variations across clinical groups.
Used to interpret model performance as predictive of human planning failures.



Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

[1] Albrecht, R., Ragni, M.: Spatial planning: An ACT-R model for the Tower of London task. In: International Conference on Spatial Cognition. pp. 222–236 (2014)

[2] Ballard, D.H., Hayhoe, M.M., Pook, P.K., Rao, R.P.: Deictic codes for the embodiment of cognition. Behavioral and Brain Sciences 20(4), 723–742 (1997)

[3] Balleine, B.W., Dickinson, A.: Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 37(4-5), 407–419 (1998)

[4] Battaje, A., Godinez, A., Hanning, N.M., Rolfs, M., Brock, O.: An information processing pattern from robotics predicts properties of the human visual system. bioRxiv, 2024.06.20.599814 (2024)

[5] Kaller, C.P., Unterrainer, J.M., Rahm, B., Halsband, U.: The impact of problem structure on planning: Insights from the Tower of London task. Cognitive Brain Research 20(3), 462–472 (2004)

[6] Kaller, C.P., Unterrainer, J.M., Stahl, C.: Assessing planning ability with the Tower of London task: psychometric properties of a structurally balanced problem set. Psychological Assessment 24(1), 46 (2012)

[7] Kaplan, D.M., Craver, C.F.: The explanatory force of dynamical and mathematical models in neuroscience: A mechanistic perspective. Philosophy of Science 78(4), 601–627 (2011)

[8] Kendall, M.G.: A new measure of rank correlation. Biometrika 30(1-2), 81–93 (1938)

[9] Kirsh, D., Maglio, P.: On distinguishing epistemic from pragmatic action. Cognitive Science 18(4), 513–549 (1994)

[10] Köstering, L., Schmidt, C.S., Egger, K., Amtage, F., Peter, J., Klöppel, S., Beume, L.A., Hoeren, M., Weiller, C., Kaller, C.P.: Assessment of planning performance in clinical samples: Reliability and validity of the Tower of London task (TOL-F). Neuropsychologia 75, 646–655 (2015)

[11] Köstering, L., Schmidt, C.S., Weiller, C., Kaller, C.P.: Analyses of rule breaks and errors during planning in computerized tower tasks: Insights from neurological patients. Archives of Clinical Neuropsychology 31(7), 738–753 (2016)

[12] Levenstein, D., Alvarez, V.A., Amarasingham, A., Azab, H., Chen, Z.S., Gerkin, R.C., Hasenstaub, A., Iyer, R., Jolivet, R.B., Marzen, S., et al.: On the role of theory and modeling in neuroscience. Journal of Neuroscience 43(7), 1074–1088 (2023)

[13] Martín-Martín, R., Brock, O.: Coupled recursive estimation for online interactive perception of articulated objects. The International Journal of Robotics Research 41(8), 741–777 (2022)

[14] Mattar, M.G., Lengyel, M.: Planning in the brain. Neuron 110(6), 914–934 (2022)

[15] McKinlay, A., Kaller, C., Grace, R., Dalrymple-Alford, J., Anderson, T., Fink, J., Roger, D.: Planning in Parkinson’s disease: A matter of problem structure? Neuropsychologia 46(1), 384–389 (2008)

[16] Mengers, V., Brock, O.: No plan but everything under control: Robustly solving sequential tasks with dynamically composed gradient descent. In: IEEE International Conference on Robotics and Automation. pp. 90–96 (2025)

[17] Mengers, V., Raoufi, M., Brock, O., Hamann, H., Romanczuk, P.: Leveraging uncertainty in collective opinion dynamics with heterogeneity. Scientific Reports 14, 27314 (2024)

[18] Mengers, V., Roth, N., Brock, O., Obermayer, K., Rolfs, M.: A robotics-inspired scanpath model reveals the importance of uncertainty and semantic object cues for gaze guidance in dynamic scenes. Journal of Vision 25(2), 6 (2025)

[19] Shallice, T.: Specific impairments of planning. Philosophical Transactions of the Royal Society of London. B, Biological Sciences 298(1089), 199–209 (1982)

[20] Slaney, J., Thiébaux, S.: Blocks World revisited. Artificial Intelligence 125(1-2), 119–153 (2001)

[21] Waldron, S.M., Patrick, J., Duggan, G.B.: The influence of goal-state access cost on planning during problem solving. Quarterly Journal of Experimental Psychology 64(3), 485–503 (2011)

[22] Zhang, C., Lipovetzky, N., Kemp, C.: Comparing AI planning algorithms with humans on the Tower of London task. In: Proceedings of the Annual Meeting of the Cognitive Science Society. vol. 45 (2023)

[23] Zhang, C., Liu, Y., Kulic, D., Carreno-Medrano, P., Burke, M.: Modeling human sequential decision-making in the Tower of London: Incorporating individual differences and timing-based replanning inference. In: Proceedings of the Annual Meeting of the Cognitive Science Society. vol. 47 (2025)
