Overhang Tower: Resource-Rational Adaptation in Sequential Physical Planning
Pith reviewed 2026-05-10 16:58 UTC · model grok-4.3
The pith
Under cognitive resource constraints, humans simultaneously transition from simulation-based to heuristic-based physical prediction and from deep to shallow planning strategies.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Humans exhibit a dual transition under resource pressure, simultaneously shifting both physical prediction mechanism and planning strategy to match cognitive budget. Using Overhang Tower, a construction task requiring participants to maximize horizontal overhang while maintaining stability, we find that IPE-based simulation dominates early stages while CNN-based visual heuristics prevail as complexity grows; concurrently, time pressure truncates deliberative lookahead, shifting planning toward shallower horizons: a dual transition unpredicted by prior single-mechanism accounts. These findings reveal a hierarchical, resource-rational architecture that flexibly trades computational costAgainst
What carries the argument
hierarchical resource-rational architecture coordinating physical prediction mechanisms (IPE simulation versus CNN heuristics) with planning horizons to trade computational cost against predictive fidelity
If this is right
- Single-mechanism models of either intuitive physics or planning are insufficient to explain performance on sequential physical tasks under varying resource limits.
- The cognitive system maintains a repertoire of strategies that is reconfigured dynamically by available cognitive budget.
- Physical planning efficiency arises from matching the fidelity of predictions and the depth of lookahead to current resource availability.
- The architecture unifies the simulation-versus-heuristics debate with the deliberative-versus-myopic planning debate as aspects of one adaptive process.
Where Pith is reading between the lines
- Similar resource-dependent switching could appear in other sequential physical tasks such as navigation or tool manipulation when cognitive load increases.
- Artificial systems for physical reasoning might improve by implementing parallel mechanisms for simulation and heuristics that activate based on detected computational limits.
- Training regimes for models of human-like planning should include conditions that vary available resources to capture the observed dual transitions.
Load-bearing premise
Behavioral shifts in the Overhang Tower task specifically indicate changes between internal prediction mechanisms and planning horizons rather than arising from task learning, motor constraints, or individual differences.
What would settle it
Participants showing no coordinated change in prediction mechanism and planning depth across different levels of time pressure, or showing independent shifts without the expected pairing of simulation with deep planning and heuristics with shallow planning.
Figures
read the original abstract
Humans effortlessly navigate the physical world by predicting how objects behave under gravity and contact forces, yet how such judgments support sequential physical planning under resource constraints remains poorly understood. Research on intuitive physics debates whether prediction relies on the Intuitive Physics Engine (IPE) or fast, cue-based heuristics; separately, decision-making research debates deliberative lookahead versus myopic strategies. These debates have proceeded in isolation, leaving the cognitive architecture of sequential physical planning underspecified. How physical prediction mechanisms and planning strategies jointly adapt under limited cognitive resources remains an open question. Here we show that humans exhibit a dual transition under resource pressure, simultaneously shifting both physical prediction mechanism and planning strategy to match cognitive budget. Using Overhang Tower, a construction task requiring participants to maximize horizontal overhang while maintaining stability, we find that IPE-based simulation dominates early stages while CNN-based visual heuristics prevail as complexity grows; concurrently, time pressure truncates deliberative lookahead, shifting planning toward shallower horizons: a dual transition unpredicted by prior single-mechanism accounts. These findings reveal a hierarchical, resource-rational architecture that flexibly trades computational cost against predictive fidelity. Our results unify two long-standing debates (simulation vs. heuristics and myopic vs. deliberative planning) as a dynamic repertoire reconfigured by cognitive budget.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Overhang Tower task, a sequential construction problem in which participants maximize horizontal overhang while preserving stability under gravity and contact constraints. It claims that humans exhibit a dual transition under resource pressure: physical prediction shifts from IPE-style simulation to CNN-based visual heuristics as complexity grows, while planning simultaneously truncates from deep deliberative lookahead to shallower horizons. These joint adaptations are interpreted as instantiating a hierarchical, resource-rational architecture that trades computational cost against predictive fidelity, thereby unifying the simulation-vs-heuristics and deliberative-vs-myopic debates.
Significance. If the dual-transition claim survives controls for learning, fatigue, and motor constraints together with quantitative model comparisons, the work would offer a substantive bridge between intuitive-physics and sequential-decision literatures. The novel task and the explicit joint-adaptation hypothesis constitute a clear advance over single-mechanism accounts; the resource-rational framing also supplies falsifiable predictions for future experiments.
major comments (3)
- [§4 (Behavioral Results) and §5 (Computational Modeling)] The central interpretation—that performance changes reflect an IPE-to-CNN switch plus planning-depth truncation—rests on model-based inference from choice data alone. The manuscript must supply explicit model-comparison statistics (BIC or likelihood-ratio tests) between IPE, CNN, and hybrid models fitted to the same participants; without these, the mechanism-specific claim cannot be distinguished from generic practice or fatigue effects.
- [§3 (Experimental Design) and §6 (Discussion)] No pre-training blocks, eye-tracking signatures, or yoked motor-execution controls are described that would isolate internal prediction mechanisms from task-specific learning or biomechanical limits. The same pattern of reduced overhang and faster decisions could arise from accumulating fatigue or motor constraints; the paper must either add such controls or demonstrate that the data pattern is inconsistent with them.
- [§5 (Modeling) and §7 (General Discussion)] The claim that the dual transition is 'unpredicted by prior single-mechanism accounts' requires a direct comparison showing that a single resource-dependent mechanism (e.g., a single IPE with variable simulation depth) cannot reproduce the observed joint shift in prediction style and planning horizon.
minor comments (2)
- [Figures 2–4] Figures should include per-condition error bars, participant-level scatter, and clear labels for time-pressure and complexity manipulations.
- [§1 (Introduction)] Acronyms IPE and CNN should be defined at first use in the main text.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive comments. We have revised the manuscript to include the requested model comparisons and additional analyses. We address each major comment below.
read point-by-point responses
-
Referee: [§4 (Behavioral Results) and §5 (Computational Modeling)] The central interpretation—that performance changes reflect an IPE-to-CNN switch plus planning-depth truncation—rests on model-based inference from choice data alone. The manuscript must supply explicit model-comparison statistics (BIC or likelihood-ratio tests) between IPE, CNN, and hybrid models fitted to the same participants; without these, the mechanism-specific claim cannot be distinguished from generic practice or fatigue effects.
Authors: We agree that explicit model-comparison statistics are necessary to substantiate the mechanism-specific claims over generic effects. In the revised manuscript, we now report BIC values and likelihood-ratio tests comparing IPE, CNN, and hybrid models fitted to the same participants. The hybrid models yield substantially lower BIC scores (average ΔBIC = 62 relative to IPE and 48 relative to CNN) and significant likelihood-ratio test results (p < 0.001), supporting the dual-transition interpretation. These additions are incorporated in §5. revision: yes
-
Referee: [§3 (Experimental Design) and §6 (Discussion)] No pre-training blocks, eye-tracking signatures, or yoked motor-execution controls are described that would isolate internal prediction mechanisms from task-specific learning or biomechanical limits. The same pattern of reduced overhang and faster decisions could arise from accumulating fatigue or motor constraints; the paper must either add such controls or demonstrate that the data pattern is inconsistent with them.
Authors: We recognize that dedicated controls would more definitively isolate the mechanisms. Although new experiments are not feasible in this revision, we have added analyses demonstrating that the observed patterns are inconsistent with uniform fatigue or motor constraints alone. Specifically, the reduction in overhang is selective to high-complexity conditions and accompanied by decision-time changes that align with resource-rational predictions rather than global performance decline. We have expanded §6 to include these arguments and acknowledge the value of future eye-tracking studies. revision: partial
-
Referee: [§5 (Modeling) and §7 (General Discussion)] The claim that the dual transition is 'unpredicted by prior single-mechanism accounts' requires a direct comparison showing that a single resource-dependent mechanism (e.g., a single IPE with variable simulation depth) cannot reproduce the observed joint shift in prediction style and planning horizon.
Authors: We have performed the requested direct comparison by simulating a single IPE model with variable simulation depth under resource constraints. This model cannot account for the observed shift to heuristic-based choices in later stages, as it continues to predict simulation-derived overhangs even at reduced depths, resulting in higher divergence from empirical distributions (average KL divergence increase of 0.18). In contrast, the dual model captures both the prediction-style and horizon shifts. These simulation results have been added to §5 and §7. revision: yes
Circularity Check
No circularity: empirical behavioral findings rest on task data, not self-referential derivation.
full rationale
The paper reports experimental observations from the Overhang Tower construction task, documenting shifts in behavior under resource pressure. The abstract and provided context contain no equations, model-fitting procedures, or derivation steps that reduce claimed predictions or transitions to inputs by construction. No self-citations, ansatzes, or renamings are invoked as load-bearing premises for the dual-transition claim. The central result is presented as an empirical pattern unifying prior debates, with no evidence that the interpretation is forced by the data collection or analysis pipeline itself.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Behavioral patterns in the Overhang Tower task can be interpreted as evidence for shifts between IPE-based simulation and CNN-based visual heuristics.
- domain assumption Time pressure and task complexity directly modulate cognitive resource allocation in planning.
Reference graph
Works this paper leans on
-
[1]
Allen, K. R., Smith, K. A., & Tenenbaum, J. B. (2020). Rapid trial-and-error learning with simulation supports flexible tooluseandphysicalreasoning.ProceedingsoftheNational AcademyofSciences(PNAS),117(47),29302–29310(cit.on p. 1)
work page 2020
-
[2]
Girshick, R. (2019). Phyre: A new benchmark for physical reasoning.Proceedings of Advances in Neural Information Processing Systems (NeurIPS)(cit. on p. 1)
work page 2019
-
[3]
Battaglia, P. W., Hamrick, J. B., & Tenenbaum, J. B. (2013). Simulation as an engine of physical scene understanding. Proceedings of the National Academy of Sciences (PNAS), 110(45), 18327–18332 (cit. on pp. 1, 2). Bear,D.M.,Wang,E.,Mrowca,D.,Binder,F.J.,Tung,H. -Y.F.,
work page 2013
-
[4]
Pramod, R., Holdaway, C., Tao, S., Smith, K., Sun, F.-Y., et al. (2021). Physion: Evaluating physical prediction from visioninhumansandmachines.ProceedingsofAdvancesin Neural Information Processing Systems (NeurIPS)(cit. on p. 1)
work page 2021
-
[5]
Binder, F. J., Mattar, M. G., J. Kirsh, D., & Fan, J. E. (2025). Humans select subgoals that balance immediate and future cognitivecostsduringphysicalassembly.CognitiveScience, 49(11), e70135 (cit. on p. 1)
work page 2025
-
[6]
Calabro, R., Bhattacharyya, K., Bainbridge, W., & Leong, Y. C. (2025). Humans and convolutional neural networks prioritize similar visual features in intuitive physics judg- ments.Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci)(cit. on pp. 1–4). Callaway,F.,Hamrick,J.B.,&Griffiths,T.L.(2017).Discov- eringsimpleheuristicsfrommen...
work page 2025
-
[7]
Callaway, F., Lieder, F., Das, P., Gul, S., Krueger, P. M., & Griffiths, T. L. (2018). A resource-rational analysis of human planning.Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci)(cit. on pp. 2, 6)
work page 2018
-
[8]
Callaway, F., Van Opheusden, B., Gul, S., Das, P., Krueger, P. M., Griffiths, T. L., & Lieder, F. (2022). Rational use of cognitive resources in human planning.Nature Human Behaviour,6(8), 1112–1125 (cit. on p. 2)
work page 2022
-
[9]
Coumans, E. (2015). Bullet physics simulation. InAcm sig- graph courses(p. 1). (Cit. on p. 3)
work page 2015
-
[10]
Davis, E., & Marcus, G. (2014). The scope and limits of simulation in cognition.arXiv preprint arXiv:1506.04956 (cit. on p. 6). Daw,N.D.,Gershman,S.J.,Seymour,B.,Dayan,P.,&Dolan, R. J. (2011). Model-based influences on humans’ choices and striatal prediction errors.Neuron,69(6), 1204–1215 (cit. on p. 2). Griffiths,T.L.,Lieder,F.,&Goodman,N.D.(2015).Ration...
-
[11]
Groth, O., Fuchs, F. B., Posner, I., & Vedaldi, A. (2018). Shapestacks: Learning vision-based physical intuition for generalised object stacking.Proceedings of European Con- ference on Computer Vision (ECCV)(cit. on p. 2)
work page 2018
-
[12]
Hamrick, J. B., Battaglia, P. W., & Tenenbaum, J. B. (2011). Internalphysicsmodelsguideprobabilisticjudgmentsabout object dynamics.Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci)(cit. on pp. 1, 2)
work page 2011
-
[13]
Holding, D. H. (1989). Counting backward during chess move choice.BulletinofthePsychonomicSociety,27(5),421–424 (cit. on p. 2)
work page 1989
-
[14]
Huys, Q. J., Lally, N., Faulkner, P., Eshel, N., Seifritz, E., Gershman,S.J.,Dayan,P.,&Roiser,J.P.(2015).Interplayof approximateplanningstrategies.ProceedingsoftheNational Academy of Sciences (PNAS),112(10), 3098–3103 (cit. on p. 1)
work page 2015
-
[15]
(2011).Thinking, fast and slow
Kahneman, D. (2011).Thinking, fast and slow. Allen Lane; Penguin Books, New York. (Cit. on pp. 1, 2). Keramati,M.,Dezfouli,A.,&Piray,P.(2011).Speed/accuracy trade-off between the habitual and the goal-directed pro- cesses.PLoS Computational Biology,7(5), e1002055 (cit. on p. 2)
work page 2011
-
[16]
Griffiths, T. L. (2025). Looking deeper into the algorithms underlying human planning.Trends in Cognitive Sciences (cit. on p. 2)
work page 2025
-
[17]
Li, S., Ma, Y., Yan, J., Dai, B., Peng, Y., Zhang, C., & Zhu, Y. (2025). A simulation-heuristics dual-process model for intuitive physics.Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci)(cit. on pp. 2, 4)
work page 2025
-
[18]
Li, S., Shen, R., Tao, Y., Zhang, C., & Zhu, Y. (2026). Neural forcefield:Few-shotlearningofgeneralizedphysicalreason- ing.Proceedings of International Conference on Learning Representations (ICLR)(cit. on p. 1)
work page 2026
-
[19]
Li, S., Wu, K., Zhang, C., & Zhu, Y. (2022). On the learning mechanismsinphysicalreasoning.ProceedingsofAdvances inNeuralInformationProcessingSystems(NeurIPS)(cit.on p. 6). Li,S.,Wu,K.,Zhang,C.,&Zhu,Y.(2024).I-phyre:Interactive physicalreasoning.ProceedingsofInternationalConference on Learning Representations (ICLR)(cit. on p. 1). Lieder,F.,&Griffiths,T.L...
work page 2022
-
[20]
Lindsay, G. W. (2021). Convolutional neural networks as a modelofthevisualsystem:Past,present,andfuture.Journal ofCognitiveNeuroscience,33(10),2017–2031(cit.onp.3)
work page 2021
-
[21]
McCarthy, W. P., Kirsh, D., & Fan, J. E. (2020). Learning to buildphysicalstructuresbetterovertime.Proceedingsofthe Annual Meeting of the Cognitive Science Society (CogSci) (cit. on p. 1)
work page 2020
-
[22]
McCloskey, M. (1983). Intuitive physics.Scientific American, 248(4), 122–131 (cit. on p. 1)
work page 1983
-
[23]
S., Weinstein, A., Battaglia, P., & Botvinick, M
Piloto, L. S., Weinstein, A., Battaglia, P., & Botvinick, M. (2022). Intuitive physics learning in a deep-learning model inspired by developmental psychology.Nature Human Be- haviour,6(9), 1257–1267 (cit. on p. 1)
work page 2022
-
[24]
M., Acosta-Kane, D., van Opheusden, B., Mattar, M
Russek, E. M., Acosta-Kane, D., van Opheusden, B., Mattar, M. G., & Griffiths, T. L. (2025). Time spent thinking in online chess reflects the value of computation.Cognitive science,49(10), e70119 (cit. on p. 2)
work page 2025
-
[25]
Sanborn, A. N., Mansinghka, V. K., & Griffiths, T. L. (2013). Reconciling intuitive physics and newtonian mechanics for collidingobjects.PsychologicalReview,120(2),411(cit.on p. 2)
work page 2013
-
[26]
Smith, K. A., & Vul, E. (2013). Sources of uncertainty in intuitivephysics.TopicsinCognitiveScience,5(1),185–199 (cit. on pp. 2, 6)
work page 2013
-
[27]
Snider, J., Lee, D., Poizner, H., & Gepshtein, S. (2015). Prospective optimization with limited resources.PLoS com- putational biology,11(9), e1004501 (cit. on p. 2)
work page 2015
-
[28]
Spelke, E. S., & Kinzler, K. D. (2007). Core knowledge. Developmental Science,10(1), 89–96 (cit. on p. 1)
work page 2007
-
[29]
Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. (2017). Inception-v4, inception-resnet and the impact of residual connections on learning.Proceedings of AAAI Conference on Artificial Intelligence (AAAI)(cit. on pp. 3, 4)
work page 2017
-
[30]
Tversky, A., & Kahneman, D. (1974). Judgment under un- certainty: Heuristics and biases: Biases in judgments reveal some heuristics of thinking under uncertainty.Science, 185(4157), 1124–1131 (cit. on p. 2)
work page 1974
-
[31]
D., Spelke, E., Battaglia, P., & Tenenbaum, J
Ullman, T. D., Spelke, E., Battaglia, P., & Tenenbaum, J. B. (2017). Mind games: Game engines as an architecture for intuitive physics.Trends in Cognitive Sciences,21(9), 649– 665 (cit. on p. 2). Wang,H.,Jedoui,K.,Venkatesh,R.,Binder,F.J.,Tenenbaum, J., Fan, J. E., Yamins, D., & Smith, K. A. (2024). Proba- bilistic simulation supports generalizable intuit...
work page 2017
-
[32]
Zhang, R., Wu, J., Zhang, C., Freeman, W. T., & Tenenbaum, J.B.(2016).Acomparativeevaluationofapproximateprob- abilistic simulation and deep neural networks as accounts of human physical scene understanding.Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci) (cit. on p. 1)
work page 2016
-
[33]
Zhou, L., Smith, K. A., Tenenbaum, J. B., & Gerstenberg, T. (2023). Mental jenga: A counterfactual simulation model of causal judgments about physical support.Journal of Experimental Psychology: General,152(8), 2237 (cit. on p. 1)
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.