Proximal State Nudging: Reducing Skill Atrophy from AI Assistance
Pith reviewed 2026-05-21 07:18 UTC · model grok-4.3
The pith
Proximal State Nudging directs AI-assisted users toward learnable states to reduce skill atrophy while preserving task performance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Proximal State Nudging is a shared autonomy planner that selects actions moving the joint human-AI system into proximal states estimated to be most learnable for the human, thereby increasing unassisted performance gains up to sevenfold over blended shared autonomy while cutting collision rates by half relative to unaided practice in two CARLA driving tasks.
What carries the argument
Proximal State Nudging, the algorithm that identifies and nudges the human-AI pair toward states predicted to produce the largest future improvement in the human's unassisted reward.
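As a rough illustration of how a planner might trade task reward against learnability, here is a minimal beam-search sketch on a toy one-dimensional world. Everything in it is an assumption made for illustration: the function names (`beam_search_plan`, `step_line`, `reward_to_goal`, `learnable_at_3`), the linear weighting via `learn_weight`, and the toy dynamics. It is not the paper's implementation or its actual learnability estimator.

```python
from typing import Callable, List, Sequence, Tuple

State = int
Action = int


def beam_search_plan(
    start: State,
    actions: Sequence[Action],
    step: Callable[[State, Action], State],
    task_reward: Callable[[State], float],
    learnability: Callable[[State], float],
    horizon: int = 3,
    beam_width: int = 4,
    learn_weight: float = 0.5,
) -> Tuple[List[Action], float]:
    """Return the best action sequence found by beam search and its score.

    Each visited state is scored by a weighted sum of task reward and
    estimated learnability (a hypothetical combination rule).
    """
    beam = [(0.0, start, [])]  # (cumulative score, state, actions so far)
    for _ in range(horizon):
        candidates = []
        for score, state, plan in beam:
            for action in actions:
                nxt = step(state, action)
                gain = ((1.0 - learn_weight) * task_reward(nxt)
                        + learn_weight * learnability(nxt))
                candidates.append((score + gain, nxt, plan + [action]))
        # Keep only the highest-scoring partial plans.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    best_score, _, best_plan = beam[0]
    return best_plan, best_score


# Toy world (hypothetical): positions 0..10, goal at 10, "learnable" state at 3.
def step_line(state: State, action: Action) -> State:
    return max(0, min(10, state + action))


def reward_to_goal(state: State) -> float:
    return -abs(state - 10) / 10


def learnable_at_3(state: State) -> float:
    return 1.0 if state == 3 else 0.0


task_only_plan, _ = beam_search_plan(
    5, [-1, 0, 1], step_line, reward_to_goal, learnable_at_3, learn_weight=0.0)
learning_only_plan, _ = beam_search_plan(
    5, [-1, 0, 1], step_line, reward_to_goal, learnable_at_3, learn_weight=1.0)
print(task_only_plan, learning_only_plan)
```

With `learn_weight=0.0` the sketch heads straight for the goal; with `learn_weight=1.0` it detours to the state marked as learnable and stays there. Intermediate weights interpolate between the two behaviours.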
If this is right
- Shared autonomy systems can maintain or increase human skill instead of eroding it.
- Task performance and learning objectives can be traded off inside a single planner rather than treated as separate goals.
- Simulated student models can serve as a useful filter before running expensive human-subject trials of learning-aware planners.
- Collision reduction relative to pure practice suggests the method improves safety during the training phase itself.
Where Pith is reading between the lines
- The same nudging principle could be tested in domains such as aircraft control or robotic surgery where skill retention under assistance is equally critical.
- Replacing the current learnability estimator with online adaptation from actual human performance data might further increase gains.
- Extending the method to multi-operator teams or longer time horizons could reveal whether proximal nudging scales beyond single-session driving tasks.
Load-bearing premise
The planner's estimates of which states are most learnable actually match the states that produce real skill gains for humans in the target tasks.
What would settle it
A replication study in which human drivers using Proximal State Nudging show no larger post-training improvement on unassisted trials than drivers using standard blended shared autonomy.
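One way such a comparison could be analysed is an exact two-sided permutation test on the per-driver gains. The sketch below is only an illustration: the gain values are invented placeholders, the function name `exact_permutation_p_value` is hypothetical, and nothing in the source says which test the paper or a replication would use.

```python
from itertools import combinations
from statistics import mean
from typing import Sequence


def exact_permutation_p_value(group_a: Sequence[float],
                              group_b: Sequence[float]) -> float:
    """Two-sided exact permutation test on the difference in group means.

    Enumerates every way of relabelling the pooled values into groups of
    the original sizes, so it is only practical for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Placeholder data (NOT from the paper): unassisted-skill gains per driver.
psn_gains = [0.9, 1.1, 1.3, 1.0, 1.2]
blended_gains = [0.1, 0.2, 0.15, 0.3, 0.25]

p_value = exact_permutation_p_value(psn_gains, blended_gains)
print(f"p = {p_value:.4f}")
```

A large p-value in a real replication would be consistent with the "no larger improvement" outcome described above. With groups larger than a handful of drivers, a sampled (Monte Carlo) permutation test would be needed instead of full enumeration.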
Original abstract
Skill atrophy, the gradual decline of human capability under AI assistance, poses a safety risk in shared control of semi-autonomous systems, where operators may be unable to distinguish their own inputs from autonomous corrections. We propose Proximal State Nudging (PSN), a shared autonomy algorithm that jointly optimizes for skill development and task performance by nudging users toward states estimated to be most learnable. We first show that PSN outperforms existing shared autonomy baselines in balancing student improvement in unassisted reward with overall shared performance, using simulated students in the classic LunarLander environment. We then present, to the best of our knowledge, the first human subject studies of a planner incorporating learning-compatible shared autonomy: across two driving tasks in the CARLA simulator (High Performance Racing and Parallel Parking, n = 60), PSN produces up to 7x larger gains in unassisted skill than standard blended shared autonomy, while incurring 50% fewer collisions than unassisted self-practice.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Proximal State Nudging (PSN), a shared autonomy algorithm that jointly optimizes for skill development and task performance by nudging users toward states estimated to be most learnable. It first validates the approach with simulated students in the LunarLander environment, showing better balance between unassisted reward improvement and shared performance than existing baselines. It then reports human subject studies (n=60) in the CARLA simulator across High Performance Racing and Parallel Parking tasks, claiming up to 7x larger gains in unassisted skill than standard blended shared autonomy while incurring 50% fewer collisions than unassisted self-practice.
Significance. If the results hold under scrutiny, this work addresses a practically important problem in human-AI shared control by mitigating skill atrophy, with direct relevance to safety in semi-autonomous driving and similar domains. The dual validation path (simulation then human experiments) is a strength, and the explicit focus on learning-compatible assistance distinguishes it from standard shared autonomy methods.
Major comments (2)
- [Abstract] The quantitative claims of 'up to 7x larger gains in unassisted skill' and '50% fewer collisions' are presented without any mention of statistical tests, error bars, per-condition participant counts, or the exact protocol used to measure post-assistance unassisted skill; these details are load-bearing for evaluating the reliability of the reported effect sizes that support the central empirical claim.
- [Human subject studies] The transfer of the 'most learnable' state estimator (fitted on simulated students in LunarLander) to real human drivers in CARLA is not directly validated, e.g., via per-participant correlation between nudged-state exposure and individual skill gains; without such a check it remains unclear whether the planner's estimates causally explain the observed improvements or whether other factors are responsible.
Minor comments (2)
- [Abstract] Specify the exact number of participants assigned to each condition (PSN, blended shared autonomy, self-practice) and the total duration of each driving task to permit assessment of statistical power.
- [Abstract] The abstract states 'to the best of our knowledge, the first human subject studies of a planner incorporating learning-compatible shared autonomy'; a brief comparison to the closest prior human studies on shared autonomy would strengthen this novelty claim.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which highlight important aspects of result presentation and mechanistic validation. We address each major comment below and indicate the changes planned for the revised manuscript.
- Referee: [Abstract] The quantitative claims of 'up to 7x larger gains in unassisted skill' and '50% fewer collisions' are presented without any mention of statistical tests, error bars, per-condition participant counts, or the exact protocol used to measure post-assistance unassisted skill; these details are load-bearing for evaluating the reliability of the reported effect sizes that support the central empirical claim.
  Authors: We agree that the abstract would be strengthened by referencing these supporting details. In the revision we will update the abstract to note that the reported gains are supported by statistical tests, that variability is shown via error bars in the accompanying figures, that the n=60 participants were allocated across conditions, and that post-assistance unassisted skill was measured in a dedicated evaluation phase without assistance. The full statistical procedures, participant breakdowns, and protocol remain described in the Methods and Results sections.
  Revision: yes
- Referee: [Human subject studies] The transfer of the 'most learnable' state estimator (fitted on simulated students in LunarLander) to real human drivers in CARLA is not directly validated, e.g., via per-participant correlation between nudged-state exposure and individual skill gains; without such a check it remains unclear whether the planner's estimates causally explain the observed improvements or whether other factors are responsible.
  Authors: We acknowledge that a direct per-participant correlation between nudged-state exposure and individual skill gains was not reported in the original submission. The manuscript instead demonstrates the benefit of PSN through comparative outcomes against baselines. To address the concern, we will add a post-hoc correlation analysis using the existing per-participant data, relating the number of proximal nudges received to each driver's measured unassisted skill improvement. We will report the correlation coefficient and discuss its implications for the causal role of the estimator while noting any limitations or alternative explanations.
  Revision: yes
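The post-hoc analysis proposed in the second response could look roughly like the Pearson correlation below. This is a sketch under stated assumptions: the per-participant numbers are invented placeholders, `pearson_r` is a hand-written helper rather than anything from the paper, and the source does not say which correlation coefficient the authors intend to report.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Placeholder data (NOT from the paper): one entry per participant.
nudges_received = [2, 4, 6, 8, 10]
skill_gain = [0.1, 0.2, 0.3, 0.4, 0.5]

r = pearson_r(nudges_received, skill_gain)
print(f"r = {r:.3f}")
```

A strong positive coefficient would be consistent with the estimator driving the gains, but a correlation on its own would not establish that causal role, which matches the authors' stated intent to discuss limitations and alternative explanations.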
Circularity Check
No significant circularity in empirical comparisons
The paper proposes the PSN algorithm for shared autonomy and validates it via direct empirical comparisons: simulated student improvement in LunarLander and human-subject metrics (unassisted skill gains, collisions) in CARLA driving tasks against baselines like blended shared autonomy and self-practice. No equations, fitted parameters, or self-citations are shown that reduce reported outcomes to inputs by construction. The planner's selection of 'most learnable' states is an algorithmic design choice whose effectiveness is assessed through independent experimental measurements rather than tautological re-derivation, making the central claims self-contained against external benchmarks.
Lean theorems connected to this paper
- IndisputableMonolith.Cost.FunctionalEquation washburn_uniqueness_aczel
  unclear: Relation between the paper passage and the cited Recognition theorem.
  PSN maintains an estimator $\phi_{\mathrm{zpd},t}(s)$ of each state’s “learnability” – how likely it is to improve a student’s unassisted skills – which feeds a beam search-based planner that optimizes for both learning and task reward
- IndisputableMonolith.Foundation.ArithmeticFromLogic absolute_floor_iff_bare_distinguishability
  unclear: Relation between the paper passage and the cited Recognition theorem.
  across two driving tasks in the CARLA simulator (High Performance Racing and Parallel Parking, n = 60), PSN produces up to 7x larger gains in unassisted skill than standard blended shared autonomy
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.