
arXiv: 2605.20355 · v1 · pith:FWPDGNDNnew · submitted 2026-05-19 · 💻 cs.RO · cs.HC · cs.LG

Proximal State Nudging: Reducing Skill Atrophy from AI Assistance

Pith reviewed 2026-05-21 07:18 UTC · model grok-4.3

classification 💻 cs.RO · cs.HC · cs.LG
keywords shared autonomy · skill atrophy · human-AI collaboration · proximal state nudging · CARLA simulator · driving assistance · learning-compatible planning

The pith

Proximal State Nudging directs AI-assisted users toward learnable states to reduce skill atrophy while preserving task performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Proximal State Nudging to address skill decline in shared-control systems where AI assistance can mask human inputs. It jointly optimizes for immediate task success and long-term human learning by nudging operators into states the planner judges most beneficial for skill gains. Experiments with simulated agents in LunarLander and with sixty human drivers in CARLA show larger unassisted skill improvements than standard blending methods, plus fewer collisions than pure self-practice. A sympathetic reader would care because many real-world systems, from vehicles to robots, rely on humans who must remain competent when assistance fails.

Core claim

Proximal State Nudging is a shared autonomy planner that selects actions moving the joint human-AI system into proximal states estimated to be most learnable for the human, thereby increasing unassisted performance gains up to sevenfold over blended shared autonomy while cutting collision rates by half relative to unaided practice in two CARLA driving tasks.

What carries the argument

Proximal State Nudging, the algorithm that identifies and nudges the human-AI pair toward states predicted to produce the largest future improvement in the human's unassisted reward.
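
To make the idea concrete, here is a minimal Python sketch of one way a planner could trade task value against learnability when choosing an action. It is an illustration under stated assumptions, not the paper's implementation: the function names, the one-step lookahead, the convex weighting through `learning_weight`, and the toy dynamics are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[float, ...]
Action = Tuple[float, ...]


def select_nudging_action(
    state: State,
    candidate_actions: Sequence[Action],
    predict_next_state: Callable[[State, Action], State],  # hypothetical dynamics model
    task_value: Callable[[State], float],                  # hypothetical task-performance estimate
    learnability: Callable[[State], float],                # hypothetical stand-in for a learnability estimator
    learning_weight: float = 0.5,                          # hypothetical trade-off weight in [0, 1]
) -> Action:
    """Return the candidate whose predicted next state best balances both objectives."""

    def score(action: Action) -> float:
        nxt = predict_next_state(state, action)
        return (1.0 - learning_weight) * task_value(nxt) + learning_weight * learnability(nxt)

    return max(candidate_actions, key=score)


# Toy 1-D example: the state is a position and each action is a displacement.
def toy_dynamics(s: State, a: Action) -> State:
    return (s[0] + a[0],)


def toy_task_value(s: State) -> float:
    return -abs(s[0] - 3.0)  # the task is best served at position 3


def toy_learnability(s: State) -> float:
    return -abs(s[0] - 1.0)  # assumed learnability peak at position 1


actions = [(-1.0,), (0.0,), (1.0,), (2.0,), (3.0,)]
pure_task = select_nudging_action(
    (0.0,), actions, toy_dynamics, toy_task_value, toy_learnability, learning_weight=0.0
)
pure_learning = select_nudging_action(
    (0.0,), actions, toy_dynamics, toy_task_value, toy_learnability, learning_weight=1.0
)
print(pure_task, pure_learning)
```

With `learning_weight=0.0` the sketch reduces to a purely task-driven planner, and with `learning_weight=1.0` it ignores task value entirely; intermediate weights interpolate between the two.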

If this is right

  • Shared autonomy systems can maintain or increase human skill instead of eroding it.
  • Task performance and learning objectives can be traded off inside a single planner rather than treated as separate goals.
  • Simulated student models can serve as a useful filter before running expensive human-subject trials of learning-aware planners.
  • Collision reduction relative to pure practice suggests the method improves safety during the training phase itself.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same nudging principle could be tested in domains such as aircraft control or robotic surgery where skill retention under assistance is equally critical.
  • Replacing the current learnability estimator with online adaptation from actual human performance data might further increase gains.
  • Extending the method to multi-operator teams or longer time horizons could reveal whether proximal nudging scales beyond single-session driving tasks.

Load-bearing premise

The planner's estimates of which states are most learnable actually match the states that produce real skill gains for humans in the target tasks.

What would settle it

A replication study in which human drivers using Proximal State Nudging show no larger post-training improvement on unassisted trials than drivers using standard blended shared autonomy.
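
One hedged way to operationalize this criterion is a permutation test on the difference in mean unassisted skill gain between the two groups. The sketch below uses only the Python standard library. The function name, the one-sided framing, and the numbers are assumptions made for illustration; the values are made up and are not data from the paper.

```python
import random
from statistics import mean


def permutation_test_gain_difference(gains_a, gains_b, n_permutations=5_000, seed=0):
    """One-sided permutation test for mean(gains_a) > mean(gains_b)."""
    rng = random.Random(seed)
    observed = mean(gains_a) - mean(gains_b)
    pooled = list(gains_a) + list(gains_b)
    n_a = len(gains_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling of participants
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up per-participant gains, purely illustrative.
psn_gains = [4.1, 3.8, 5.0, 4.4, 3.9, 4.7]
blended_gains = [0.6, 1.1, 0.4, 0.9, 0.7, 0.5]
observed_diff, p = permutation_test_gain_difference(psn_gains, blended_gains)
print(f"difference in mean gain = {observed_diff:.2f}, one-sided p = {p:.4f}")
```

A replication that returned a large p-value under a test like this would count as evidence against the paper's central claim, subject to the usual caveats about sample size.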

Figures

Figures reproduced from arXiv: 2605.20355 by Andrew Silva, Deepak Gopinath, Dorsa Sadigh, Emily Sumner, Eric Zhou, Guy Rosman, Jonathan Ouyang, Megha Srivastava, Yuchen Cui.

Figure 1: Overview. We propose Proximal State Nudging (PSN), a learning-aware shared autonomy algorithm that minimizes human skill atrophy in rapid-control domains, such as High Performance Racing (A). Inspired by the Zone of Proximal Development theory from cognitive psychology [1], PSN maintains an estimator ϕ_{zpd,t}(s) of each state’s “learnability” – how likely it is to improve a student’s unassisted skills – whic…
Figure 2: Lunar Lander Task and Results. (Left) Heatmap showing estimated proximality scores for states in Lunar Lander, estimated over states drawn from a test set after training for 100 episodes. States directly below the lander (too easy) or too far to the right (too challenging) are estimated to be less “learnable”; a student agent will face similar expected reward regardless of receiving or not receiving assista…
Figure 3: High Performance Racing Results. (Left) Heatmap of ϕ_{zpd,t=1} estimates shows that assistance is predicted to most strongly support student learning at states near high-speed tight turns, and that predicted learnability is sensitive to both location and velocity. (Right) Proximal State Nudging balances the trade-off between learning gains (e.g. improved similarity to expert, lap time, and control jerk) and safe…
Figure 4: Parallel Parking Results. (Top Left) Estimated learnability ϕ_{zpd,t=1} over discretized approach states near the target parking spot; green indicates higher. (Bar Plots) Proximal State Nudging balances the trade-off between learning gains (e.g. improved time and collisions) and safety (total number of collisions during practice) in comparison to both blended shared autonomy and unassisted self-practice. Bars show…

Original abstract

Skill atrophy, the gradual decline of human capability under AI assistance, poses a safety risk in shared-control of semi-autonomous systems, where operators may be unable to distinguish their own inputs from autonomous corrections. We propose Proximal State Nudging (PSN), a shared autonomy algorithm that jointly optimizes for skill development and task performance by nudging users toward states estimated to be most learnable. We first show that PSN outperforms existing shared autonomy baselines in balancing student improvement in unassisted reward with overall shared performance, using simulated students in the classic LunarLander environment. We then present, to the best of our knowledge, the first human subject studies of a planner incorporating learning-compatible shared autonomy: across two driving tasks in the CARLA simulator (High Performance Racing and Parallel Parking, n = 60), PSN produces up to 7x larger gains in unassisted skill than standard blended shared autonomy, while incurring 50% fewer collisions than unassisted self-practice.
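
For readers unfamiliar with the baseline, "blended" shared autonomy is commonly described as a linear arbitration between the human's command and the AI's command. The short Python sketch below shows that generic form only. The function name, the fixed `alpha`, and the example steering and throttle values are assumptions for illustration and are not taken from the paper.

```python
def blend_actions(human_action, ai_action, alpha):
    """Linear arbitration: alpha = 0 is pure human control, alpha = 1 is pure AI control."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(human_action) != len(ai_action):
        raise ValueError("action vectors must have the same length")
    return [(1.0 - alpha) * h + alpha * a for h, a in zip(human_action, ai_action)]


# Hypothetical (steering, throttle) commands.
human = [0.4, 1.0]
ai = [-0.2, 0.5]
executed = blend_actions(human, ai, alpha=0.5)
print(executed)
```

Because the executed command differs from what the operator asked for, the operator sees the outcome of the blend rather than of their own input, which is the masking effect the abstract describes.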

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes Proximal State Nudging (PSN), a shared autonomy algorithm that jointly optimizes for skill development and task performance by nudging users toward states estimated to be most learnable. It first validates the approach with simulated students in the LunarLander environment, showing better balance between unassisted reward improvement and shared performance than existing baselines. It then reports human subject studies (n=60) in the CARLA simulator across High Performance Racing and Parallel Parking tasks, claiming up to 7x larger gains in unassisted skill than standard blended shared autonomy while incurring 50% fewer collisions than unassisted self-practice.

Significance. If the results hold under scrutiny, this work addresses a practically important problem in human-AI shared control by mitigating skill atrophy, with direct relevance to safety in semi-autonomous driving and similar domains. The dual validation path (simulation then human experiments) is a strength, and the explicit focus on learning-compatible assistance distinguishes it from standard shared autonomy methods.

major comments (2)
  1. [Abstract] Abstract: the quantitative claims of 'up to 7x larger gains in unassisted skill' and '50% fewer collisions' are presented without any mention of statistical tests, error bars, per-condition participant counts, or the exact protocol used to measure post-assistance unassisted skill; these details are load-bearing for evaluating the reliability of the reported effect sizes that support the central empirical claim.
  2. [Human subject studies] Human subject studies: the transfer of the 'most learnable' state estimator (fitted on simulated students in LunarLander) to real human drivers in CARLA is not directly validated, e.g., via per-participant correlation between nudged-state exposure and individual skill gains; without such a check it remains unclear whether the planner's estimates causally explain the observed improvements or whether other factors are responsible.
minor comments (2)
  1. [Abstract] Abstract: specify the exact number of participants assigned to each condition (PSN, blended shared autonomy, self-practice) and the total duration of each driving task to permit assessment of statistical power.
  2. [Abstract] The abstract states 'to the best of our knowledge, the first human subject studies of a planner incorporating learning-compatible shared autonomy'; a brief comparison to the closest prior human studies on shared autonomy would strengthen this novelty claim.
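
The kind of uncertainty reporting the referee asks for can be sketched with a percentile bootstrap on the ratio of mean gains, which is the quantity behind a claim such as "up to 7x". This is a generic illustration using only the Python standard library. The function name, the resampling settings, and the numbers are assumptions; the values are made up and are not data from the paper.

```python
import random
from statistics import mean


def bootstrap_ratio_ci(gains_a, gains_b, n_resamples=5_000, alpha=0.05, seed=0):
    """Point estimate and percentile bootstrap interval for mean(gains_a) / mean(gains_b)."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_resamples):
        resample_a = rng.choices(gains_a, k=len(gains_a))  # sample with replacement
        resample_b = rng.choices(gains_b, k=len(gains_b))
        denom = mean(resample_b)
        if denom > 0:  # skip resamples where the ratio is undefined or changes sign
            ratios.append(mean(resample_a) / denom)
    ratios.sort()
    lower = ratios[int((alpha / 2) * len(ratios))]
    upper = ratios[int((1 - alpha / 2) * len(ratios)) - 1]
    return mean(gains_a) / mean(gains_b), (lower, upper)


# Made-up per-participant gains, purely illustrative.
psn_gains = [4.1, 3.8, 5.0, 4.4, 3.9, 4.7]
blended_gains = [0.6, 1.1, 0.4, 0.9, 0.7, 0.5]
ratio, (ci_low, ci_high) = bootstrap_ratio_ci(psn_gains, blended_gains)
print(f"gain ratio = {ratio:.2f}, 95% bootstrap interval = ({ci_low:.2f}, {ci_high:.2f})")
```

Ratios of small means tend to have wide intervals, so reporting the interval alongside per-condition participant counts would let readers judge how stable a multiplicative effect size is.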

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important aspects of result presentation and mechanistic validation. We address each major comment below and indicate the changes planned for the revised manuscript.

  1. Referee: [Abstract] Abstract: the quantitative claims of 'up to 7x larger gains in unassisted skill' and '50% fewer collisions' are presented without any mention of statistical tests, error bars, per-condition participant counts, or the exact protocol used to measure post-assistance unassisted skill; these details are load-bearing for evaluating the reliability of the reported effect sizes that support the central empirical claim.

    Authors: We agree that the abstract would be strengthened by referencing these supporting details. In the revision we will update the abstract to note that the reported gains are supported by statistical tests, that variability is shown via error bars in the accompanying figures, that the n=60 participants were allocated across conditions, and that post-assistance unassisted skill was measured in a dedicated evaluation phase without assistance. The full statistical procedures, participant breakdowns, and protocol remain described in the Methods and Results sections. revision: yes

  2. Referee: [Human subject studies] Human subject studies: the transfer of the 'most learnable' state estimator (fitted on simulated students in LunarLander) to real human drivers in CARLA is not directly validated, e.g., via per-participant correlation between nudged-state exposure and individual skill gains; without such a check it remains unclear whether the planner's estimates causally explain the observed improvements or whether other factors are responsible.

    Authors: We acknowledge that a direct per-participant correlation between nudged-state exposure and individual skill gains was not reported in the original submission. The manuscript instead demonstrates the benefit of PSN through comparative outcomes against baselines. To address the concern, we will add a post-hoc correlation analysis using the existing per-participant data, relating the number of proximal nudges received to each driver's measured unassisted skill improvement. We will report the correlation coefficient and discuss its implications for the causal role of the estimator while noting any limitations or alternative explanations. revision: yes
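
A post-hoc analysis of the kind promised here could be as simple as a rank correlation between nudge exposure and skill gain. The Python sketch below computes a Spearman correlation with the standard library only. The variable names and the per-participant values are made up for illustration, and a correlation of this kind would not by itself establish that the estimator causes the gains.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up per-participant values, purely illustrative.
nudges_received = [12, 30, 7, 22, 18, 25, 9, 15]
skill_gain = [0.8, 2.9, 0.5, 1.7, 1.9, 2.4, 0.2, 1.1]
rho = spearman(nudges_received, skill_gain)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure is used in this sketch because it does not assume a linear relationship between exposure and gain; that choice is an editorial assumption, not something the rebuttal specifies.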

Circularity Check

0 steps flagged

No significant circularity in empirical comparisons


The paper proposes the PSN algorithm for shared autonomy and validates it via direct empirical comparisons: simulated student improvement in LunarLander and human-subject metrics (unassisted skill gains, collisions) in CARLA driving tasks against baselines like blended shared autonomy and self-practice. No equations, fitted parameters, or self-citations are shown that reduce reported outcomes to inputs by construction. The planner's selection of 'most learnable' states is an algorithmic design choice whose effectiveness is assessed through independent experimental measurements rather than tautological re-derivation, making the central claims self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are described. The method implicitly relies on an unstated model of 'learnable states' and simulated student behavior, but details are absent.

pith-pipeline@v0.9.0 · 5728 in / 1141 out tokens · 30075 ms · 2026-05-21T07:18:10.719373+00:00 · methodology



Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · 2 internal anchors

  1. [1]

    L. S. Vygotsky, Mind in society: The development of higher psychological processes. Harvard University Press, 1978, vol. 86

  2. [2]

    Devil’s advocate: exploring the potential negative impacts of artificial intelligence on the field of surgery,

    M. Sarofim, “Devil’s advocate: exploring the potential negative impacts of artificial intelligence on the field of surgery,” Journal of Medical Artificial Intelligence, vol. 7, 2024

  3. [3]

    The challenges of partially automated driving,

    S. M. Casner, E. L. Hutchins, and D. Norman, “The challenges of partially automated driving,” Communications of the ACM, vol. 59, no. 5, pp. 70–77, 2016

  4. [4]

    Shared control versus traded control in driving: a debate around automation pitfalls,

    J. C. de Winter, S. M. Petermeijer, and D. A. Abbink, “Shared control versus traded control in driving: a debate around automation pitfalls,” Ergonomics, vol. 66, no. 10, pp. 1494–1520, 2023

  5. [5]

    Chatgpt in education: transforming digital learning and assessment,

    E. Dell’Aquila, F. Rossi, and M. Ronchetti, “Chatgpt in education: transforming digital learning and assessment,” Applied Sciences, vol. 13, no. 18, p. 10053, 2023

  6. [6]

    Autonomous driving systems: A preliminary naturalistic study of the tesla model s,

    M. R. Endsley, “Autonomous driving systems: A preliminary naturalistic study of the tesla model s,” Journal of Cognitive Engineering and Decision Making, vol. 11, no. 3, pp. 225–238, 2017

  7. [7]

    Simultaneous achievement of driver assistance and skill development in shared and cooperative controls,

    T. Wada, “Simultaneous achievement of driver assistance and skill development in shared and cooperative controls,” Cognition, Technology & Work, vol. 21, no. 4, pp. 631–642, 2019

  8. [8]

    Fake it till you make it: Learning-compatible performance support,

    J. Bragg and E. Brunskill, “Fake it till you make it: Learning-compatible performance support,” in Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, ser. Proceedings of Machine Learning Research, R. P. Adams and V. Gogate, Eds., vol

  9. [9]

    PMLR, 22–25 Jul 2020, pp. 915–924. [Online]. Available: https://proceedings.mlr.press/v115/bragg20a.html

  10. [10]

    Carla: An open urban driving simulator,

    A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, “Carla: An open urban driving simulator,” in Proceedings of the 1st Annual Conference on Robot Learning (CoRL), 2017, pp. 1–16

  11. [11]

    A policy-blending formalism for shared control,

    A. D. Dragan and S. S. Srinivasa, “A policy-blending formalism for shared control,” International Journal of Robotics Research (IJRR), vol. 32, pp. 790–805, 2013

  12. [12]

    Shared autonomy via hindsight optimization,

    S. Javdani, S. Srinivasa, and J. A. Bagnell, “Shared autonomy via hindsight optimization,” in Robotics: Science and Systems, 2015

  13. [13]

    Shared Autonomy via Deep Reinforcement Learning

    S. Reddy, A. D. Dragan, and S. Levine, “Shared autonomy via deep reinforcement learning,” in arXiv 1802.01744, 2018. [Online]. Available: https://arxiv.org/abs/1802.01744

  14. [14]

    Highly parallelized data-driven mpc for minimal intervention shared control,

    A. Broad, T. Murphey, and B. Argall, “Highly parallelized data-driven mpc for minimal intervention shared control,” in Robotics: Science and Systems, 2019

  15. [15]

    Task-based hybrid shared control for training through forceful interaction,

    K. Fitzsimons, A. Kalinowska, J. Dewald, and T. Murphey, “Task-based hybrid shared control for training through forceful interaction,” The International Journal of Robotics Research, vol. 39, pp. 1138–1154, 2019

  16. [16]

    Assistive teaching of motor control tasks to humans,

    M. Srivastava, E. Biyik, S. Mirchandani, N. Goodman, and D. Sadigh, “Assistive teaching of motor control tasks to humans,” in Advances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35. Curran Associates, Inc., 2022, pp. 28517–28529

  17. [17]

    A framework for adaptation of training task, assistance and feedback for optimizing motor (re)-learning with a robotic exoskeleton,

    P. Agarwal and A. Deshpande, “A framework for adaptation of training task, assistance and feedback for optimizing motor (re)-learning with a robotic exoskeleton,” IEEE Robotics and Automation Letters, vol. 4, pp. 808–815, 2019

  18. [18]

    Towards modeling and influencing the dynamics of human learning,

    R. Tian, A. Bajcsy, M. Tomizuka, and A. D. Dragan, “Towards modeling and influencing the dynamics of human learning,” in Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction. Association for Computing Machinery, 2023, pp. 12–21

  19. [19]

    Shared autonomy for proximal teaching,

    M. Srivastava, R. Iranmanesh, Y. Cui, D. Gopinath, E. S. Sumner, A. Silva, L. Dees, G. Rosman, and D. Sadigh, “Shared autonomy for proximal teaching,” in Proceedings of the 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI 2025), 2025, pp. 232–241

  20. [20]

    The impact of automation on pilot cognition and performance,

    S. M. Casner, E. L. Hutchins, and D. Norman, “The impact of automation on pilot cognition and performance,” Human Factors, vol. 56, no. 1, pp. 1–17, 2014

  21. [21]

    Shared control versus traded control in driving: a debate around automation pitfalls,

    J. C. F. de Winter, S. M. Petermeijer, and D. A. Abbink, “Shared control versus traded control in driving: a debate around automation pitfalls,” Ergonomics, vol. 66, no. 10, pp. 1494–1520, Oct. 2023

  22. [22]

    Ai-induced deskilling in medicine: A mixed-method review and research agenda for healthcare and beyond,

    F. Parchmann et al., “Ai-induced deskilling in medicine: A mixed-method review and research agenda for healthcare and beyond,” Artificial Intelligence Review, 2024

  23. [23]

    Training llm agents to empower humans,

    E. Ellis, V. Myers, J. Tuyls, S. Levine, A. Dragan, and B. Eysenbach, “Training llm agents to empower humans,” arXiv preprint arXiv:2510.13709, 2025

  24. [24]

    Coach: Cooperative robot teaching,

    C. Yu, Y. Xu, L. Li, and D. Hsu, “Coach: Cooperative robot teaching,” in Proceedings of The 6th Conference on Robot Learning, ser. Proceedings of Machine Learning Research, K. Liu, D. Kulic, and J. Ichnowski, Eds., vol. 205. PMLR, 14–18 Dec 2023, pp. 1092–1103. [Online]. Available: https://proceedings.mlr.press/v205/yu23b.html

  25. [25]

    Decision-making in driver-automation shared control: A review and perspectives,

    W. Wang, X. Na, D. Cao, J. Gong, J. Xi, Y. Xing, and F.-Y. Wang, “Decision-making in driver-automation shared control: A review and perspectives,” IEEE/CAA Journal of Automatica Sinica, vol. 7, no. JAS-2020-0177, p. 1289, 2020

  26. [26]

    Shared control between human and machine: Using a haptic steering wheel to aid in land vehicle guidance,

    M. Steele and R. B. Gillespie, “Shared control between human and machine: Using a haptic steering wheel to aid in land vehicle guidance,” Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol. 45, no. 23, pp. 1671–1675, 2001. [Online]. Available: https://doi.org/10.1177/154193120104502323

  27. [27]

    Deep reinforcement learning with double q-learning,

    H. v. Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with double q-learning,” in Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, ser. AAAI’16. AAAI Press, 2016, pp. 2094–2100

  28. [28]

    OpenAI Gym

    G. Brockman, V. Cheung, L. Pettersson, et al., “OpenAI gym,” arXiv preprint arXiv:1606.01540, 2016

  29. [29]

    Learning from active human involvement through proxy value propagation,

    Z. Li, Y. Sun, M. Tomizuka, and W. Zhan, “Learning from active human involvement through proxy value propagation,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 36, 2023

  30. [30]

    From dashboards to dialogue: Evaluating a conversational ai coach for performance driving skill development,

    J. Costa, A. Morgan, H. Yasuda, E. S. Sumner, D. Gopinath, S. Chau, H. Nguyen, A. Best, G. Rosman, and T. L. Chen, “From dashboards to dialogue: Evaluating a conversational ai coach for performance driving skill development,” in Proceedings of the 17th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, ser. Autom...

  31. [31]

    Injecting conflict situations in autonomous driving simulation using carla,

    T. Mihaylova, S. Reitmann, E. A. Topp, and V. Kyrki, “Injecting conflict situations in autonomous driving simulation using carla,” in 2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2025, pp. 1052–1056

  32. [32]

    Dreaming to assist: Learning to align with human objectives for shared control in high-speed racing,

    J. DeCastro, A. Silva, D. Gopinath, E. Sumner, T. M. Balch, L. Dees, and G. Rosman, “Dreaming to assist: Learning to align with human objectives for shared control in high-speed racing,” 2024. [Online]. Available: https://arxiv.org/abs/2410.10062

  33. [33]

    Blending data-driven priors in dynamic games,

    J. Lidard, H. Hu, A. Hancock, Z. Zhang, A. G. Contreras, V. Modi, J. DeCastro, D. Gopinath, G. Rosman, N. E. Leonard, et al., “Blending data-driven priors in dynamic games,” arXiv preprint arXiv:2402.14174, 2024

  34. [34]

    Simcoachcorpus: A naturalistic dataset with language and trajectories for embodied teaching,

    E. Sumner, D. E. Gopinath, L. Dees, P. R. Gomez, X. Cui, A. Silva, J. Costa, A. Morgan, M. Schrum, T. L. Chen, A. Balachandran, and G. Rosman, “Simcoachcorpus: A naturalistic dataset with language and trajectories for embodied teaching,” 2025. [Online]. Available: https://arxiv.org/abs/2509.14548

  35. [35]

    Deep learning-based trajectory planning and control for autonomous ground vehicle parking maneuver,

    R. Chai, D. Liu, T. Liu, A. Tsourdos, Y. Xia, and S. Chai, “Deep learning-based trajectory planning and control for autonomous ground vehicle parking maneuver,” IEEE Transactions on Automation Science and Engineering, vol. 20, no. 3, pp. 1633–1647, 2023

  36. [36]

    On the implementation of a primal-dual interior-point filter line-search algorithm for large-scale nonlinear programming,

    A. Wächter and L. T. Biegler, “On the implementation of a primal-dual interior-point filter line-search algorithm for large-scale nonlinear programming,” Mathematical Programming, vol. 106, no. 1, pp. 25–57, 2006

  37. [37]

    Do users write more insecure code with ai assistants?

    N. Perry*, M. Srivastava*, D. Kumar, and D. Boneh, “Do users write more insecure code with ai assistants?” in Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS), 2023