arxiv: 2606.22040 · v1 · pith:R4GM26RJnew · submitted 2026-06-20 · 💻 cs.RO

Deep RL-Tuned Model-Free Adaptive Control for Lower-Limb Exoskeletons During Sit-to-Stand Transitions

Pith reviewed 2026-06-26 12:13 UTC · model grok-4.3

classification 💻 cs.RO
keywords: exoskeleton control · sit-to-stand transition · model-free adaptive control · reinforcement learning · trajectory tracking · radial basis function · TD3 · backstepping control

The pith

Integrating a TD3 deep reinforcement learning agent with model-free adaptive control yields 0.078 degree average joint tracking error for lower-limb exoskeletons in sit-to-stand transitions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that a model-free adaptive backstepping controller using an ultra-local second-order model and radial basis function neural network estimation, supervised by a TD3 reinforcement learning gain scheduler, delivers the most accurate joint trajectory tracking during sit-to-stand motion. This approach addresses the challenge of time-varying human-exoskeleton interactions and inter-subject variability that make explicit modeling impractical. Accurate assistance during sit-to-stand transitions matters because these movements impose high joint loads on elderly users who rely on exoskeletons for mobility support. Co-simulation benchmarks show the full design outperforms PID, MFAC, LQR, and SMC baselines by 42 to 60 percent in root mean square error while the TD3 component alone cuts additional error at each joint.

Core claim

The central claim is that the proposed controller, which combines radial basis function neural network estimation of unknown dynamics within a model-free adaptive backstepping framework and uses a Twin Delayed Deep Deterministic Policy Gradient agent to schedule gains across sit-to-stand phases, achieves an average RMSE of 0.078 degrees across all joints. This performance improves on proportional-integral-derivative control by 60.2 percent, standalone model-free adaptive control by 54.4 percent, linear quadratic regulator by 48.7 percent, and sliding-mode control by 42.6 percent. The TD3 scheduler further lowers tracking error by 35 percent at the hip, 33 percent at the knee, and 79 percent at the ankle relative to the RBF-MFAC baseline.
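The reported percentages can be turned back into approximate baseline errors. The following is a minimal sketch, assuming each improvement is defined as (baseline − proposed) / baseline on the average RMSE; this page does not state the definition, so the implied values are illustrative and are not figures taken from the paper.

```python
# Back out the baseline RMSE implied by each reported improvement.
# Assumption (not confirmed by the source):
#   improvement = (baseline - proposed) / baseline

PROPOSED_RMSE_DEG = 0.078  # average RMSE reported for the proposed controller

# Reported improvement of the proposed controller over each baseline, in percent.
REPORTED_IMPROVEMENT_PCT = {"PID": 60.2, "MFAC": 54.4, "LQR": 48.7, "SMC": 42.6}


def implied_baseline_rmse(proposed: float, improvement_pct: float) -> float:
    """Invert improvement = (baseline - proposed) / baseline for the baseline."""
    return proposed / (1.0 - improvement_pct / 100.0)


implied = {
    name: implied_baseline_rmse(PROPOSED_RMSE_DEG, pct)
    for name, pct in REPORTED_IMPROVEMENT_PCT.items()
}

if __name__ == "__main__":
    for name, value in implied.items():
        print(f"{name}: implied average RMSE ~ {value:.3f} deg")
```

Under that assumption every implied baseline lies below roughly 0.2 degrees, so the percentage gaps correspond to absolute differences of about a tenth of a degree in simulation.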

What carries the argument

TD3 reinforcement learning agent acting as supervisory gain scheduler for the ultra-local second-order model with RBF neural network estimation in the adaptive backstepping controller
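The architecture named above can be illustrated on a toy problem. The following is a minimal sketch and not the authors' implementation: it assumes a single joint, a made-up pendulum-like plant, fixed backstepping gains `k1` and `k2` (the quantities a TD3 scheduler would adjust in the paper's design), and a hypothetical 5×5 grid of Gaussian RBF centers. Every numeric value is an illustrative assumption.

```python
import math


def rbf_features(state, centers, width):
    """Gaussian RBF activations for a 2-D state (q, dq)."""
    q, dq = state
    return [
        math.exp(-((q - cq) ** 2 + (dq - cdq) ** 2) / (2.0 * width ** 2))
        for cq, cdq in centers
    ]


def simulate(adapt_gain, k1=20.0, k2=20.0, alpha=2.0, dt=1e-3, t_end=4.0):
    """Track a smooth reference with backstepping on the ultra-local model
    ddq = F + alpha * u, and return the tracking RMSE in radians."""
    # Hypothetical RBF grid over the expected range of q [rad] and dq [rad/s].
    centers = [(-0.5 + 0.5 * i, -2.0 + 1.0 * j) for i in range(5) for j in range(5)]
    weights = [0.0] * len(centers)
    q, dq = 0.0, 0.0
    sum_sq = 0.0
    n_steps = round(t_end / dt)
    for step in range(n_steps):
        t = step * dt
        # Smooth reference trajectory and its derivatives (illustrative only).
        q_d = 0.5 * (1.0 - math.cos(math.pi * t))
        dq_d = 0.5 * math.pi * math.sin(math.pi * t)
        ddq_d = 0.5 * math.pi ** 2 * math.cos(math.pi * t)

        # Backstepping errors: e1 on position, e2 on the virtual velocity.
        e1 = q - q_d
        de1 = dq - dq_d
        e2 = de1 + k1 * e1

        # Online estimate of the lumped term F from the RBF network.
        phi = rbf_features((q, dq), centers, width=0.6)
        f_hat = sum(w * p for w, p in zip(weights, phi))

        # Control law: gives de2/dt = -e1 - k2*e2 + (F - f_hat).
        u = (ddq_d - k1 * de1 - e1 - k2 * e2 - f_hat) / alpha

        # Gradient-type weight adaptation driven by e2 (Euler step).
        weights = [w + dt * adapt_gain * e2 * p for w, p in zip(weights, phi)]

        # Toy plant, unknown to the controller; its input gain equals alpha here.
        ddq = -9.81 * math.sin(q) - 0.5 * dq + 2.0 * u
        q, dq = q + dt * dq, dq + dt * ddq
        sum_sq += e1 ** 2
    return math.sqrt(sum_sq / n_steps)


rmse_without_rbf = simulate(adapt_gain=0.0)
rmse_with_rbf = simulate(adapt_gain=50.0)

if __name__ == "__main__":
    print(f"RMSE without RBF estimate: {rmse_without_rbf:.5f} rad")
    print(f"RMSE with RBF estimate:    {rmse_with_rbf:.5f} rad")
```

Setting `adapt_gain=0.0` disables the estimator, which gives a rough sense of what the RBF term contributes on this toy plant. It says nothing about the magnitudes reported for the exoskeleton co-simulation.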

If this is right

  • The integrated controller records the lowest average RMSE of 0.078 degrees across hip, knee, and ankle joints.
  • TD3 gain scheduling reduces hip tracking error by 35 percent, knee error by 33 percent, and ankle error by 79 percent versus the RBF-MFAC baseline.
  • The design maintains phase-aware performance without requiring explicit system identification.
  • The approach outperforms four standard controllers by 42.6 to 60.2 percent in tracking accuracy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hardware deployment on physical exoskeletons could test whether the simulated error reductions hold under real sensor noise and actuator limits.
  • The same TD3 scheduling structure might extend to other periodic exoskeleton tasks such as level walking or stair ascent with minimal redesign.
  • Reduced need for subject-specific models could lower the barrier to clinical trials across diverse user populations.
  • Online gain adaptation may improve robustness when users change posture or load during assistance.

Load-bearing premise

The MATLAB/Simulink and Simscape Multibody co-simulation using OpenSim-derived trajectories sufficiently captures real time-varying human-exoskeleton interaction dynamics and inter-subject variability during sit-to-stand transitions.

What would settle it

A physical experiment on multiple human subjects wearing the exoskeleton that records actual joint angle errors during repeated sit-to-stand movements and checks whether average RMSE remains near 0.078 degrees.

Figures

Figures reproduced from arXiv: 2606.22040 by Appaso M. Gadade, Ashish Singla, Rajmeet Singh, Ranjeet Kumbhar, Ravinder Kumar.

Figure 1: Lower-limb exoskeleton mechanical structure: (a) CAD model illustrating link parameters and …
Figure 2: Sequence of STS motion in the human-exoskeleton system.
Figure 3: OpenSim-derived reference joint kinematics for the STS motion: (a) joint angle trajectories; (b) …
Figure 4: Co-simulation framework for the human-exoskeleton system.
Figure 5: Overall workflow of the proposed lower-limb exoskeleton control framework.
Figure 6: Proposed control design architecture.
    3.1 Problem Statement and Ultra-Local Model. The coupled human–exoskeleton system performing STS motion can be mathematically formalized as an interconnected multi-body rigid system. By applying the Euler–Lagrange equations of motion, the overall system dynamics for three joints j (j = 1: hip, j = 2: knee, j = 3: ankle) are expressed as: $M(q_j)\,\ddot{q}_j + C(q_j, \dot{q}_j)\,\dot{q}_j + \dots$
Figure 7: Architecture of the Gaussian RBF neural network used for online lumped dynamics estimation.
Figure 8: Twin Delayed Deep Deterministic Policy Gradient (TD3) architecture.
Figure 9: Hip joint angle tracking performance during STS motion: (a) joint angle tracking; (b) tracking …
Figure 10: Knee joint angle tracking performance during STS motion: (a) joint angle tracking; (b) tracking …
Figure 11: Ankle joint angle tracking performance during STS motion: (a) joint angle tracking; (b) tracking …
Figure 12: Control torque profiles applied by each controller at the joints during STS motion: (a) hip; (b) …
Figure 13: Joint-wise RMSE comparison across hip, knee, and ankle for all five evaluated control strategies.
Figure 14: TD3 agent training convergence curves for the proposed controller showing episode reward and …
Original abstract

Sit-to-stand (STS) transitions impose significant joint-loading demands on elderly individuals, making them a primary target for lower-limb exoskeleton assistance. However, accurate trajectory tracking during STS is challenging due to complex, time-varying human-exoskeleton interaction dynamics and inter-subject variability that render model-based control approaches difficult to apply in practice. This paper presents an intelligent model-free adaptive backstepping control strategy for a bilateral lower-limb exoskeleton during STS motion. The proposed controller design uses an ultra-local second-order model to avoid explicit system identification, while a Gaussian radial basis function (RBF) neural network estimates the unknown lumped dynamics online. To further improve phase-aware tracking performance, a Twin Delayed Deep Deterministic Policy Gradient (TD3) reinforcement learning agent is integrated as a supervisory gain scheduler that adaptively adjusts controller gains across the distinct phases of STS motion. The proposed controller is evaluated through co-simulation in MATLAB/Simulink and Simscape Multibody using OpenSim-derived reference trajectories and benchmarked against state-of-the-art controllers. Results demonstrate that the proposed controller achieves the lowest average RMSE of 0.078 degrees across all joints, representing improvements of 60.2%, 54.4%, 48.7%, and 42.6% over proportional-integral-derivative (PID), model-free adaptive control (MFAC), linear quadratic regulator (LQR), and sliding-mode control (SMC), respectively. TD3 integration further reduces tracking error by 35%, 33%, and 79% at the hip, knee, and ankle joints compared to the standalone RBF-MFAC baseline. These results demonstrate the effectiveness and robustness of the proposed controller design for assistive exoskeleton control during STS transitions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes an intelligent model-free adaptive backstepping controller for bilateral lower-limb exoskeletons during sit-to-stand (STS) transitions. It uses an ultra-local second-order model with online RBF neural network estimation of lumped dynamics, augmented by a TD3 RL agent as a supervisory gain scheduler for phase-aware performance. The controller is evaluated exclusively via MATLAB/Simulink + Simscape Multibody co-simulation driven by OpenSim reference trajectories and benchmarked against PID, MFAC, LQR, and SMC, claiming an average RMSE of 0.078° with 42.6–60.2% improvements over baselines and additional 33–79% error reductions from TD3 at individual joints.

Significance. If the simulation results translate to hardware, the combination of model-free adaptation with deep RL for adaptive gain scheduling could offer a practical approach to handling time-varying human-exoskeleton dynamics and inter-subject variability in assistive robotics applications.

major comments (2)
  1. [Evaluation / Results] Evaluation section (as described in the abstract): The central quantitative claims—average RMSE of 0.078°, percentage improvements of 60.2/54.4/48.7/42.6% over PID/MFAC/LQR/SMC, and 35/33/79% further reductions at hip/knee/ankle from TD3—are obtained solely from co-simulation. No hardware validation, sensor-noise injection, or sensitivity analysis to unmodeled effects (soft-tissue compliance, variable ground reaction forces) is reported, leaving the conclusion that the results demonstrate effectiveness for real assistive exoskeleton control unsupported.
  2. [Methods (TD3 integration)] Methods (TD3 integration, as referenced in the abstract): The manuscript provides no details on TD3 training procedure, reward design, hyperparameter values, number of episodes, or statistical tests underlying the reported RMSE and improvement percentages. This renders the performance numbers difficult to reproduce or assess for robustness.
minor comments (1)
  1. [Abstract] Abstract: The reported average RMSE of 0.078 degree does not specify whether it is computed across joints, trials, or subjects, nor does it include standard deviation or confidence intervals.
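The ambiguity raised in the minor comment can be made concrete. The following is a minimal sketch on synthetic numbers invented for illustration (not data from the paper), showing that "average RMSE across joints" can mean the mean of per-joint RMSE values or the RMSE of all samples pooled together, and that the two generally differ.

```python
import math


def rmse(errors):
    """Root mean square of a sequence of error samples."""
    if not errors:
        raise ValueError("need at least one error sample")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Synthetic tracking errors in degrees, invented purely for illustration.
errors_deg = {
    "hip": [0.1, -0.1, 0.1, -0.1],
    "knee": [0.2, -0.2, 0.2, -0.2],
    "ankle": [0.0, 0.0, 0.0, 0.0],
}

# Option 1: compute RMSE per joint, then average the three values.
per_joint = {joint: rmse(samples) for joint, samples in errors_deg.items()}
mean_of_joint_rmse = sum(per_joint.values()) / len(per_joint)

# Option 2: pool every sample from every joint into one RMSE.
pooled_rmse = rmse([e for samples in errors_deg.values() for e in samples])

if __name__ == "__main__":
    print(f"mean of per-joint RMSE: {mean_of_joint_rmse:.4f} deg")
    print(f"pooled RMSE:            {pooled_rmse:.4f} deg")
```

Because squaring weights large errors more heavily, the pooled value is at least as large as the mean of per-joint values when every joint has the same number of samples, which is why stating the aggregation method matters for a figure such as 0.078 degrees.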

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major point below, acknowledging the simulation-only nature of the study while proposing targeted revisions to improve clarity, reproducibility, and transparency.

Point-by-point responses
  1. Referee: [Evaluation / Results] Evaluation section (as described in the abstract): The central quantitative claims—average RMSE of 0.078°, percentage improvements of 60.2/54.4/48.7/42.6% over PID/MFAC/LQR/SMC, and 35/33/79% further reductions at hip/knee/ankle from TD3—are obtained solely from co-simulation. No hardware validation, sensor-noise injection, or sensitivity analysis to unmodeled effects (soft-tissue compliance, variable ground reaction forces) is reported, leaving the conclusion that the results demonstrate effectiveness for real assistive exoskeleton control unsupported.

    Authors: We agree that all quantitative results are derived from co-simulation using MATLAB/Simulink, Simscape Multibody, and OpenSim trajectories, which provides a standardized, repeatable testbed for comparing controllers under consistent dynamics. The manuscript does not include hardware experiments, sensor noise injection, or explicit sensitivity analysis to soft-tissue effects or variable ground reactions. We will revise the abstract, results, and conclusion to explicitly qualify all claims as 'in simulation' and add a new Limitations and Future Work subsection that discusses these gaps, including plans for hardware validation on a physical exoskeleton platform. This addresses the concern without overstating the current evidence.
    Revision: partial

  2. Referee: [Methods (TD3 integration)] Methods (TD3 integration, as referenced in the abstract): The manuscript provides no details on TD3 training procedure, reward design, hyperparameter values, number of episodes, or statistical tests underlying the reported RMSE and improvement percentages. This renders the performance numbers difficult to reproduce or assess for robustness.

    Authors: We acknowledge the lack of these implementation details in the submitted version. The TD3 agent used a reward function combining negative tracking error, control effort penalty, and phase-transition smoothness, with actor and critic networks of two hidden layers (256 units each), learning rates of 3e-4, discount factor 0.99, target update rate 0.005, and training over 10,000 episodes with a replay buffer of 1e6. Results reflect a single converged policy; no multi-seed statistical tests were performed. We will insert a dedicated TD3 Implementation subsection in Methods with these specifications, the full reward equation, and a note on the single-run nature as a limitation to improve reproducibility.
    Revision: yes
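The hyperparameters listed in this simulated response can be collected into a small configuration. The following is a minimal sketch: the numeric settings in `TD3Config` are the ones quoted above, while the reward weights, the quadratic form of each reward term, and the use of torque change as the smoothness measure are hypothetical placeholders, since the full reward equation is not given on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TD3Config:
    """Settings quoted in the simulated rebuttal."""
    hidden_layers: tuple = (256, 256)   # two hidden layers, 256 units each
    actor_lr: float = 3e-4
    critic_lr: float = 3e-4
    discount: float = 0.99
    target_update_rate: float = 0.005   # tau in the soft target update
    episodes: int = 10_000
    replay_buffer_size: int = 1_000_000


def reward(tracking_errors, torques, previous_torques,
           w_error=1.0, w_effort=0.01, w_smooth=0.1):
    """Hypothetical reward: negative weighted sum of tracking error,
    control effort, and step-to-step torque change (assumed smoothness term).
    The weights are placeholders, not values from the paper."""
    error_cost = sum(e * e for e in tracking_errors)
    effort_cost = sum(u * u for u in torques)
    smooth_cost = sum((u - p) ** 2 for u, p in zip(torques, previous_torques))
    return -(w_error * error_cost + w_effort * effort_cost + w_smooth * smooth_cost)


def soft_update(target_params, online_params, tau):
    """Polyak averaging of target-network parameters, as used in TD3."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


config = TD3Config()

if __name__ == "__main__":
    print(config)
    print(reward([0.1, 0.05, 0.02], [1.0, 2.0, 0.5], [0.9, 2.1, 0.5]))
    print(soft_update([0.0, 1.0], [1.0, 1.0], config.target_update_rate))
```

With a target update rate of 0.005, each update moves the target parameters only 0.5 percent of the way toward the online network, which is the slow-tracking behavior TD3 relies on for stable critic targets.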

Circularity Check

0 steps flagged

No significant circularity in derivation or performance claims

Full rationale

The paper describes a controller architecture (ultra-local model + RBF NN + TD3 gain scheduler) and reports simulation RMSE values obtained by direct numerical comparison against standard baselines (PID, MFAC, LQR, SMC) in MATLAB/Simulink co-simulation. No equation or result reduces the reported tracking errors or percentage improvements to quantities defined by the paper's own fitted parameters, self-citations, or ansatzes; the performance numbers are independent empirical outputs of the simulation runs rather than tautological re-statements of inputs.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The approach depends on the standard domain assumption that an ultra-local model plus online neural estimation can replace explicit system identification for time-varying human-exoskeleton dynamics; no new physical entities are introduced and free parameters are limited to those internal to the RL agent and network training.

free parameters (2)
  • TD3 agent hyperparameters and reward weights
    The reinforcement learning policy and value networks require training parameters and a reward function whose specific form and values are not stated in the abstract.
  • RBF network learning rates and basis widths
    Online estimation uses a Gaussian RBF network whose adaptation gains and kernel parameters must be chosen or tuned but are not reported.
axioms (1)
  • domain assumption: An ultra-local second-order model plus online RBF estimation is sufficient to capture the essential lumped dynamics without explicit identification of human-exoskeleton interaction.
    Invoked to justify the model-free design and avoid system identification as described in the abstract.
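For orientation, the assumption can be written in the form usually seen in the model-free control literature. This is a sketch under stated assumptions: the symbols $F_j$, $\alpha_j$, $\hat{W}_j$, $c_i$, and $\sigma_i$ are generic notation chosen here, and the paper's own equations are not reproduced on this page.

```latex
% Ultra-local second-order model for joint j (generic form, assumed notation):
% F_j lumps all unknown dynamics and disturbances; \alpha_j is a design constant.
\ddot{q}_j(t) = F_j(t) + \alpha_j\, u_j(t)

% Online estimate of the lumped term with a Gaussian RBF network:
\hat{F}_j(t) = \hat{W}_j^{\top} \phi(x_j), \qquad
\phi_i(x_j) = \exp\!\left( -\frac{\lVert x_j - c_i \rVert^{2}}{2\sigma_i^{2}} \right)
```

The axiom amounts to assuming that the estimation error $F_j - \hat{F}_j$ stays small enough over the whole sit-to-stand motion for the backstepping law to keep the tracking error bounded.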

pith-pipeline@v0.9.1-grok · 5869 in / 1600 out tokens · 44580 ms · 2026-06-26T12:13:11.637004+00:00 · methodology

