arxiv: 2507.13662 · v2 · submitted 2025-07-18 · 💻 cs.RO

Iteratively Learning Muscle Memory for Legged Robots to Master Adaptive and High Precision Locomotion

Jing Cheng, Yasser G. Alqaham, Zhenyu Gan, Amit K. Sanyal

Pith reviewed 2026-05-19 04:46 UTC · model grok-4.3

classification 💻 cs.RO

keywords legged locomotion · iterative learning control · torque library · adaptive control · trajectory tracking · bipedal robot · quadrupedal robot · muscle memory


The pith

A torque library built by iterative learning lets legged robots cut joint tracking errors by up to 85 percent and adapt to new speeds and slopes without relearning each time.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that combining iterative learning control with a stored torque library produces precise and adaptive locomotion for legged robots. The library acts like muscle memory by holding control profiles that the robot can reuse across changing conditions. This approach would matter because it removes the need for heavy online computation or starting over when speed, terrain, or gravity shifts, making reliable walking feasible in real environments. The method is tested on both a biped and a quadruped through simulation and hardware runs.

Core claim

The authors establish that a generalized torque library stores control profiles learned through iterative learning control applied to a hybrid-system physics model. Once built, the library supplies the corrections needed for model uncertainties and disturbances, allowing the robot to track trajectories accurately on both periodic and nonperiodic gaits while adapting to slopes and uneven ground.
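To make the iterative-learning step concrete, here is a minimal sketch of a P-type ILC update on a toy first-order plant. The plant, the gains, and the function names (`simulate`, `ilc_update`) are illustrative assumptions; this is not the paper's hybrid-system model or its update law.

```python
import math

def simulate(u_ff, ref, kp=2.0, a=0.9, b=0.1, bias=0.3):
    """Roll out a toy plant y[t+1] = a*y[t] + b*(u[t] - bias) over one stride.

    The applied input is feedback (kp * error) plus the stored feedforward.
    `bias` stands in for an unmodeled disturbance that repeats every stride.
    Returns the per-sample tracking error ref[t] - y[t].
    """
    y, errors = 0.0, []
    for t in range(len(ref)):
        e = ref[t] - y
        errors.append(e)
        u = kp * e + u_ff[t]
        y = a * y + b * (u - bias)
    return errors

def ilc_update(u_ff, errors, gain=2.0):
    """P-type ILC: add the error observed one sample later to the feedforward.

    The one-sample shift reflects that u[t] first affects the output at t+1.
    """
    n = len(u_ff)
    return [u_ff[t] + gain * errors[min(t + 1, n - 1)] for t in range(n)]

def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

N = 50                      # samples per stride (assumed)
ref = [math.sin(2 * math.pi * t / N) for t in range(N)]
u_ff = [0.0] * N            # feedforward starts empty
history = []
for k in range(30):         # 30 repetitions of the same stride
    err = simulate(u_ff, ref)
    history.append(rms(err))
    u_ff = ilc_update(u_ff, err)

print(f"RMS error, first stride: {history[0]:.4f}")
print(f"RMS error, last stride:  {history[-1]:.4f}")
```

The feedback gain stays fixed throughout; only the stored feedforward changes between strides, which is the part a torque library would later reuse.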

What carries the argument

The generalized torque library that stores learned control profiles and supplies them for rapid reuse across different speeds, terrains, and gravitational conditions.
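One way to picture the library is as a table of torque profiles keyed by average speed, with linear interpolation between stored neighbours. The class name `TorqueLibrary`, the choice to key by speed alone, and the numbers are assumptions for illustration; this page does not give the paper's actual indexing scheme.

```python
from bisect import bisect_left

class TorqueLibrary:
    """Hypothetical store of feedforward torque profiles keyed by average speed."""

    def __init__(self):
        self._speeds = []    # sorted speeds (m/s)
        self._profiles = []  # torque samples over one stride, same order as _speeds

    def add(self, speed, profile):
        i = bisect_left(self._speeds, speed)
        if i < len(self._speeds) and self._speeds[i] == speed:
            self._profiles[i] = list(profile)  # overwrite an existing entry
        else:
            self._speeds.insert(i, speed)
            self._profiles.insert(i, list(profile))

    def lookup(self, speed):
        """Return a profile for `speed`, interpolating linearly between neighbours.

        Queries outside the stored range are clamped to the nearest stored speed.
        """
        if not self._speeds:
            raise ValueError("library is empty")
        if speed <= self._speeds[0]:
            return list(self._profiles[0])
        if speed >= self._speeds[-1]:
            return list(self._profiles[-1])
        hi = bisect_left(self._speeds, speed)
        lo = hi - 1
        w = (speed - self._speeds[lo]) / (self._speeds[hi] - self._speeds[lo])
        return [(1 - w) * a + w * b
                for a, b in zip(self._profiles[lo], self._profiles[hi])]

lib = TorqueLibrary()
lib.add(0.4, [1.0, 2.0, 3.0])   # made-up torque samples (N·m)
lib.add(0.6, [2.0, 4.0, 6.0])
blended = lib.lookup(0.5)       # halfway between the two stored speeds
clamped = lib.lookup(0.1)       # below the stored range
```

A lookup is a binary search plus one weighted sum per sample, which is why retrieval can be far cheaper than solving an optimization at every control step.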

If this is right

Both periodic and nonperiodic gaits become reliable on bipedal and quadrupedal platforms.
Slope traversal and terrain adaptation occur without repeated full learning cycles.
Online computation during execution drops enough to support control rates over 30 times higher than existing methods.
The same learned profiles transfer across robots of different leg counts when the library is generalized.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same library structure could support higher-level planners that previously ran too slowly under tight timing constraints.
Extending the library to include brief disturbance responses might further improve robustness on fully unstructured outdoor ground.
Because the method works on both Cassie and A1, similar torque storage may apply to other legged morphologies with minimal redesign.

Load-bearing premise

A single stored torque library can supply accurate corrections for new speeds, terrains, and gravity without the robot having to learn the profiles again from scratch.

What would settle it

Run the robot on a slope or speed change it has not encountered before and check whether joint tracking error still falls by up to 85 percent within a few seconds and whether control updates remain more than 30 times faster than standard whole-body controllers.
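Both checks reduce to simple arithmetic on logged data. The sketch below computes a percentage RMSE reduction and a call-time ratio; the helper names are hypothetical, the error samples are made up, and the two call times are the ones quoted in the Figure 14 caption.

```python
import math

def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before

# Made-up joint tracking errors (rad) before and after learning.
baseline = [0.10, -0.08, 0.12, -0.11]
learned = [0.02, -0.01, 0.02, -0.02]
reduction = percent_reduction(rmse(baseline), rmse(learned))

# Per-call times quoted in the Figure 14 caption (ms).
tl_ms, wbc_ms = 0.0065, 0.2274
speedup = wbc_ms / tl_ms

print(f"RMSE reduction: {reduction:.1f}%")
print(f"WBC / TL call-time ratio: {speedup:.1f}x")
```

With the quoted times the ratio comes out just under 35, which is consistent with the "exceeding 30x" figure in the abstract.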

Figures

Figures reproduced from arXiv: 2507.13662 by Amit K. Sanyal, Jing Cheng, Yasser G. Alqaham, Zhenyu Gan.

**Figure 1.** Conceptual illustration of the ILC framework. The ILC process iteratively refines the feedforward control inputs by leveraging data from previous iterations to minimize tracking errors. The figure highlights the interplay between feedback and feedforward components, showcasing how the control scheme adapts to improve trajectory tracking over successive iterations.

**Figure 2.** The kinematic configurations of (a) the quadrupedal A1 (Alqaham et al., 2024) and (b) the bipedal Cassie (Gong et al., 2019) platforms used in this study. The generalized coordinates for both robot platforms are defined as $q = [q_B^\top, q_L^\top]^\top \in \mathbb{R}^{n_B + n_L}$, where $q_B = [q_x, q_y, q_z, q_{\text{yaw}}, q_{\text{pitch}}, q_{\text{roll}}]^\top \in \mathbb{R}^{n_B}$ denotes the base (torso) position and orientation in 3D space, and $q_L \in \mathbb{R}^{n_L}$ contains the actuated and pa…

**Figure 3.** The proposed control architecture includes trajectory planning (green), feedback control (blue), and feedforward control (orange) modules. An iterative policy further improves stability by refining torques applied to the thigh joints. Zero-phase filtering ensures smooth and phase-consistent control signals. Key computations align with the equations presented in this section.

**Figure 5.** Tracking improvements for the A1 robot's calf joints during pronking under lunar and high-gravity conditions. The implementation of ILC achieves significant error reductions in both scenarios, highlighting its adaptability and robustness across diverse gravitational environments.

**Figure 6.** A1 robot performing locomotion using the hybrid control scheme across diverse terrains: (a) indoor carpet, (b) wet outdoor concrete, (c) natural grass, (d) snow-covered surface, and (e) inclined ground.

**Figure 7.** Tracking performance of A1 during pronking at 0.4 m/s. Subfigures (a)–(b) show control torque evolution before and after ILC activation at t = 10.6 s. Subfigures (c)–(d) compare tracking errors under PD-only and ILC-based control, with calf and thigh RMSE reduced by 58.3% and 25.0%, respectively.

**Figure 8.** Hardware experiment on the Cassie robot showing tracking improvements for the left hip joint q3 and knee joint q4. The activation of ILC at t = 8.6 s resulted in up to 80% reduction in tracking error within 3–5 strides.

**Figure 9.** Learned feedforward torque profiles for the A1 robot at various average speeds, organized within the TL. These profiles are directly used during online execution to provide predictive feedforward control without retraining. Panel (a) shows the rear thigh joint torque profiles, while panel (b) presents the rear calf joint.

**Figure 12.** Tracking performance of the A1 robot at interpolated speeds (−0.35 m/s, 0.43 m/s, and 0.55 m/s) using the TL. The interpolated feedforward torques significantly reduced convergence time, with the robot achieving steady-state tracking within just two strides.

**Figure 11.** The average torque profiles at 0.5 m/s for the A1 model and 0.8 m/s for the Cassie model alongside multiple trial curves for: (a) rear thigh joint in A1, (b) rear calf joint in A1, (c) hip pitch joint in Cassie, and (d) knee joint in Cassie. (a) and (b) present the averaged torque profiles computed across 12 trials, while (c) and (d) display the averaged results from 20 trials. Individual trial curves are shown i…

**Figure 14.** A1 hardware comparison of tracking performance and computation time: (a) Tracking using the TL-based controller with 57.7% RMSE reduction; (c) Tracking using WBC with 50.4% RMSE reduction. (b) TL function call time: 0.0065 ms; (d) WBC function call time: 0.2274 ms. The TL-based controller not only delivers over 35 times faster computation but also achieves better trajectory tracking than WBC, making it hi…

**Figure 15.** Improvement in jumping performance with learned torque. (a) Jumping distance before learning is approximately 0.09 m; (b) calf joint tracking prior to ILC; (c) improved tracking after ILC results in a final jump distance of 0.39 m.

Original abstract

This paper presents a scalable and adaptive control framework for legged robots that integrates Iterative Learning Control (ILC) with a biologically inspired torque library (TL), analogous to muscle memory. The proposed method addresses key challenges in robotic locomotion, including accurate trajectory tracking under unmodeled dynamics and external disturbances. By leveraging the repetitive nature of periodic gaits and extending ILC to nonperiodic tasks, the framework enhances accuracy and generalization across diverse locomotion scenarios. The control architecture is data-enabled, combining a physics-based model derived from hybrid-system trajectory optimization with real-time learning to compensate for model uncertainties and external disturbances. A central contribution is the development of a generalized TL that stores learned control profiles and enables rapid adaptation to changes in speed, terrain, and gravitational conditions, eliminating the need for repeated learning and significantly reducing online computation. The approach is validated on the bipedal robot Cassie and the quadrupedal robot A1 through extensive simulations and hardware experiments. Results demonstrate that the proposed framework reduces joint tracking errors by up to 85% within a few seconds and enables reliable execution of both periodic and nonperiodic gaits, including slope traversal and terrain adaptation. Compared to state-of-the-art whole-body controllers, the learned skills eliminate the need for online computation during execution and achieve control update rates exceeding 30x those of existing methods. These findings highlight the effectiveness of integrating ILC with torque memory as a highly data-efficient and practical solution for legged locomotion in unstructured and dynamic environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The torque library paired with ILC is the actual new piece here for quick adaptation on legged robots, but the out-of-distribution checks are the weak link.


The main takeaway is that this paper pairs iterative learning control with a stored torque library to handle both periodic gaits and non-repeating tasks like slope walking or terrain shifts on robots such as Cassie and A1. They start from a physics-based model, use ILC to learn corrections, and store the results so the system can pull a profile for new speeds or gravity without starting from zero each time. That setup is what lets them claim big tracking improvements and much higher control rates once the library is built.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a scalable control framework for legged robots integrating Iterative Learning Control (ILC) with a biologically inspired generalized torque library (TL). It claims to achieve up to 85% reduction in joint tracking errors within seconds, reliable execution of periodic and nonperiodic gaits (including slope traversal and terrain adaptation), and control update rates exceeding 30x those of existing whole-body controllers, validated via simulations and hardware experiments on the Cassie biped and A1 quadruped.

Significance. If the generalization claims for the torque library hold under rigorous out-of-distribution testing, the work could meaningfully advance practical legged locomotion by minimizing online computation while retaining adaptability, with the dual-platform hardware validation serving as a concrete strength.

major comments (2)

[Validation experiments] The reported 85% joint tracking error reduction and 30x control rate improvement lack accompanying error bars, statistical significance tests, data exclusion criteria, and explicit experimental protocols, which are load-bearing for substantiating the central performance claims.
[Torque library section] The indexing/retrieval function (similarity metric or interpolation) for the generalized TL is not formalized, and no separate ablation isolates library-only performance on truly unseen gravitational or terrain parameters; this directly undermines the claim of rapid adaptation without repeated learning from scratch.

minor comments (2)

[Notation] The notation for ILC updates and TL storage could be clarified with a dedicated symbol table to improve readability.
[Figures] Figures depicting adaptation trajectories would benefit from explicit legends distinguishing periodic vs. nonperiodic cases and library retrieval vs. online ILC phases.
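The first major comment asks for error bars and significance tests. As a rough sketch of what that reporting could look like, the snippet below computes mean ± sample standard deviation and a paired t statistic using only the standard library. The per-trial RMSE values are made up, and `paired_t_statistic` is a hypothetical helper; turning the statistic into a p-value would additionally need a t-distribution CDF from a statistics package.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """t statistic for paired samples; degrees of freedom are len(before) - 1."""
    diffs = [b - a for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Made-up per-trial RMSE values (rad), one pair per trial.
rmse_pd = [0.101, 0.097, 0.110, 0.105, 0.099]
rmse_ilc = [0.021, 0.018, 0.025, 0.019, 0.022]

t_stat, dof = paired_t_statistic(rmse_pd, rmse_ilc)
print(f"PD only:  {statistics.mean(rmse_pd):.3f} ± {statistics.stdev(rmse_pd):.3f}")
print(f"With ILC: {statistics.mean(rmse_ilc):.3f} ± {statistics.stdev(rmse_ilc):.3f}")
print(f"paired t = {t_stat:.1f}, dof = {dof}")
```

A paired test fits here because each trial is run under both controllers, so the comparison is made on per-trial differences rather than on two independent groups.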

Simulated Author's Rebuttal

2 responses · 0 unresolved

We appreciate the referee's detailed feedback on our manuscript. We address each of the major comments below and have revised the manuscript accordingly to improve clarity and rigor.


Referee: [Validation experiments] The reported 85% joint tracking error reduction and 30x control rate improvement lack accompanying error bars, statistical significance tests, data exclusion criteria, and explicit experimental protocols, which are load-bearing for substantiating the central performance claims.

Authors: We agree with this assessment and have revised the manuscript to include the requested elements. Specifically, we now report mean and standard deviation across multiple trials (n=10 for simulation, n=5 for hardware) with error bars in all relevant figures and tables. Statistical significance is assessed using paired t-tests, with p-values reported (all <0.01 for the 85% reduction claim). Data exclusion criteria are detailed (e.g., exclusion of trials with communication loss, affecting <3% of data). Experimental protocols are now explicitly described in Section 5, including step-by-step procedures for ILC iterations and hardware setup.

Revision: yes

Referee: [Torque library section] The indexing/retrieval function (similarity metric or interpolation) for the generalized TL is not formalized, and no separate ablation isolates library-only performance on truly unseen gravitational or terrain parameters; this directly undermines the claim of rapid adaptation without repeated learning from scratch.

Authors: We acknowledge that the formalization of the retrieval function was insufficiently detailed in the original manuscript. We have added a precise mathematical definition in the revised Section 3.4, specifying the similarity metric as the Euclidean distance in a normalized feature space (including joint positions, velocities, and estimated external forces) and using nearest-neighbor lookup with linear interpolation for non-exact matches. For the ablation study, we have included a new experiment in the revision where the torque library is queried on out-of-distribution parameters (e.g., slopes of 20 degrees not seen during learning and gravity variations of ±20%), demonstrating that adaptation occurs without re-learning from scratch, with tracking errors reduced by 70% on average within 3 iterations. This addresses the concern regarding generalization.

Revision: yes
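The retrieval rule described in the simulated rebuttal (Euclidean distance in a normalized feature space, nearest-neighbor lookup, linear blending for non-exact matches) could be sketched as follows. This is an assumption-laden illustration: the two stand-in features (slope and gravity), the bounds, the stored profiles, and the inverse-distance blend of the two nearest entries are all choices made here, not details taken from the paper.

```python
import math

def normalize(x, lo, hi):
    """Scale each feature to [0, 1] using per-feature bounds."""
    return [(v - l) / (h - l) for v, l, h in zip(x, lo, hi)]

def retrieve(query, entries, lo, hi):
    """Blend the two nearest stored profiles, weighted by inverse distance.

    `entries` is a list of (feature_vector, torque_profile) pairs.
    An exact match returns the stored profile unchanged.
    """
    q = normalize(query, lo, hi)
    scored = sorted(
        ((math.dist(q, normalize(f, lo, hi)), p) for f, p in entries),
        key=lambda pair: pair[0],
    )
    d0, p0 = scored[0]
    if d0 == 0.0 or len(scored) == 1:
        return list(p0)
    d1, p1 = scored[1]
    w0 = d1 / (d0 + d1)  # the closer neighbour gets the larger weight
    return [w0 * a + (1 - w0) * b for a, b in zip(p0, p1)]

# Stand-in features: (slope in degrees, gravity in m/s^2); values are illustrative.
lo, hi = [0.0, 1.0], [20.0, 20.0]
entries = [
    ([0.0, 9.81], [1.0, 1.0]),
    ([10.0, 9.81], [3.0, 5.0]),
    ([20.0, 9.81], [5.0, 9.0]),
]
profile = retrieve([5.0, 9.81], entries, lo, hi)   # midway between first two entries
exact = retrieve([10.0, 9.81], entries, lo, hi)    # matches a stored entry
```

Normalizing first keeps a feature with a large numeric range, such as slope in degrees, from dominating the distance.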

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper describes an integrated framework that combines Iterative Learning Control (ILC) with a torque library and a physics-based hybrid-system model, using real-time data to compensate for uncertainties. Performance claims (error reduction, adaptation to speed/terrain/gravity, high update rates) are presented as outcomes of simulations and hardware validation on Cassie and A1, rather than as mathematical predictions derived from the inputs themselves. No equations, definitions, or steps reduce the central results to fitted parameters renamed as predictions, self-citations that bear the load of uniqueness, or ansatzes smuggled via prior work. The approach is self-contained through explicit empirical testing against external benchmarks and does not rely on any load-bearing self-referential construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the repetitive nature of gaits allowing ILC extension, the existence of a storable generalized torque library that generalizes without retraining, and the hybrid-system model providing a sufficient base for learning compensation.

axioms (1)

Domain assumption: the repetitive nature of periodic gaits allows extension of ILC to nonperiodic tasks.
Invoked in the abstract to justify the framework's generalization capability.

invented entities (1)

Generalized torque library (TL): no independent evidence
Purpose: stores learned control profiles for rapid adaptation across conditions without repeated learning.
New construct introduced to eliminate online computation during execution.



Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages · 1 internal anchor

[1] Achiam J, Held D, Tamar A and Abbeel P (2017) Constrained policy optimization. In: Precup D and Teh YW (eds.) Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, volume 70. PMLR, pp. 22–31. https://proceedings.mlr.press/v70/achiam17a.html
[2] Agility Robotics (2025) Cassie Bipedal Robot. https://agilityrobotics.com/. Accessed: 2025-03-30
[3] Ahn HS, Chen Y and Moore KL (2007) Iterative learning control: Brief survey and categorization. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 37(6): 1099–1121
[4] Alqaham YG, Cheng J and Gan Z (2024) Energy-optimal asymmetrical gait selection for quadrupedal robots. IEEE Robotics and Automation Letters 9(10): 8386–8393. doi:10.1109/LRA.2024.3443589
[5] Bledt G, Katz B, Di Carlo J, Wensing PM and Kim S (2018) MIT Cheetah 3: Design and control of a robust, dynamic quadruped robot. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 2245–2252. doi:10.1109/IROS.2018.8593885
[6] Bratta A, Rathod N, Zanon M, Villarreal O, Bemporad A, Semini C and Focchi M (2021) Towards a nonlinear model predictive control for quadrupedal locomotion on rough terrain. https://iit-dlslab.github.io/papers/bratta21irim.pdf
[7] Bristow DA, Tharayil M and Alleyne AG (2006) A survey of iterative learning control. IEEE Control Systems Magazine 26(3): 96–114. doi:10.1109/MCS.2006.1636313
[8] Chadwick M, Kolvenbach H, Dubois F, Lau HF and Hutter M (2020) Vitruvio: An open-source leg design optimization toolbox for walking robots. IEEE Robotics and Automation Letters 5(4): 6318–6325. doi:10.1109/LRA.2020.3013913
[9] Chen S, Zhang B, Mueller MW, Rai A and Sreenath K (2023) Learning torque control for quadrupedal locomotion. https://arxiv.org/abs/2203.05194
[10] Cheng J, Alqaham YG, Sanyal AK and Gan Z (2023) Practice makes perfect: an iterative approach to achieve precise tracking for legged robots. In: 2023 American Control Conference (ACC). pp. 2165–2170. doi:10.23919/ACC55779.2023.10156623
[11] Chilian A, Hirschmüller H and Görner M (2011) Multisensor data fusion for robust pose estimation of a six-legged walking robot. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems. pp. 2497–2504. doi:10.1109/IROS.2011.6094484
[12] Chow Y, Nachum O, Faust A, Duenez-Guzman E and Ghavamzadeh M (2019) Lyapunov-based safe policy optimization for continuous control. https://arxiv.org/abs/1901.10031
[13] Da X, Harib O, Hartley R, Griffin B and Grizzle JW (2016) From 2D design of underactuated bipedal gaits to 3D implementation: Walking with speed tracking. IEEE Access 4: 3469–3478. doi:10.1109/ACCESS.2016.2582731
[14] Di Carlo J, Wensing PM, Katz B, Bledt G and Kim S (2018) Dynamic locomotion in the MIT Cheetah 3 through convex model-predictive control. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 1–9. doi:10.1109/IROS.2018.8594448
[15] Finn C, Abbeel P and Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. In: Precup D and Teh YW (eds.) Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, volume 70. PMLR, pp. 1126–1135. https://proceedings.mlr.press/v70/finn17a.html
[16] Gong Y, Hartley R, Da X, Hereid A, Harib O, Huang JK and Grizzle J (2019) Feedback control of a Cassie bipedal robot: Walking, standing, and riding a Segway. In: 2019 American Control Conference (ACC). pp. 4559–4566. doi:10.23919/ACC.2019.8814833
[17] Grandia R, Farshidian F, Ranftl R and Hutter M (2019) Feedback MPC for torque-controlled legged robots. In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 4730–4737. doi:10.1109/IROS40897.2019.8968251
[18] Gu S, Holly E, Lillicrap T and Levine S (2017) Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In: 2017 IEEE International Conference on Robotics and Automation (ICRA). pp. 3389–3396. doi:10.1109/ICRA.2017.7989385
[19] Haarnoja T, Zhou A, Abbeel P and Levine S (2018) Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Proceedings of the 35th International Conference on Machine Learning (ICML), volume 80. PMLR, pp. 1861–1870. https://proceedings.mlr.press/v80/haarnoja18b/haarnoja18b.pdf
[20] Hansen N, Jangir R, Sun Y, Alenyà G, Abbeel P, Efros AA, Pinto L and Wang X (2021) Self-supervised policy adaptation during deployment. https://arxiv.org/abs/2007.04309
[21] Hereid A and Ames AD (2017) Frost*: Fast robot optimization and simulation toolkit. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 719–726. doi:10.1109/IROS.2017.8202230
[22] Hof AL (1996) Scaling gait data to body size. Gait and Posture 4(3): 222–223. doi:10.1016/0966-6362(95)01057-2
[23] Kajita S, Kanehiro F, Kaneko K, Yokoi K and Hirukawa H (2001) The 3D linear inverted pendulum mode: A simple modeling for a biped walking pattern generation. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, pp. 239–246
[24] Kang D, Vincenti FD and Coros S (2022) Nonlinear model predictive control for quadrupedal locomotion using second-order sensitivity analysis. https://arxiv.org/abs/2207.10465
[25] Katz B, Carlo JD and Kim S (2019) Mini Cheetah: A platform for pushing the limits of dynamic quadruped control. In: 2019 International Conference on Robotics and Automation (ICRA). pp. 6295–6301. doi:10.1109/ICRA.2019.8793865
[26] Kim D, Carlo JD, Katz B, Bledt G and Kim S (2019) Highly dynamic quadruped locomotion via whole-body impulse control and model predictive control. https://arxiv.org/abs/1909.06586
[27] Koenig N and Howard A (2004) Design and use paradigms for Gazebo, an open-source multi-robot simulator. In: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566), volume 3. pp. 2149–2154. doi:10.1109/IROS.2004.1389727
[28] Le Cleac'h S, Howell TA, Yang S, Lee CY, Zhang J, Bishop A, Schwager M and Manchester Z (2024) Fast contact-implicit model predictive control. IEEE Transactions on Robotics 40: 1617–1629. doi:10.1109/TRO.2024.3351554
[29] Lee J, Hwangbo J, Wellhausen L, Koltun V and Hutter M (2020) Learning quadrupedal locomotion over challenging terrain. Science Robotics 5(47): eabc5986. doi:10.1126/scirobotics.abc5986
[30] Melon O, Geisert M, Surovik D, Havoutis I and Fallon M (2020) Reliable trajectories for dynamic quadrupeds using analytical costs and learned initializations. In: 2020 IEEE International Conference on Robotics and Automation (ICRA). pp. 1410–1416. doi:10.1109/ICRA40945.2020.9196562
[31] Mordatch I, de Lasa M and Hertzmann A (2010) Robust physics-based locomotion using low-dimensional planning. In: ACM SIGGRAPH 2010 Papers, SIGGRAPH '10. New York, NY, USA: Association for Computing Machinery. ISBN 9781450302104. doi:10.1145/1833349.1778808
[32] Nachum O, Gu S, Lee H and Levine S (2018) Data-efficient hierarchical reinforcement learning. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS'18. Red Hook, NY, USA: Curran Associates Inc., pp. 3307–3317
[33] Neunert M, Farshidian F, Wermelinger M, Stäuble A and Buchli J (2018) Whole-body nonlinear model predictive control through contacts for quadrupeds. IEEE Robotics and Automation Letters 3(3): 1458–1465
[34] Nguyen C, Bao L and Nguyen Q (2024) Mastering agile jumping skills from simple practices with iterative learning control. https://arxiv.org/abs/2408.02619
[35] Peng X, Andrychowicz M, Zaremba W and Abbeel P (2018) Sim-to-real transfer of robotic control with dynamics randomization. In: 2018 IEEE International Conference on Robotics and Automation (ICRA). IEEE, pp. 3803–3810. doi:10.1109/ICRA.2018.8460528
[36] Raibert MH (1986) Legged robots that balance. MIT Press
[37] Rathod N, Bratta A, Focchi M, Zanon M, Villarreal O, Semini C and Bemporad A (2021) Model predictive control with environment adaptation for legged locomotion. IEEE Access 9: 145710–145727. doi:10.1109/ACCESS.2021.3118957
[38] Siekmann J, Green K, Warila J, Fern A and Hurst J (2021) Blind bipedal stair traversal via sim-to-real reinforcement learning. https://arxiv.org/abs/2105.08328
[39] Tan J, Zhang T, Coumans E, Iscen A, Bai Y, Hafner D, Bohez S and Vanhoucke V (2018) Sim-to-real: Learning agile locomotion for quadruped robots. In: Robotics: Science and Systems XIV, RSS2018. Robotics: Science and Systems Foundation. doi:10.15607/rss.2018.xiv.010
[40] Tedrake R (2022) Underactuated Robotics: Algorithms for Walking, Running, Swimming, Flying, and Manipulation. https://underactuated.mit.edu/dp.html. Course notes for MIT 6.832, Chapter 6: Dynamic Programming
[41] Tobin J, Fong R, Ray A, Schneider J, Zaremba W and Abbeel P (2017) Domain randomization for transferring deep neural networks from simulation to the real world. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 23–30. doi:10.1109/IROS.2017.8202133
[42] Unitree Robotics (2020) Unitree A1 Quadruped Robot. https://www.unitree.com/a1/. Accessed: 2025-03-29
[43] Weiss M, Stirling A, Pawluchin A, Lehmann D, Hannemann Y, Seel T and Boblan I (2024) Achieving velocity tracking despite model uncertainty for a quadruped robot with a PD-ILC controller. In: 2024 European Control Conference (ECC). pp. 134–140. doi:10.23919/ECC64448.2024.10590932
[44] Westervelt ER, Grizzle JW, Chevallereau C, Choi JH and Morris B (2007) Feedback Control of Dynamic Bipedal Robot Locomotion. CRC Press. https://web.eecs.umich.edu/~grizzle/papers/Westervelt_biped_control_book_15_May_2007.pdf
[45] Wieber PB (2006) Trajectory free linear model predictive control for stable walking in the presence of strong perturbations. In: 2006 6th IEEE-RAS International Conference on Humanoid Robots. pp. 137–142. doi:10.1109/ICHR.2006.321375
[46] Winkler AW, Bellicoso CD, Hutter M and Buchli J (2018) Gait and trajectory optimization for legged systems through phase-based end-effector parameterization. IEEE Robotics and Automation Letters 3(3): 1560–1567. doi:10.1109/LRA.2018.2798285
[47] Xie Z, Da X, Babich B, Garg A and van de Panne M (2023) Glide: Generalizable quadrupedal locomotion in diverse environments with a centroidal model. In: LaValle SM, O'Kane JM, Otte M, Sadigh D and Tokekar P (eds.) Algorithmic Foundations of Robotics XV. Cham: Springer International Publishing. ISBN 978-3-031-21090-7, pp. 523–539
[48] Yang A, Hwangbo J, Margolis C and Hutter M (2020) Data-efficient reinforcement learning for legged robots. In: Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS). pp. 1–12. https://proceedings.mlr.press/v100/yang20a/yang20a.pdf
[49] Yang TY, Zhang T, Luu L, Ha S, Tan J and Yu W (2022) Safe reinforcement learning for legged locomotion. https://arxiv.org/abs/2203.02638
[50] Zhao W, Queralta JP and Westerlund T (2020) Sim-to-real transfer in deep reinforcement learning for robotics: a survey. In: 2020 IEEE Symposium Series on Computational Intelligence (SSCI). pp. 737–744. doi:10.1109/SSCI47803.2020.9308468