
arxiv: 2507.16481 · v3 · pith:EZMQJG74new · submitted 2025-07-22 · 💻 cs.RO · cs.SY · eess.SY

Guided Reinforcement Learning for Omnidirectional 3D Jumping in Quadruped Robots

Pith reviewed 2026-05-22 00:10 UTC · model grok-4.3

classification 💻 cs.RO · cs.SY · eess.SY
keywords quadruped robots · reinforcement learning · jumping control · Bézier curves · omnidirectional motion · guided RL · physical models · 3D locomotion

The pith

Guided reinforcement learning combines Bézier curves with a uniformly accelerated rectilinear motion (UARM) model for efficient omnidirectional 3D jumping in quadruped robots.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a method that guides reinforcement learning for quadruped robots to perform jumps in any direction in three dimensions by incorporating physical models. It uses Bézier curves to shape the jumping trajectories and a uniformly accelerated rectilinear motion model to describe the dynamics. This guidance aims to reduce the number of training samples needed and make the learned behaviors more predictable and safer than those from standard reinforcement learning approaches. Traditional methods either require detailed knowledge of the robot and environment or suffer from high sample complexity and lack of explainability. A sympathetic reader would care because successful jumping is key for robots operating in challenging terrains, and this could lead to more reliable real-world performance.

Core claim

By combining Bézier curves with a Uniformly Accelerated Rectilinear Motion (UARM) model to guide the reinforcement learning process, the approach achieves more efficient training and more predictable jumping motions for quadruped robots, as shown through simulations and real experiments that outperform existing methods.

What carries the argument

The guided reinforcement learning framework that integrates Bézier curve trajectory planning with the UARM motion model to inject physical intuition into the learning process.
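
As a rough illustration of the two ingredients named here, the following Python sketch evaluates a Bézier curve and propagates a point under constant acceleration, which is what a uniformly accelerated rectilinear motion model amounts to. It is not the paper's implementation: the function names `bezier_point` and `uarm_state`, the control points, and all numeric values are hypothetical, and this summary does not describe how the paper couples the two models to the RL policy.

```python
from math import comb


def bezier_point(control_points, s):
    """Evaluate a Bezier curve at parameter s in [0, 1] (Bernstein form)."""
    n = len(control_points) - 1
    dim = len(control_points[0])
    point = [0.0] * dim
    for i, cp in enumerate(control_points):
        # Bernstein basis polynomial B_{i,n}(s)
        weight = comb(n, i) * (s ** i) * ((1.0 - s) ** (n - i))
        for d in range(dim):
            point[d] += weight * cp[d]
    return point


def uarm_state(p0, v0, a, t):
    """Position and velocity after time t under constant acceleration a."""
    pos = [p + v * t + 0.5 * acc * t * t for p, v, acc in zip(p0, v0, a)]
    vel = [v + acc * t for v, acc in zip(v0, a)]
    return pos, vel


# Hypothetical trunk path before lift-off: four 3D control points (metres).
# Using the Bezier curve for this phase is an assumption made for illustration.
control_points = [
    (0.0, 0.0, 0.25),
    (0.0, 0.0, 0.20),
    (0.1, 0.0, 0.30),
    (0.2, 0.0, 0.40),
]
liftoff_pos = bezier_point(control_points, 1.0)

# Hypothetical lift-off velocity (m/s); gravity is the only acceleration here.
flight_pos, flight_vel = uarm_state(
    liftoff_pos, (1.0, 0.0, 2.0), (0.0, 0.0, -9.81), 0.2
)
```

Both functions work on plain tuples or lists of any dimension, so the same sketch covers planar and 3D cases.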

If this is right

  • Lower sample complexity for training jumping policies compared to end-to-end RL.
  • Greater predictability in the final jumping motions, aiding safety certification.
  • Superior performance in both simulation and hardware experiments over alternative approaches.
  • Reduced need for extensive robot and terrain parameter knowledge in controller design.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This guidance technique could be adapted to other agile locomotion tasks such as running or vaulting.
  • Similar physical model integration might help bridge the gap between simulation and real-world robot deployment.
  • The method opens possibilities for certifying safety in dynamic robot behaviors more systematically.

Load-bearing premise

The physical models of Bézier curves and uniformly accelerated motion provide accurate enough guidance to improve RL without adding harmful biases or requiring detailed parameter knowledge.

What would settle it

A direct comparison in which the guided approach requires as many or more training episodes than standard RL, or yields jumping motions that are no easier to predict or certify, would falsify the central claim.
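
The sample-efficiency half of that test can be made concrete with a small Python sketch. It assumes each method is summarized by a success-rate curve with one value per training episode (or per evaluation point), and that "more efficient" means reaching a chosen success threshold strictly earlier. The function names, the threshold, and the curve format are hypothetical and are not taken from the paper.

```python
def episodes_to_threshold(success_rates, threshold):
    """Return the 1-based index of the first entry at or above threshold.

    Returns None if the curve never reaches the threshold.
    """
    for index, rate in enumerate(success_rates, start=1):
        if rate >= threshold:
            return index
    return None


def guided_is_more_sample_efficient(guided_curve, baseline_curve, threshold):
    """True only if the guided curve reaches the threshold strictly earlier.

    A tie, or a guided curve that never reaches the threshold, counts
    against the guided approach, mirroring the falsification condition.
    """
    guided = episodes_to_threshold(guided_curve, threshold)
    baseline = episodes_to_threshold(baseline_curve, threshold)
    if guided is None:
        return False
    if baseline is None:
        return True
    return guided < baseline
```

A real comparison would also need repeated seeds and a variance estimate; this sketch only encodes the pass/fail criterion.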

Original abstract

Jumping poses a significant challenge for quadruped robots, despite being crucial for many operational scenarios. While optimisation methods exist for controlling such motions, they are often time-consuming and demand extensive knowledge of robot and terrain parameters, making them less robust in real-world scenarios. Reinforcement learning (RL) is emerging as a viable alternative, yet conventional end-to-end approaches lack efficiency in terms of sample complexity, requiring extensive training in simulations, and predictability of the final motion, which makes it difficult to certify the safety of the final motion. To overcome these limitations, this paper introduces a novel guided reinforcement learning approach that leverages physical intuition for efficient and explainable jumping, by combining Bézier curves with a Uniformly Accelerated Rectilinear Motion (UARM) model. Extensive simulation and experimental results clearly demonstrate the advantages of our approach over existing alternatives.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a guided reinforcement learning framework for omnidirectional 3D jumping in quadruped robots. It combines Bézier curves for trajectory planning with a Uniformly Accelerated Rectilinear Motion (UARM) model to inject physical intuition, with the goal of achieving lower sample complexity and more predictable, explainable motions than end-to-end RL or parameter-heavy optimization methods. The abstract states that extensive simulation and experimental results demonstrate clear advantages over existing alternatives.

Significance. If the quantitative results and ablation studies hold, the work could offer a practical middle ground between model-based control and pure learning for dynamic locomotion, potentially improving training efficiency and safety certification for jumping behaviors in real-world quadruped deployments.

major comments (2)
  1. [Method / UARM model definition] The central claim that the Bézier + UARM guidance supplies accurate, low-bias priors that reduce sample complexity without extensive robot/terrain parameter knowledge rests on the fidelity of the UARM model. The manuscript should explicitly compare the UARM-predicted trajectories against the actual stance-to-flight transitions and gravity-dominated parabolic arcs observed in the robot's dynamics (e.g., in the results or dynamics section); without such validation, the guidance risks introducing systematic bias rather than improving predictability.
  2. [Abstract and Results] Abstract claims 'extensive simulation and experimental results clearly demonstrate the advantages' yet the provided description contains no quantitative metrics, baseline comparisons (e.g., sample efficiency curves, success rates, or energy metrics), or error analysis. The results section must include these to substantiate the efficiency and explainability claims; otherwise the central advantage over end-to-end RL remains unverified.
minor comments (2)
  1. [Method] Clarify how the Bézier curve parameters are chosen or adapted online versus fixed from the UARM model, and whether any additional robot-specific parameters are still required.
  2. [Figures] Ensure all figures showing trajectories or learned policies include direct overlays of the UARM reference and measured robot motion for visual assessment of guidance fidelity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which helps clarify the presentation of our guided RL framework. We address each major comment below and commit to revisions that strengthen the validation and quantitative support without altering the core contributions.

Point-by-point responses
  1. Referee: [Method / UARM model definition] The central claim that the Bézier + UARM guidance supplies accurate, low-bias priors that reduce sample complexity without extensive robot/terrain parameter knowledge rests on the fidelity of the UARM model. The manuscript should explicitly compare the UARM-predicted trajectories against the actual stance-to-flight transitions and gravity-dominated parabolic arcs observed in the robot's dynamics (e.g., in the results or dynamics section); without such validation, the guidance risks introducing systematic bias rather than improving predictability.

    Authors: We agree that explicit validation of the UARM approximation is necessary to substantiate the low-bias claim. The UARM model is specifically chosen to capture the dominant vertical acceleration under gravity during flight, while Bézier curves handle the horizontal and transition phases. In the revised manuscript we will add a dedicated comparison subsection (in Results) that overlays UARM-predicted vertical and horizontal trajectories against both simulation data and hardware recordings of stance-to-flight transitions. This will include quantitative error metrics (e.g., RMSE) to demonstrate fidelity and any residual bias. revision: yes

  2. Referee: [Abstract and Results] Abstract claims 'extensive simulation and experimental results clearly demonstrate the advantages' yet the provided description contains no quantitative metrics, baseline comparisons (e.g., sample efficiency curves, success rates, or energy metrics), or error analysis. The results section must include these to substantiate the efficiency and explainability claims; otherwise the central advantage over end-to-end RL remains unverified.

    Authors: The full results section already reports quantitative metrics, including sample-efficiency curves, success rates across omnidirectional jumps, and energy comparisons versus end-to-end RL and optimization baselines, together with ablation studies on the guidance components. To address the concern directly, we will (i) revise the abstract to include one or two key quantitative highlights and (ii) expand the results section with additional error analysis and clearer baseline tables if any gaps exist in the current presentation. revision: partial
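
The RMSE comparison proposed in the first response could be computed along the following lines. This Python sketch assumes the flight phase is approximated by constant gravitational acceleration and that measured heights are sampled at known times; `ballistic_height`, `rmse`, and the value of `G` are illustrative choices, not names or parameters from the paper.

```python
from math import sqrt

G = 9.81  # m/s^2, assumed gravitational acceleration


def ballistic_height(z0, vz0, t, g=G):
    """Height at time t for constant downward acceleration g."""
    return z0 + vz0 * t - 0.5 * g * t * t


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(measured) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return sqrt(sum(squared) / len(squared))


def flight_phase_rmse(times, measured_heights, z0, vz0):
    """RMSE of the constant-acceleration prediction against measurements."""
    predicted = [ballistic_height(z0, vz0, t) for t in times]
    return rmse(predicted, measured_heights)
```

Here `times` and `measured_heights` would come from simulation logs or hardware recordings, and `z0` and `vz0` are the height and vertical velocity at lift-off.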

Circularity Check

0 steps flagged

Guided RL framework relies on independent physical models with no circular reduction

full rationale

The paper presents a guided reinforcement learning method that combines Bézier curves for trajectory planning with a Uniformly Accelerated Rectilinear Motion (UARM) model to supply physical intuition, thereby reducing sample complexity and improving explainability compared to end-to-end RL. No derivation step in the abstract or described approach reduces a claimed prediction or result to a quantity defined by the paper's own fitted parameters, self-citations, or ansatz smuggled in via prior work. The physical models are invoked as external guidance inputs applied to the RL process rather than being derived from or equivalent to the learned policy outputs. The central claims rest on simulation and experimental validation against alternatives, rendering the framework self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard domain assumptions in robotics and RL; no explicit free parameters, new entities, or ad-hoc axioms are detailed in the abstract beyond the core guidance premise.

axioms (1)
  • domain assumption: Physical models such as Bézier curves and UARM can effectively guide RL to achieve lower sample complexity and higher explainability for jumping motions.
    Invoked as the foundation for the novel guided approach in the abstract.

pith-pipeline@v0.9.0 · 5687 in / 1258 out tokens · 65840 ms · 2026-05-22T00:10:05.096695+00:00 · methodology



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. LineRides: Line-Guided Reinforcement Learning for Bicycle Robot Stunts

    cs.RO 2026-05 unverdicted novelty 6.0

    LineRides enables a bicycle robot to learn five commandable stunts from spatial guidelines and key orientations via RL without demonstrations or timing.

  2. LineRides: Line-Guided Reinforcement Learning for Bicycle Robot Stunts

    cs.RO 2026-05 unverdicted novelty 6.0

    LineRides enables commandable bicycle robot stunts via line-guided RL that uses spatial guidelines, a tracking margin for feasibility, distance-based progress, and sparse key-orientations.
