SRL: Combining SLIP Model and Reinforcement Learning for Agile Robotic Jumping

Chenyue Shao; Linqi Ye; Qingdu Li; Rankun Li; Xiaowen Hu; Yan Peng; Yudi Zhu

arxiv: 2606.18625 · v1 · pith:WNWEG7RJnew · submitted 2026-06-17 · 💻 cs.RO

Xiaowen Hu, Linqi Ye, Yudi Zhu, Chenyue Shao, Rankun Li, Qingdu Li, Yan Peng

Pith reviewed 2026-06-26 21:13 UTC · model grok-4.3

classification 💻 cs.RO

keywords robotic jumping · SLIP model · reinforcement learning · hybrid control · bipedal locomotion · quadrupedal locomotion · sim-to-real transfer

The pith

A hybrid controller fuses the SLIP spring-mass model with reinforcement learning to produce stable robotic jumps on irregular terrain after far less training than pure RL.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Spring-loaded Reinforcement Learning (SRL) that supplies a feedforward trajectory from the SLIP model to guide an RL policy whose role is real-time correction. Because the SLIP component encodes biologically plausible hopping dynamics, the RL agent explores a narrower space and converges faster while still adapting when contact or joint assumptions break on stairs or uneven ground. Simulations on bipedal and quadrupedal platforms, plus sim-to-sim and sim-to-real transfers, show average position error below 0.1 m and velocity error within plus or minus 3 percent of target values. A reader cares because the approach suggests that embedding simplified physics inside learning loops can cut the data hunger that currently limits agile locomotion in search-and-rescue or logistics robots.

Core claim

SRL integrates SLIP-based feedforward control signals with RL-driven real-time feedback, enabling continuous optimization of robotic jumping that yields more stable performance with substantially reduced training time relative to baseline RL methods.

What carries the argument

The SRL hybrid adds SLIP-derived feedforward signals to an RL policy, so the policy learns only the residual corrections needed on real terrain.
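The feedforward-plus-residual composition can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the two signals are summed per joint and clipped to an actuator limit. The names `slip_feedforward`, `residual_policy`, and `TORQUE_LIMIT`, the sinusoidal reference, and all numeric values are hypothetical placeholders. Whether the summed quantity is a torque or a joint reference is not specified in the text above.

```python
import math

TORQUE_LIMIT = 40.0  # hypothetical actuator limit, chosen only for illustration


def slip_feedforward(t, n_joints):
    """Placeholder feedforward: a periodic signal standing in for a SLIP-derived reference."""
    return [20.0 * math.sin(2.0 * math.pi * 1.5 * t) for _ in range(n_joints)]


def residual_policy(state):
    """Placeholder for the learned policy: a small linear correction on the state."""
    return [-2.0 * s for s in state]


def srl_command(t, state):
    """Sum feedforward and residual per joint, then clip to the actuator limit."""
    ff = slip_feedforward(t, n_joints=len(state))
    fb = residual_policy(state)
    return [max(-TORQUE_LIMIT, min(TORQUE_LIMIT, a + b)) for a, b in zip(ff, fb)]
```

In this arrangement the policy output only has to cover the gap between the template's prediction and what the robot needs, which is one plausible reading of why exploration is narrower than in unguided RL.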

If this is right

SRL produces stable jumps on stairs and uneven ground where pure SLIP fails and pure RL trains slowly.
Position and velocity tracking remain within the stated error bounds across bipedal and quadrupedal morphologies.
Sim-to-real transfer succeeds without additional retraining beyond the reported protocol.
The same hybrid pattern could shorten training for other periodic locomotion tasks once a suitable template model exists.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method may extend to running or bounding gaits if a suitable template model replaces SLIP.
Hardware implementations could test whether the feedforward term still helps when actuator delays or sensor noise exceed simulation levels.
If the SLIP template is replaced by a different low-dimensional model, the same training-time reduction might appear in other domains such as manipulation.

Load-bearing premise

The idealized SLIP contact and joint assumptions stay close enough to real robot dynamics that the feedforward signal does not systematically mislead the RL policy on irregular terrain.

What would settle it

Real-robot trials on highly irregular surfaces that produce either position tracking error above 0.1 m or training times comparable to unguided RL would falsify the performance claim.
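A check against the stated bounds could look like the sketch below. It assumes the position bound applies to the mean absolute error and the velocity bound to the worst-case relative error over a trajectory; the aggregation used in the paper is not given here, so both choices, along with the function names and default tolerances, are assumptions.

```python
def mean_abs_position_error(target, actual):
    """Mean absolute position error in metres over paired samples."""
    return sum(abs(a - t) for t, a in zip(target, actual)) / len(target)


def max_relative_velocity_error(target, actual):
    """Worst-case |actual - target| / |target|; target velocities must be nonzero."""
    return max(abs(a - t) / abs(t) for t, a in zip(target, actual))


def within_reported_bounds(pos_t, pos_a, vel_t, vel_a, pos_tol=0.1, vel_tol=0.03):
    """True if both tracking metrics stay inside the assumed tolerances."""
    pos_ok = mean_abs_position_error(pos_t, pos_a) < pos_tol
    vel_ok = max_relative_velocity_error(vel_t, vel_a) <= vel_tol
    return pos_ok and vel_ok
```

Logged target and measured trajectories from a hardware trial would be passed in as equal-length sequences; a `False` result on irregular terrain is the kind of outcome that would count against the performance claim.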

Figures

Figures reproduced from arXiv: 2606.18625 by Chenyue Shao, Linqi Ye, Qingdu Li, Rankun Li, Xiaowen Hu, Yan Peng, Yudi Zhu.

**Figure 1.** Overview of the SRL control framework for robot jumping tasks, integrating the SLIP model, RL, and simulation environment. SRL combines the physically grounded motion dynamics of the SLIP model with the adaptability of RL to optimize jumping performance in complex environments such as flat ground with varying disturbances, stairs, and boxes.

**Figure 2.** A six-state FSM constructed based on the SLIP model, where each state represents a key stage in the jump cycle.

**Figure 3.** Left: Reward evolution for the biped robot's random-distance jump; Right: Learning efficiency comparison of RL-only and SRL in fixed-distance jumping.

**Figure 4.** Performance of the biped robot during fixed-distance jumping, showing trunk velocity, absolute tracking errors of the trunk position, and relative tracking errors of the ankle.

**Figure 5.** Performance of the quadruped robot during fixed-distance and random-distance jumping tasks, showing trunk velocity, absolute tracking errors of the trunk position, and relative tracking errors of the foot.

**Figure 6.** Fixed-distance jumping simulations of the biped and quadruped robots. Orange and green denote the target and actual states, respectively.

**Figure 7.** Simulation of the biped (left) and quadruped (right) robot in random-distance jumping tasks, illustrating the motion trajectories of the body and ankle/foot.

**Figure 8.** Performance of the quadruped robot during the box-jumping experiment. The left shows absolute body position errors and relative foot errors, while the right shows the height of the body and feet.

**Figure 9.** Simulation of the quadruped robot during the box-jumping experiment.

**Figure 10.** The X02-lite humanoid robot (Shanghai Droid Robotics) used in the experiment, featuring a height of 168 cm and a total mass of 28 kg. (a) Real robot. (b) MuJoCo model. (c) Simplified link model (focusing on the 10 DOF legs).

**Figure 11.** Performance comparison of the training policy in simulation (Isaac Gym and MuJoCo) and real-world experiments, demonstrating successful transfer. The robot achieved a maximum jump height of 15 cm, distance of 20 cm, and peak velocity of 2 m/s.

**Figure 12.** Average body height comparison over one gait cycle: reference vs. simulation vs. real-world.

Original abstract

Robotic jumping is pivotal in applications such as search and rescue and logistics, where crossing obstacles and enhancing mobility efficiency are critical. The Spring-Loaded Inverted Pendulum (SLIP) model leverages simplified spring-mass dynamics that naturally encode biologically plausible hopping motions, yet its performance degrades on irregular terrain due to idealized assumptions regarding contact and joint dynamics. Meanwhile, Reinforcement Learning (RL) can adapt to diverse and complex environments but often requires extensive data from unguided exploration. The complementary strengths of SLIP's physically grounded baseline and RL's adaptive capabilities motivate a hybrid framework that overcomes these individual limitations. We therefore propose Spring-loaded Reinforcement Learning (SRL), which integrates SLIP-based feedforward control signals with RL-driven real-time feedback, enabling continuous optimization of robotic jumping. Experimental results demonstrate that SRL can achieve more stable jumps with much less training time than the baseline method, maintaining an average position tracking error below 0.1 m and velocity tracking errors within +/-3% of the target values. Through bipedal and quadrupedal simulations of ground and stair jumping, as well as sim-to-sim and sim-to-real validations, SRL exhibits robust adaptability to various task requirements and environmental complexities, underscoring its potential for real-world deployment.
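The spring-mass dynamics mentioned in the abstract can be illustrated with a one-dimensional vertical hopper. The sketch below is a simplified stand-in, not the paper's model: it ignores leg angle, forward motion, damping, and energy injection, and the function name, mass, stiffness, rest length, and drop height are hypothetical values chosen for illustration. During flight only gravity acts; during stance a linear spring pushes back in proportion to leg compression.

```python
def simulate_vertical_slip(y0=1.2, m=10.0, k=5000.0, rest_len=1.0,
                           g=9.81, dt=1e-4, t_end=2.0):
    """Integrate a passive vertical spring-mass hopper and return the mass heights.

    Stance is active whenever the mass is below the leg rest length, so the
    spring force k * (rest_len - y) is zero in flight and positive in stance.
    """
    y, v = y0, 0.0
    heights = []
    for _ in range(int(t_end / dt)):
        compression = max(0.0, rest_len - y)  # zero while airborne
        a = -g + k * compression / m          # gravity plus spring acceleration
        v += a * dt                           # semi-implicit Euler: velocity first
        y += v * dt                           # then position with the new velocity
        heights.append(y)
    return heights
```

Because this hopper is lossless, it rebounds to roughly its drop height on every cycle. On flat ground that periodicity is what makes the template a useful nominal motion; the contact and joint idealizations it relies on are the ones the abstract says break down on irregular terrain.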

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

SRL shows a workable SLIP-plus-RL hybrid for jumping that cuts training time in the reported sims and transfers to hardware, but the advantage rests on SLIP staying close enough to reality.

The paper's main contribution is a concrete hybrid: SLIP supplies a feedforward signal for the nominal hopping motion while RL learns corrections in real time. They run bipedal and quadrupedal simulations on flat ground and stairs, report position tracking under 0.1 m and velocity errors inside ±3 %, and show shorter training than a pure-RL baseline plus some sim-to-sim and sim-to-real checks.

That setup is useful. The quantitative numbers and the hardware transfer give practitioners a template they can try without starting from scratch. The fact that they test two morphologies and two terrain types adds a bit of breadth.

The soft spot is exactly the one the stress-test flags. The abstract itself says SLIP degrades when contact and joint assumptions break, yet the performance claims depend on the feedforward remaining helpful rather than misleading. No results appear for highly irregular surfaces, so it is not clear whether the hybrid still wins when the model mismatch grows. Without ablations or error bars in the provided text it is also hard to judge how much of the reported gain comes from the SLIP term versus reward shaping or network details.

This is for groups already working on legged locomotion who need a quick way to blend model-based priors with learning. The empirical record is specific enough that a referee could check the claims directly.

I would send it to review; the experiments are grounded enough to be worth the time even if the advance is incremental.

Referee Report

3 major / 2 minor

Summary. The paper proposes SRL, a hybrid controller that augments RL policies with feedforward signals derived from the SLIP model for bipedal and quadrupedal jumping tasks. It claims that the combination yields more stable jumps, substantially shorter training times than pure RL baselines, position tracking error below 0.1 m, and velocity tracking errors within ±3 % of targets, supported by ground/stair simulations plus sim-to-sim and sim-to-real transfer.

Significance. If the performance gap is reproducible and the SLIP feedforward remains beneficial rather than harmful under terrain mismatch, the work would provide concrete evidence that model-based priors can reduce sample complexity in agile locomotion without sacrificing adaptability. The sim-to-real results would be a useful data point for hybrid control in robotics.

major comments (3)

[§4, §5] §4 (Method) and §5 (Experiments): the central claim that SRL reduces training time while improving tracking rests on the assumption that SLIP-derived feedforward remains a net-positive signal on irregular terrain. The manuscript notes SLIP degradation on irregular surfaces yet provides no quantitative ablation measuring how large a mismatch between SLIP contact/joint assumptions and robot dynamics can be tolerated before the hybrid policy underperforms the pure-RL baseline.
[Table 2, Figure 7] Table 2 and Figure 7: the reported position error <0.1 m and velocity error ±3 % are given without error bars, number of random seeds, or statistical tests against the baseline. It is therefore impossible to determine whether the observed gap is statistically reliable or sensitive to hyper-parameter choices.
[§5.3] §5.3 (Sim-to-real): the sim-to-real validation uses only a single terrain type and a limited set of initial conditions. No systematic stress test (e.g., added sensor noise, mass variation, or stair height outside the training distribution) is reported to probe whether the SLIP prior introduces persistent bias when dynamics deviate.
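The kind of systematic stress test requested in the third comment could be organized as in the sketch below. It only samples perturbation settings and applies observation noise; running a simulator with them is left out. The ranges, the dictionary keys, and the function names are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical perturbation ranges, chosen only for illustration.
MASS_SCALE_RANGE = (0.8, 1.2)       # multiplier on nominal body mass
NOISE_STD_RANGE = (0.0, 0.05)       # std of additive observation noise
STAIR_HEIGHT_RANGE = (0.05, 0.25)   # metres, meant to extend past training values


def sample_stress_case(rng):
    """Draw one perturbation setting from the assumed ranges."""
    return {
        "mass_scale": rng.uniform(*MASS_SCALE_RANGE),
        "noise_std": rng.uniform(*NOISE_STD_RANGE),
        "stair_height": rng.uniform(*STAIR_HEIGHT_RANGE),
    }


def add_sensor_noise(observation, noise_std, rng):
    """Add zero-mean Gaussian noise to each observation entry."""
    return [x + rng.gauss(0.0, noise_std) for x in observation]


def build_stress_suite(n_cases, seed=0):
    """Build a reproducible list of perturbation settings."""
    rng = random.Random(seed)
    return [sample_stress_case(rng) for _ in range(n_cases)]
```

Evaluating SRL and the pure-RL baseline on the same seeded suite would keep the comparison paired, which also helps with the statistical concerns raised in the second comment.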

minor comments (2)

[Abstract, §4] The abstract states quantitative results but the methods section does not specify the exact RL algorithm, network architecture, or reward weights used for the baseline comparison.
[§3] Notation for the SLIP feedforward torque and the RL policy output is introduced without an explicit equation linking the two signals (e.g., total torque = τ_SLIP + π_RL(s)).

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the constructive feedback. We address each major comment below and indicate the changes we will make to strengthen the manuscript.

Referee: [§4, §5] §4 (Method) and §5 (Experiments): the central claim that SRL reduces training time while improving tracking rests on the assumption that SLIP-derived feedforward remains a net-positive signal on irregular terrain. The manuscript notes SLIP degradation on irregular surfaces yet provides no quantitative ablation measuring how large a mismatch between SLIP contact/joint assumptions and robot dynamics can be tolerated before the hybrid policy underperforms the pure-RL baseline.

Authors: We agree that a quantitative ablation on mismatch tolerance would strengthen the central claim. The revised manuscript will add an ablation study that varies terrain irregularity and reports the SRL vs. pure-RL performance gap as a function of mismatch severity, identifying the point at which the SLIP prior ceases to be net-positive.

Revision: yes

Referee: [Table 2, Figure 7] Table 2 and Figure 7: the reported position error <0.1 m and velocity error ±3 % are given without error bars, number of random seeds, or statistical tests against the baseline. It is therefore impossible to determine whether the observed gap is statistically reliable or sensitive to hyper-parameter choices.

Authors: We will rerun the experiments with at least five random seeds, add error bars to Table 2 and Figure 7, and include statistical tests (e.g., paired t-tests) against the baseline in the revised version.

Revision: yes

Referee: [§5.3] §5.3 (Sim-to-real): the sim-to-real validation uses only a single terrain type and a limited set of initial conditions. No systematic stress test (e.g., added sensor noise, mass variation, or stair height outside the training distribution) is reported to probe whether the SLIP prior introduces persistent bias when dynamics deviate.

Authors: The existing sim-to-real results were intended as an initial proof of concept on representative hardware. In revision we will expand the section with additional simulation-based stress tests that include sensor noise, mass variation, and out-of-distribution stair heights to assess potential bias from the SLIP prior.

Revision: partial
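The multi-seed comparison promised in the second response can be sketched with the standard library alone. The helper names below are hypothetical, and the sketch assumes each seed produces one scalar metric per method so that the SRL and baseline runs pair up by seed. It computes the mean and sample standard deviation for error bars and the paired t statistic; turning the statistic into a p-value requires the t distribution with n - 1 degrees of freedom, which is not included here.

```python
import math
import statistics


def summarize(values):
    """Return (mean, sample standard deviation) for a list of per-seed metrics."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t_statistic(method_a, method_b):
    """Paired t statistic: mean difference divided by its standard error.

    Inputs are per-seed metrics in matching seed order; at least two seeds
    with non-identical differences are required.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error
```

With SciPy available, `scipy.stats.ttest_rel` returns the same statistic together with a p-value.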

Circularity Check

0 steps flagged

No circularity: empirical hybrid method with independent validation

The paper proposes SRL as an integration of the standard SLIP model (external, not derived here) for feedforward with RL for feedback, then reports simulation and real-robot experimental outcomes on tracking error and training time. No equations, parameters, or uniqueness claims are presented that reduce by construction to fitted inputs, self-definitions, or self-citation chains; the performance claims rest on direct empirical comparison to a pure-RL baseline rather than any internal derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities; the hybrid rests on the unstated premise that SLIP dynamics remain useful as a feedforward prior.




work page doi:10.1007/978-3-031-72062-8_15 2024

[12] [12]

Kabir, A

M. Kabir, A. Anand, P. Sundaravadivel, Hop-bot: a bio-inspired approach to locomotion and stability in modular robotics, in: 2024 IEEE International Conference on Electro Information Technology (EIT), IEEE, 2024, pp. 285–290.doi:10.1109/eIT60633.2024.10609916

work page doi:10.1109/eit60633.2024.10609916 2024

[13] [13]

J. Hong, C. Yeo, S. Bae, J. Hong, S. Oh, Slip embodied robust quadruped robot control, in: 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, 2024, pp. 14219–14224.doi:10.1109/IROS58592.2024.10802545

work page doi:10.1109/iros58592.2024.10802545 2024

[14] [14]

doi:10.1177/0278364914552112

G.Piovan,K.Byl,Reachability-basedcontrolfortheactiveslipmodel,TheInternationalJournalofRoboticsResearch34(3)(2015)270–287. doi:10.1177/0278364914552112

work page doi:10.1177/0278364914552112 2015

[15] [15]

Piovan, K

G. Piovan, K. Byl, Approximation and control of the slip model dynamics via partial feedback linearization and two-element leg actuation strategy, IEEE Transactions on Robotics 32 (2) (2016) 399–412.doi:10.1109/TRO.2016.2529649

work page doi:10.1109/tro.2016.2529649 2016

[16] [16]

Hamzaçebi, I

H. Hamzaçebi, I. Uyanik, Ö. Morgül, On the analysis and control of a bipedal legged locomotion model via partial feedback linearization, Bioinspiration & Biomimetics 19 (5) (2024) 056004.doi:10.1088/1748-3190/ad5cb6

work page doi:10.1088/1748-3190/ad5cb6 2024

[17] [17]

H.-W. Park, P. M. Wensing, S. Kim, Jumping over obstacles with mit cheetah 2, Robotics and Autonomous Systems 136 (2021) 103703. doi:10.1016/j.robot.2020.103703

work page doi:10.1016/j.robot.2020.103703 2021

[18] [18]

:Preprint submitted to Elsevier Page 16 of 17

D.Ahn,B.-K.Cho,Onlinejumpingmotiongenerationviamodelpredictivecontrol,IEEETransactionsonIndustrialElectronics69(5)(2021) 4957–4965.doi:10.1109/TIE.2021.3078396. :Preprint submitted to Elsevier Page 16 of 17

work page doi:10.1109/tie.2021.3078396 2021

[19] [19]

G. Ji, J. Mun, H. Kim, J. Hwangbo, Concurrent training of a control policy and a state estimator for dynamic and robust legged locomotion, IEEE Robotics and Automation Letters 7 (2) (2022) 4630–4637.doi:10.1109/LRA.2022.3151396

work page doi:10.1109/lra.2022.3151396 2022

[20] [20]

Z. He, J. Wu, J. Zhang, S. Zhang, Y. Shi, H. Liu, L. Sun, Y. Su, X. Leng, Cdm-mpc: An integrated dynamic planning and control framework for bipedal robots jumping, IEEE Robotics and Automation Letters 9 (7) (2024) 6672–6679.doi:10.1109/LRA.2024.3408487

work page doi:10.1109/lra.2024.3408487 2024

[21] [21]

Z. Xu, J. Xie, K. Hashimoto, Human-inspired gait and jumping motion generation for bipedal robots using model predictive control, Biomimetics 10 (1) (2025) 17.doi:10.3390/biomimetics10010017

work page doi:10.3390/biomimetics10010017 2025

[22] [22]

Z.Fu,Z.Yu,X.Chen,L.Han,P.Gergondet,J.Zhang,Q.Huang,Continuousbipedaljumpingviasliding-moderegularizedpredictivecontrol, IEEE/ASME Transactions on Mechatronics (2024).doi:10.1109/TMECH.2024.3515151

work page doi:10.1109/tmech.2024.3515151 2024

[23] [23]

J.Kober,J.A.Bagnell,J.Peters,Reinforcementlearninginrobotics:Asurvey,TheInternationalJournalofRoboticsResearch32(11)(2013) 1238–1274.doi:10.1177/0278364913495721

work page doi:10.1177/0278364913495721 2013

[24] [24]

C. Tao, M. Li, F. Cao, Z. Gao, Z. Zhang, A multiobjective collaborative deep reinforcement learning algorithm for jumping optimization of bipedal robot, Advanced Intelligent Systems 6 (1) (2024) 2300352.doi:10.1002/aisy.202300352

work page doi:10.1002/aisy.202300352 2024

[25] [25]

4934–4939.doi:10.1109/IROS.2010.5651461

M.Hutter,C.D.Remy,M.A.Höpflinger,R.Siegwart,Sliprunningwithanarticulatedroboticleg,in:2010IEEE/RSJInternationalConference on Intelligent Robots and Systems, IEEE, 2010, pp. 4934–4939.doi:10.1109/IROS.2010.5651461

work page doi:10.1109/iros.2010.5651461 2010

[26] [26]

X.He,X.Li,X.Wang,F.Meng,X.Guan,Z.Jiang,L.Yuan,K.Ba,G.Ma,B.Yu,Runninggaitandcontrolofquadrupedrobotbasedonslip model, Biomimetics 9 (1) (2024).doi:10.3390/biomimetics9010024

work page doi:10.3390/biomimetics9010024 2024

[27] [27]

P. M. Wensing, D. E. Orin, Control of humanoid hopping based on a slip model, Advances in Mechanisms, Robotics and Design Education and Research (2013) 265–274doi:10.1007/978-3-319-00398-6_21

work page doi:10.1007/978-3-319-00398-6_21 2013

[28] [28]

Proximal Policy Optimization Algorithms

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, O. Klimov, Proximal policy optimization algorithms, arXiv preprint arXiv:1707.06347 (2017).doi:10.48550/arXiv.1707.06347

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1707.06347 2017

[29] [29]

C.Yu,A.Velu,E.Vinitsky,J.Gao,Y.Wang,A.Bayen,Y.Wu,Thesurprisingeffectivenessofppoincooperativemulti-agentgames,Advances in Neural Information Processing Systems 35 (2022) 24611–24624

2022

[30] [30]

Y. Zhao, T. Wu, Y. Zhu, X. Lu, J. Wang, H. Bou-Ammar, X. Zhang, P. Du, Zsl-rppo: Zero-shot learning for quadrupedal locomotion in challengingterrainsusingrecurrentproximalpolicyoptimization,arXivpreprintarXiv:2403.01928(2024).doi:/10.48550/arXiv.2403. 01928

work page doi:10.48550/arxiv.2403 2024

[31] [31]

Zhang, J

Z. Zhang, J. Zhao, H. Chen, D. Chen, A survey of bioinspired jumping robot: takeoff, air posture adjustment, and landing buffer, Applied Bionics and Biomechanics 2017 (1) (2017) 4780160.doi:10.1155/2017/4780160

work page doi:10.1155/2017/4780160 2017

[32] [32]

Zhang, W

C. Zhang, W. Zou, L. Ma, Z. Wang, Biologically inspired jumping robots: A comprehensive review, Robotics and Autonomous Systems 124 (2020) 103362.doi:10.1016/j.robot.2019.103362

work page doi:10.1016/j.robot.2019.103362 2020

[33] [33]

Garofalo, C

G. Garofalo, C. Ott, A. Albu-Schäffer, Walking control of fully actuated robots based on the bipedal slip model, in: 2012 IEEE International Conference on Robotics and Automation, IEEE, 2012, pp. 1456–1463.doi:10.1109/ICRA.2012.6225272

work page doi:10.1109/icra.2012.6225272 2012

[34] [34]

Shahbazi, R

M. Shahbazi, R. Babuška, G. A. Lopes, Unified modeling and control of walking and running on the spring-loaded inverted pendulum, IEEE Transactions on Robotics 32 (5) (2016) 1178–1195.doi:10.1109/TRO.2016.2593483

work page doi:10.1109/tro.2016.2593483 2016

[35] [35]

Rummel, Y

J. Rummel, Y. Blum, A. Seyfarth, Robust and efficient walking with spring-like legs, Bioinspiration & Biomimetics 5 (4) (2010) 046004. doi:10.1088/1748-3182/5/4/046004

work page doi:10.1088/1748-3182/5/4/046004 2010

[36] [36]

S. Xie, X. Li, H. Zhong, C. Hu, L. Gao, Compliant bipedal walking based on variable spring-loaded inverted pendulum model with finite- sized foot, in: 2021 6th IEEE International Conference on Advanced Robotics and Mechatronics (ICARM), IEEE, 2021, pp. 667–672. doi:10.1109/ICARM52023.2021.9536096

work page doi:10.1109/icarm52023.2021.9536096 2021

[37] [37]

H.Sang,S.Wang,Lunarleaprobot:3marchitecture–enhanceddeepreinforcementlearningmethodforquadrupedrobotjumpinginlow-gravity environment, Journal of Aerospace Engineering 37 (6) (2024) 04024076.doi:10.1061/JAEEEZ.ASENG-5619

work page doi:10.1061/jaeeez.aseng-5619 2024

[38] [38]

Bellegarda, C

G. Bellegarda, C. Nguyen, Q. Nguyen, Robust quadruped jumping via deep reinforcement learning, Robotics and Autonomous Systems 182 (2024) 104799.doi:10.1016/j.robot.2024.104799

work page doi:10.1016/j.robot.2024.104799 2024

[39] [39]

RoboAgent: Generalization and efficiency in robot manipulation via semantic augmentations and action chunking,

G. Bellegarda, M. Shafiee, M. E. Özberk, A. Ijspeert, Quadruped-frog: Rapid online optimization of continuous quadruped jumping, in: 2024 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2024, pp. 1443–1450.doi:10.1109/ICRA57147.2024. 10610141

work page doi:10.1109/icra57147.2024 2024

[40] [40]

G.Bellegarda,A.Ijspeert,Cpg-rl:Learningcentralpatterngeneratorsforquadrupedlocomotion,IEEERoboticsandAutomationLetters7(4) (2022) 12547–12554.doi:10.1109/LRA.2022.3218167

work page doi:10.1109/lra.2022.3218167 2022

[41] [41]

X. B. Peng, M. Andrychowicz, W. Zaremba, P. Abbeel, Sim-to-real transfer of robotic control with dynamics randomization, in: 2018 IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2018, pp. 3803–3810.doi:10.1109/ICRA.2018.8460528

work page doi:10.1109/icra.2018.8460528 2018

[42] [42]

Q. Zhou, G. Li, R. Tang, Y. Xu, H. Wen, Q. Shi, Stable jumping control based on deep reinforcement learning for a locust-inspired robot, Biomimetics 9 (9) (2024) 548.doi:10.3390/biomimetics9090548

work page doi:10.3390/biomimetics9090548 2024

[43] [43]

R. J. Full, D. E. Koditschek, Templates and anchors: neuromechanical hypotheses of legged locomotion on land, Journal of Experimental Biology 202 (23) (1999) 3325–3332.doi:10.1242/jeb.202.23.3325

work page doi:10.1242/jeb.202.23.3325 1999

[44] [44]

Geyer, U

H. Geyer, U. Saranli, Gait based on the spring-loaded inverted pendulum, in: A. Goswami, P. Vadakkepat (Eds.), Humanoid Robotics: A Reference, Springer, Dordrecht, 2019, pp. 923–947.doi:10.1007/978-94-007-6046-2_43

work page doi:10.1007/978-94-007-6046-2_43 2019

[45] [45]

L. Ye, Y. Cheng, J. Li, X. Wang, B. Liang, Y. Peng, From knowing to doing: learning diverse motor skills through instruction learning, Biomimetic Intelligence and Robotics (2026) 100286

2026

[46] [46]

Hartley, M

R. Hartley, M. Ghaffari, R. M. Eustice, J. W. Grizzle, Contact-aided invariant extended kalman filtering for robot state estimation, The International Journal of Robotics Research 39 (4) (2020) 402–430.doi:10.1177/0278364919894385

work page doi:10.1177/0278364919894385 2020

[47] [47]

Humanoid-gym: Reinforcement learning for humanoid robot with zero-shot sim2real transfer,

X. Gu, Y.-J. Wang, J. Chen, Humanoid-gym: Reinforcement learning for humanoid robot with zero-shot sim2real transfer, arXiv preprint arXiv:2404.05695 (2024).doi:10.48550/arXiv.2404.05695. :Preprint submitted to Elsevier Page 17 of 17

work page doi:10.48550/arxiv.2404.05695 2024