SSR: Scaling Surefooted and Symmetric Humanoid Traversal to the Open World

Jun Wu; Qiuguo Zhu; Ruiqi Yu; Yiwen Wang; Yuan Hao

arxiv: 2605.30770 · v1 · pith:M3ASEMELnew · submitted 2026-05-29 · 💻 cs.RO


Pith reviewed 2026-06-28 22:34 UTC · model grok-4.3

classification 💻 cs.RO

keywords humanoid locomotion · vision-based traversal · foothold guidance · symmetry augmentation · motion priors · open-world navigation · real-world terrain

The pith

SSR enables humanoid robots to traverse open-world terrains safely by learning imagined future footholds and symmetric coordination from vision.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SSR as an end-to-end vision-based system that lets humanoids move reliably across varied real terrain. It adds imagined foothold guidance that predicts upcoming foot contacts and steers swings toward stable spots to cut edge slips. Equivariant symmetry augmentation in latent space builds bilateral coordination without extra data, while terrain-specific motion priors shape natural whole-body motion. Experiments report that these pieces together support stable walking on stairs of different designs, wide gaps, high platforms, and extended outdoor routes.

Core claim

SSR jointly learns imagined foothold guidance to model and evaluate forthcoming swing-foot contacts, equivariant latent-space symmetry augmentation to induce bilateral coordination under visual input, and terrain-specific multi-discriminator motion priors; together these produce safe, stable, high-quality locomotion on heterogeneous real-world terrains including varied stairs and extreme obstacles while supporting reliable long-horizon outdoor traversal.
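To make the third component concrete, here is a minimal Python sketch of how a terrain-specific multi-discriminator prior could route a motion feature to one discriminator per terrain and turn its score into a style reward. The reward shape is the least-squares form from the adversarial motion priors work of Peng et al. (2021), swapped in here as an assumption; the terrain names, the toy linear discriminators, and all function names are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, Sequence

Discriminator = Callable[[Sequence[float]], float]


def style_reward(d_score: float) -> float:
    """AMP-style least-squares reward: 1 at d_score == 1, 0 at or beyond -1 and 3."""
    return max(0.0, 1.0 - 0.25 * (d_score - 1.0) ** 2)


def make_linear_discriminator(weights: Sequence[float]) -> Discriminator:
    """Toy stand-in for a learned network: a dot product with fixed weights."""
    return lambda feat: sum(w * x for w, x in zip(weights, feat))


# Hypothetical terrain labels; one discriminator per terrain type.
DISCRIMINATORS: Dict[str, Discriminator] = {
    "stairs": make_linear_discriminator([1.0, 0.0]),
    "gap": make_linear_discriminator([0.0, 1.0]),
}


def terrain_style_reward(terrain: str, motion_feature: Sequence[float]) -> float:
    """Score the motion with the discriminator that matches the current terrain."""
    return style_reward(DISCRIMINATORS[terrain](motion_feature))


feature = [1.0, -1.0]
stairs_reward = terrain_style_reward("stairs", feature)  # discriminator score 1.0
gap_reward = terrain_style_reward("gap", feature)        # discriminator score -1.0
```

Keeping one discriminator per terrain means a motion that looks natural on stairs is not judged against reference clips for gaps; whether SSR selects discriminators by a terrain label in this way is an assumption of the sketch.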

What carries the argument

Imagined foothold guidance, which models forthcoming swing-foot contacts and scores their support to steer pre-touchdown motion toward stable regions.
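The support score described above can be illustrated with a small Python sketch. It assumes the terrain under the sole is sampled as a flat list of heights and that a sample counts as unsupported when it lies more than a fixed tolerance below the sole; the tolerance, the sample values, and the function names are hypothetical, and the paper's exact definition may differ.

```python
from typing import Sequence


def unsupported_fraction(patch_heights: Sequence[float], sole_height: float,
                         tolerance: float = 0.02) -> float:
    """Share of samples under the sole where terrain sits more than `tolerance` below it."""
    if not patch_heights:
        raise ValueError("patch_heights must not be empty")
    unsupported = sum(1 for h in patch_heights if sole_height - h > tolerance)
    return unsupported / len(patch_heights)


def expected_deficiency(imagined_patches: Sequence[Sequence[float]],
                        sole_height: float) -> float:
    """Average the unsupported fraction over several imagined landing patches."""
    scores = [unsupported_fraction(p, sole_height) for p in imagined_patches]
    return sum(scores) / len(scores)


# Heights in metres (placeholder values). The second patch hangs half over a stair edge.
fully_supported = [0.0, 0.0, 0.0, 0.0]
half_over_edge = [0.0, 0.0, -0.15, -0.15]
guidance = expected_deficiency([fully_supported, half_over_edge], sole_height=0.0)
```

Averaging over several imagined landings gives a signal that is available during the swing, before contact, which is the point of the pre-touchdown guidance.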

If this is right

Pre-touchdown guidance reduces the frequency of edge slips on irregular surfaces.
Symmetry augmentation produces coordinated left-right behaviors from high-dimensional visual observations.
Terrain-specific priors maintain human-like motion patterns across different scene types.
The combined system supports continuous traversal over long distances without manual resets.
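The symmetry point in the list above can be sketched in a few lines of Python. The sketch assumes a three-element toy observation and a mirror operator expressed as a signed permutation; the layout, the gains, and the names are hypothetical. It shows the property that latent-space augmentation relies on: encoding a mirrored observation yields the mirrored latent.

```python
from typing import List, Sequence


def mirror(vec: Sequence[float], perm: Sequence[int], signs: Sequence[float]) -> List[float]:
    """Apply a signed permutation: output[i] = signs[i] * vec[perm[i]]."""
    return [signs[i] * vec[perm[i]] for i in range(len(vec))]


# Toy layout (hypothetical): [left_hip, right_hip, lateral_velocity].
# Mirroring swaps left and right and negates the lateral component.
PERM = [1, 0, 2]
SIGNS = [1.0, 1.0, -1.0]


def encode(obs: Sequence[float]) -> List[float]:
    """Toy equivariant encoder: left and right channels share one gain."""
    return [2.0 * obs[0], 2.0 * obs[1], 3.0 * obs[2]]


obs = [0.5, -0.25, 1.0]
latent = encode(obs)
mirrored_latent = mirror(latent, PERM, SIGNS)  # augmentation applied in latent space

# Equivariance: encoding the mirrored observation gives the mirrored latent.
is_equivariant = encode(mirror(obs, PERM, SIGNS)) == mirrored_latent
```

When this property holds, a mirrored training sample can be produced by transforming the latent directly, without encoding the mirrored depth image a second time.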

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar guidance mechanisms could be adapted for tasks that combine locomotion with object interaction.
The symmetry approach may lower sample complexity when training on other bilateral robot platforms.
Long-horizon reliability suggests the method could support deployment in unstructured human environments without frequent retraining.

Load-bearing premise

The learned foothold predictions, symmetry augmentation, and motion priors trained in simulation will transfer to produce reliable foot placement on unseen real terrains.

What would settle it

Repeated edge slips or falls when the robot encounters a previously unseen terrain type such as loose gravel slopes during extended outdoor runs.

Figures

Figures reproduced from arXiv: 2605.30770 by Jun Wu, Qiuguo Zhu, Ruiqi Yu, Yiwen Wang, Yuan Hao.

**Figure 2.** Overview of the SSR framework. The policy combines a recurrent equivariant encoder, an estimator, and a MoE actor to learn unified traversal across terrains from egocentric depth images and temporal proprioception. During training, SSR learns surefooted, symmetric, and human-like motion through three mechanisms: (a) imagined foothold guidance for dense pre-contact swing correction, (b) equivariant latent …

**Figure 3.** Imagined foothold guidance. We measure support deficiency over a sole-sized terrain patch. During swing, a foothold imagination model anticipates future contact distributions for pre-contact dense guidance. Guiding Safer Swings Before Touchdown. Given the imagined landing distribution, we form the guidance signal by measuring the unsupported fraction within a sole-covered region. Let $H_f(p)=\{h_k(p)\}_{k=1}^{n}$ …

**Figure 4.** Equivariant latent-space augmentation. We form symmetry-structured inputs and build the encoder from equivariant linear and convolutional layers, allowing each latent head to be mirrored by $M_c$. Equivariant Latent Encoding. To make this latent transform valid, the encoder must map mirrored observations to mirrored latents. We first organize its inputs into symmetry-structured form: a fixed operator $T$ reor…

**Figure 5.** Traversal performance across terrains and difficulty levels in simulation. Top row compares SSR with …

**Figure 6.** Ablation study of safe foot placement learning. (a) Foothold distributions on stairs down and gap, …

**Figure 7.** Ablation study of symmetry learning. (a) Trajectories under eight linear-velocity and two in-place …

**Figure 8.** Key frames of zero-shot real-world lab-level deployment.

**Figure 9.** (a) The humanoid completes a 1.3 km, 40 min open-world traversal in an industrial heritage park, …

**Figure 10.** Overview of terrain types used for training.

**Figure 11.** Key frames of cross-platform laboratory deployment on the full-size DEEP Robotics DR02 hu…

Original abstract

Extending humanoid traversal to the open world is key to practical deployment in human environments, but remains challenging. The robot must use vision to ensure safe and reliable foot placement on heterogeneous terrain under highly dynamic motion, while producing coordinated, natural whole-body behaviors. We propose SSR, an efficient end-to-end framework for egocentric vision-based humanoid traversal that jointly learns these capabilities. SSR introduces imagined foothold guidance, which learns to model forthcoming swing-foot contacts and evaluates their support to guide pre-touchdown swings toward stable regions, reducing edge slips. It further employs equivariant latent-space symmetry augmentation to efficiently induce bilateral coordination under high-dimensional visual observations, and uses terrain-specific multi-discriminator motion priors to encourage human-like behavior across scenes. Extensive experiments show that SSR achieves safe, stable, and high-quality locomotion on diverse real-world terrains, including stairs with varied structures and extreme challenges such as wide gaps and high platforms, while enabling reliable long-horizon traversal in open outdoor environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The abstract sketches a plausible end-to-end vision pipeline for humanoid foot placement and coordination, but the generalization story rests on an unverified assumption that training coverage extends to the claimed real-world cases.

The paper's core contribution is a joint learning setup that adds imagined foothold guidance to steer swing feet away from unstable edges, equivariant latent symmetry to encourage bilateral coordination from visual input, and terrain-specific multi-discriminator priors to shape whole-body motion. That specific bundle looks like the main novelty relative to prior foothold or symmetry work.

It does target the right practical problems: dynamic vision-based placement on varied stairs, gaps, and outdoor surfaces while keeping natural behavior. If the full experiments include proper ablations and real-robot metrics, this could be a useful incremental step for legged robotics.

The soft spot is exactly where the stress-test note flags it. The abstract asserts reliable long-horizon traversal on heterogeneous unseen terrain but supplies no training-distribution statistics, no OOD distance measures, no failure-case breakdown, and no baseline comparisons. Without those, the reported successes could reflect test conditions that stayed close to the training support rather than robust extrapolation. The assumption that the three learned components will generalize is doing a lot of work and is not yet shown to hold.

This is for researchers building vision-driven humanoid controllers who need concrete methods to try. A reader gets value mainly from the implementation details and data once the paper is expanded. It deserves a serious referee because the problem matters and the approach is coherent on its own terms, even though the current evidence is thin and the generalization claim needs direct testing.

Referee Report

1 major / 0 minor

Summary. The paper proposes SSR, an end-to-end egocentric vision-based framework for humanoid traversal in open-world settings. It introduces three components: imagined foothold guidance, which models forthcoming swing-foot contacts and evaluates their support to guide stable placements; equivariant latent-space symmetry augmentation, which induces bilateral coordination from high-dimensional visual inputs; and terrain-specific multi-discriminator motion priors, which encourage human-like whole-body behaviors across scenes. The central claim, supported by the reported real-world experiments, is that SSR achieves safe, stable, high-quality locomotion on diverse terrains, including varied stairs, wide gaps, and high platforms, and enables reliable long-horizon traversal in open outdoor environments.

Significance. If the generalization claims hold with rigorous evidence, the work would be significant for practical humanoid deployment, as it targets the core difficulties of vision-guided foot placement under dynamic motion and coordinated behavior on heterogeneous terrain. The symmetry augmentation and multi-discriminator priors represent potentially efficient inductive biases for scaling learning, and the focus on real-world long-horizon tasks addresses a key gap in the field.

major comments (1)

[Abstract] The load-bearing claim that the three learned components (imagined foothold guidance, equivariant symmetry augmentation, terrain-specific motion priors) produce reliable behavior on 'heterogeneous unseen real-world terrains' and 'extreme challenges' is not supported by any quantification of distribution shift, OOD metrics, or bounds on how the test conditions (varied stair structures, gap widths, platform heights, outdoor heterogeneity) relate to the training support. Without such analysis, the reported success does not distinguish robust generalization from in-distribution performance.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting the need to better substantiate generalization claims. We address the concern directly below and indicate where revisions will be made to strengthen the manuscript.

Referee: [Abstract] The load-bearing claim that the three learned components (imagined foothold guidance, equivariant symmetry augmentation, terrain-specific motion priors) produce reliable behavior on 'heterogeneous unseen real-world terrains' and 'extreme challenges' is not supported by any quantification of distribution shift, OOD metrics, or bounds on how the test conditions (varied stair structures, gap widths, platform heights, outdoor heterogeneity) relate to the training support. Without such analysis, the reported success does not distinguish robust generalization from in-distribution performance.

Authors: We agree that explicit quantification of distribution shift would make the generalization claims more rigorous. In real-world robotics, defining precise support bounds for visual and terrain distributions is inherently difficult, which is why we relied on qualitative diversity in testing (e.g., outdoor heterogeneity and extreme parameters such as wider gaps and higher platforms). However, to address the point, we will revise the abstract to temper the language and add a dedicated paragraph in the experiments section that describes the training data collection protocol versus the specific test terrain parameters (stair variations, gap widths, platform heights) to better illustrate the degree of shift. This will not include formal OOD bounds, as they are not standard or straightforward in this domain, but will provide clearer evidence distinguishing the test conditions.

Revision: yes
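The comparison of training ranges against test terrain parameters discussed in this exchange could be tabulated along the following lines. This is a minimal Python sketch, not the authors' protocol: the parameter names and every number are placeholders invented for illustration and do not come from the paper, and the normalized-excess measure is one simple choice among many.

```python
from typing import Dict, Tuple


def shift_report(train_ranges: Dict[str, Tuple[float, float]],
                 test_values: Dict[str, float]) -> Dict[str, float]:
    """For each parameter, return how far the test value lies outside the
    training range, as a fraction of that range (0.0 means inside)."""
    report = {}
    for name, value in test_values.items():
        low, high = train_ranges[name]
        span = high - low
        if value < low:
            report[name] = (low - value) / span
        elif value > high:
            report[name] = (value - high) / span
        else:
            report[name] = 0.0
    return report


# Placeholder numbers only; not taken from the paper.
train_ranges = {"stair_height_m": (0.05, 0.20), "gap_width_m": (0.2, 0.8)}
test_values = {"stair_height_m": 0.15, "gap_width_m": 1.0}
report = shift_report(train_ranges, test_values)
out_of_support = [name for name, excess in report.items() if excess > 0.0]
```

A table built this way would separate test cases that fall inside the training ranges from those that extrapolate beyond them, which is the distinction the referee asks for, without requiring formal OOD bounds.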

Circularity Check

0 steps flagged

No circularity: empirical validation of learned components with no self-referential derivation chain.

The paper proposes an end-to-end learned framework (imagined foothold guidance, equivariant symmetry augmentation, terrain-specific motion priors) and supports its claims solely via reported real-world experiments on diverse terrains. No equations, uniqueness theorems, or derivation steps are present in the provided text that reduce any output quantity to a fitted input or self-citation by construction. The central claims are empirical performance statements, not algebraic identities or predictions forced by the training procedure itself. This matches the default expectation of a non-circular empirical ML robotics paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not specify any free parameters, axioms, or invented entities. The approach is described at a high level without technical details on modeling assumptions.


Reference graph

Works this paper leans on

55 extracted references · 14 canonical work pages · 5 internal anchors

[1]

Radosavovic, S

I. Radosavovic, S. Kamat, T. Darrell, and J. Malik. Learning humanoid locomotion over chal- lenging terrain.arXiv preprint arXiv:2410.03654, 2024

work page arXiv 2024
[2]

Bonnen, J

K. Bonnen, J. S. Matthis, A. Gibaldi, M. S. Banks, D. M. Levi, and M. Hayhoe. Binocular vision and the control of foot placement during walking in natural terrain.Scientific reports, 11(1):20881, 2021

2021
[3]

Zhuang, S

Z. Zhuang, S. Yao, and H. Zhao. Humanoid parkour learning. InConference on Robot Learn- ing (CoRL), 2024

2024
[4]

S. Zhu, Z. Zhuang, M. Zhao, K.-Y . Lee, and H. Zhao. Hiking in the wild: A scalable perceptive parkour framework for humanoids.arXiv preprint arXiv:2601.07718, 2026

work page arXiv 2026
[5]

W. Sun, Y . Su, L. Huang, A. Zhang, D. Wei, M. San, D. Tian, E. Cao, F. Yan, E. Xie, et al. Now you see that: Learning end-to-end humanoid locomotion from raw pixels.arXiv preprint arXiv:2602.06382, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[6]

J. Sun, G. Han, P. Sun, W. Zhao, J. Cao, J. Wang, Y . Guo, and Q. Zhang. Dpl: Depth- only perceptive humanoid locomotion via realistic depth synthesis and cross-attention terrain reconstruction.arXiv preprint arXiv:2510.07152, 2025

work page arXiv 2025
[7]

B. Nie, Y . Zhang, R. Jin, Z. Cao, H. Lin, X. Yang, and Y . Gao. Coordinated humanoid robot locomotion with symmetry equivariant reinforcement learning policy. InProceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 18523–18531, 2026

2026
[8]

Mittal, N

M. Mittal, N. Rudin, V . Klemm, A. Allshire, and M. Hutter. Symmetry considerations for learning task symmetric robot policies. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 7433–7439. IEEE, 2024

2024
[9]

Z. Su, X. Huang, D. Ordo ˜nez-Apraez, Y . Li, Z. Li, Q. Liao, G. Turrisi, M. Pontil, C. Semini, Y . Wu, et al. Leveraging symmetry in rl-based legged locomotion control. In2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 6899–6906. IEEE, 2024

2024
[10]

X. B. Peng, Z. Ma, P. Abbeel, S. Levine, and A. Kanazawa. Amp: Adversarial motion priors for stylized physics-based character control.ACM Transactions on Graphics (ToG), 40(4): 1–20, 2021

2021
[11]

A. Tang, T. Hiraoka, N. Hiraoka, F. Shi, K. Kawaharazuka, K. Kojima, K. Okada, and M. Inaba. Humanmimic: Learning natural locomotion and transitions for humanoid robot via wasserstein adversarial imitation. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 13107–13114. IEEE, 2024

2024
[12]

Zhang, P

Q. Zhang, P. Cui, D. Yan, J. Sun, Y . Duan, G. Han, W. Zhao, W. Zhang, Y . Guo, A. Zhang, et al. Whole-body humanoid robot locomotion with human reference. In2024 IEEE/RSJ In- ternational Conference on Intelligent Robots and Systems (IROS), pages 11225–11231. IEEE, 2024. 9

2024
[13]

Cheng, K

X. Cheng, K. Shi, A. Agarwal, and D. Pathak. Extreme parkour with legged robots. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 11443–11450. IEEE, 2024

2024
[14]

Zhang, W

C. Zhang, W. Xiao, T. He, and G. Shi. Wococo: Learning whole-body humanoid control with sequential contacts. InConference on Robot Learning (CoRL), 2024

2024
[15]

V ollenweider, M

E. V ollenweider, M. Bjelonic, V . Klemm, N. Rudin, J. Lee, and M. Hutter. Advanced skills through multiple adversarial motion priors in reinforcement learning. In2023 IEEE Interna- tional Conference on Robotics and Automation (ICRA), pages 5120–5126. IEEE, 2023

2023
[16]

H. Wang, Z. Wang, J. Ren, Q. Ben, T. Huang, W. Zhang, and J. Pang. Beamdojo: Learning agile humanoid locomotion on sparse footholds. InRobotics: Science and Systems (RSS), 2025

2025
[17]

Z. Wu, X. Huang, L. Yang, Y . Zhang, K. Sreenath, X. Chen, P. Abbeel, R. Duan, A. Kanazawa, C. Sferrazza, et al. Perceptive humanoid parkour: Chaining dynamic human skills via motion matching.arXiv preprint arXiv:2602.15827, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[18]

J. Long, J. Ren, M. Shi, Z. Wang, T. Huang, P. Luo, and J. Pang. Learning humanoid locomo- tion with perceptive internal model. In2025 IEEE International Conference on Robotics and Automation (ICRA), pages 9997–10003. IEEE, 2025

2025
[19]

X. Cui, L. Feng, Y . Zhou, H. Han, Z. Liu, and H. Wang. Pilot: A perceptive integrated low-level controller for loco-manipulation over unstructured scenes.arXiv preprint arXiv:2601.17440, 2026

work page arXiv 2026
[20]

S. Ma, H. Chen, Z. Xu, Y . Zhao, K. Wu, R. Yang, L. Zou, Z. Gan, and W. Ding. Cmoe: Contrastive mixture of experts for motion control and terrain adaptation of humanoid robots. arXiv preprint arXiv:2603.03067, 2026

work page arXiv 2026
[21]

J. He, C. Zhang, F. Jenelten, R. Grandia, M. B ¨acher, and M. Hutter. Attention-based map encoding for learning generalized legged locomotion.Science Robotics, 10(105):eadv3604, 2025

2025
[22]

D. Wang, X. Wang, X. Liu, J. Shi, Y . Zhao, C. Bai, and X. Li. More: Mixture of residual experts for humanoid lifelike gaits learning on complex terrains.arXiv preprint arXiv:2506.08840, 2025

work page arXiv 2025
[23]

Zhuang, S

Z. Zhuang, S. Zhu, M. Zhao, and H. Zhao. Deep whole-body parkour.arXiv preprint arXiv:2601.07701, 2026

work page arXiv 2026
[24]

Zhang, G

Q. Zhang, G. Han, J. Sun, W. Zhao, C. Sun, J. Cao, J. Wang, Y . Guo, and R. Xu. Distillation- ppo: A novel two-stage reinforcement learning framework for humanoid robot perceptive lo- comotion. In2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 2916–2922. IEEE, 2025

2025
[25]

Y . Liu, T. Yu, H. Song, H. Zhu, N. Hu, Y . Hao, X. Yao, X. Zang, H. Chen, and J. Zhao. Faststair: Learning to run up stairs with humanoid robots.arXiv preprint arXiv:2601.10365, 2026

work page arXiv 2026
[26]

Parkour in the wild: Learn- ing a general and extensible agile locomotion policy using multi-expert distillation and rl fine-tuning,

N. Rudin, J. He, J. Aurand, and M. Hutter. Parkour in the wild: Learning a general and extensible agile locomotion policy using multi-expert distillation and rl fine-tuning.arXiv preprint arXiv:2505.11164, 2025

work page arXiv 2025
[27]

Mastalli, I

C. Mastalli, I. Havoutis, M. Focchi, D. G. Caldwell, and C. Semini. Motion planning for quadrupedal locomotion: Coupled planning, terrain mapping, and whole-body control.IEEE Transactions on Robotics, 36(6):1635–1648, 2020. 10

2020
[28]

Agrawal, S

A. Agrawal, S. Chen, A. Rai, and K. Sreenath. Vision-aided dynamic quadrupedal locomotion on discrete terrain using motion libraries. In2022 International Conference on Robotics and Automation (ICRA), pages 4708–4714. IEEE, 2022

2022
[29]

Fahmi, V

S. Fahmi, V . Barasuol, D. Esteban, O. Villarreal, and C. Semini. Vital: Vision-based terrain- aware locomotion for legged robots.IEEE Transactions on Robotics, 39(2):885–904, 2022

2022
[30]

Tsounis, M

V . Tsounis, M. Alge, J. Lee, F. Farshidian, and M. Hutter. Deepgait: Planning and control of quadrupedal gaits using deep reinforcement learning.IEEE Robotics and Automation Letters, 5(2):3699–3706, 2020

2020
[31]

Jenelten, J

F. Jenelten, J. He, F. Farshidian, and M. Hutter. Dtc: Deep tracking control.Science Robotics, 9(86):eadh5401, 2024

2024
[32]

W. Yu, D. Jain, A. Escontrela, A. Iscen, P. Xu, E. Coumans, S. Ha, J. Tan, and T. Zhang. Visual-locomotion: Learning to walk on complex terrains with vision. InConference on Robot Learning (CoRL), 2021

2021
[33]

Gangapurwala, M

S. Gangapurwala, M. Geisert, R. Orsolino, M. Fallon, and I. Havoutis. Rloc: Terrain-aware legged locomotion using reinforcement learning and optimal control.IEEE Transactions on Robotics, 38(5):2908–2927, 2022

2022
[34]

Zhuang, Z

Z. Zhuang, Z. Fu, J. Wang, C. Atkeson, S. Schwertfeger, C. Finn, and H. Zhao. Robot parkour learning. InConference on Robot Learning (CoRL), 2023

2023
[35]

R. S. Sutton.Temporal credit assignment in reinforcement learning. University of Mas- sachusetts Amherst, 1984

1984
[36]

D. Zhu, C. Zhu, Z. Zhang, S. Xin, and Y . Liu. Learning safe locomotion for quadrupedal robots by derived-action optimization. In2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 6870–6876. IEEE, 2024

2024
[37]

Zhang, N

C. Zhang, N. Rudin, D. Hoeller, and M. Hutter. Learning agile locomotion on risky terrains. In2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 11864–11871. IEEE, 2024

2024
[38]

Abdolhosseini, H

F. Abdolhosseini, H. Y . Ling, Z. Xie, X. B. Peng, and M. Van de Panne. On learning symmetric locomotion. InProceedings of the 12th ACM SIGGRAPH Conference on Motion, Interaction and Games, pages 1–10, 2019

2019
[39]

Proximal Policy Optimization Algorithms

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms.arXiv preprint arXiv:1707.06347, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[40]

I. M. A. Nahrendra, B. Yu, and H. Myung. Dreamwaq: Learning robust quadrupedal lo- comotion with implicit terrain imagination via deep reinforcement learning. In2023 IEEE International Conference on Robotics and Automation (ICRA), pages 5078–5084. IEEE, 2023

2023
[41]

Huang, S

R. Huang, S. Zhu, Y . Du, and H. Zhao. Moe-loco: Mixture of experts for multitask locomotion. In2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 14218–14225. IEEE, 2025

2025
[42]

R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton. Adaptive mixtures of local experts. Neural computation, 3(1):79–87, 1991

1991
[43]

S. Luo, S. Li, R. Yu, Z. Wang, J. Wu, and Q. Zhu. Pie: Parkour with implicit-explicit learning framework for legged robots.IEEE Robotics and Automation Letters, 9(11):9986–9993, 2024

2024
[44]

P. Li, H. Li, Y . Ma, L. Chang, X. Yang, R. Yu, Y . Zhang, Y . Cao, Q. Zhu, and G. Sartoretti. Kivi: Kinesthetic-visuospatial integration for dynamic and safe egocentric legged locomotion. arXiv preprint arXiv:2509.23650, 2025. 11

work page internal anchor Pith review Pith/arXiv arXiv 2025
[45]

R. Yu, Q. Wang, H. Li, Z. Jun, Z. Wang, J. Wu, and Q. Zhu. Start: Traversing sparse footholds with terrain reconstruction.IEEE Robotics and Automation Letters, 11(2):2194–2201, 2025

2025
[46]

Zargarbashi, J

F. Zargarbashi, J. Cheng, D. Kang, R. Sumner, and S. Coros. Robotkeyframing: Learning locomotion with high-level objectives via mixture of dense and sparse rewards. InConference on Robot Learning (CoRL), 2024

2024
[47]

Huang, J

T. Huang, J. Ren, H. Wang, Z. Wang, Q. Ben, M. Wen, X. Chen, J. Li, and J. Pang. Learn- ing humanoid standing-up control across diverse postures. InRobotics: Science and Systems (RSS), 2025

2025
[48]

G. Cesa, L. Lang, and M. Weiler. A program to build e(n)-equivariant steerable cnns. In International Conference on Learning Representations (ICLR), 2022

2022
[49]

Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning

V . Makoviychuk, L. Wawrzyniak, Y . Guo, M. Lu, K. Storey, M. Macklin, D. Hoeller, N. Rudin, A. Allshire, A. Handa, et al. Isaac gym: High performance gpu-based physics simulation for robot learning.arXiv preprint arXiv:2108.10470, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021
[50]

M. Macklin. Warp: A high-performance python framework for gpu simulation and graphics. InNVIDIA GPU Technology Conference (GTC), volume 3, 2022

2022
[51]

Z. Fu, A. Kumar, J. Malik, and D. Pathak. Minimizing energy consumption leads to the emer- gence of gaits in legged robots. InConference on Robot Learning (CoRL), 2021

2021
[52]

B. R. Whittington and D. G. Thelen. A simple mass-spring model with roller feet can induce the ground reactions observed in human walking.Journal of biomechanical engineering, 131 (1):011013, 2009

2009
[53]

Rudin, D

N. Rudin, D. Hoeller, P. Reist, and M. Hutter. Learning to walk in minutes using massively parallel deep reinforcement learning. InConference on Robot Learning (CoRL), 2021

2021
[54]

Mahmood, N

N. Mahmood, N. Ghorbani, N. F. Troje, G. Pons-Moll, and M. J. Black. Amass: Archive of motion capture as surface shapes. InProceedings of the IEEE/CVF international conference on computer vision, pages 5442–5451, 2019

2019
[55]

W. Xie, J. Han, J. Zheng, H. Li, X. Liu, J. Shi, W. Zhang, C. Bai, and X. Li. Kungfubot: Physics-based humanoid whole-body control for learning highly-dynamic skills. In D. Bel- grave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen, editors,Ad- vances in Neural Information Processing Systems, volume 38, pages 62406–62433. Curran Associ...

2025

[1] [1]

Radosavovic, S

I. Radosavovic, S. Kamat, T. Darrell, and J. Malik. Learning humanoid locomotion over chal- lenging terrain.arXiv preprint arXiv:2410.03654, 2024

work page arXiv 2024

[2] [2]

Bonnen, J

K. Bonnen, J. S. Matthis, A. Gibaldi, M. S. Banks, D. M. Levi, and M. Hayhoe. Binocular vision and the control of foot placement during walking in natural terrain.Scientific reports, 11(1):20881, 2021

2021

[3] [3]

Zhuang, S

Z. Zhuang, S. Yao, and H. Zhao. Humanoid parkour learning. InConference on Robot Learn- ing (CoRL), 2024

2024

[4] [4]

S. Zhu, Z. Zhuang, M. Zhao, K.-Y . Lee, and H. Zhao. Hiking in the wild: A scalable perceptive parkour framework for humanoids.arXiv preprint arXiv:2601.07718, 2026

work page arXiv 2026

[5] [5]

W. Sun, Y . Su, L. Huang, A. Zhang, D. Wei, M. San, D. Tian, E. Cao, F. Yan, E. Xie, et al. Now you see that: Learning end-to-end humanoid locomotion from raw pixels.arXiv preprint arXiv:2602.06382, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[6] [6]

J. Sun, G. Han, P. Sun, W. Zhao, J. Cao, J. Wang, Y . Guo, and Q. Zhang. Dpl: Depth- only perceptive humanoid locomotion via realistic depth synthesis and cross-attention terrain reconstruction.arXiv preprint arXiv:2510.07152, 2025

work page arXiv 2025

[7] [7]

B. Nie, Y . Zhang, R. Jin, Z. Cao, H. Lin, X. Yang, and Y . Gao. Coordinated humanoid robot locomotion with symmetry equivariant reinforcement learning policy. InProceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 18523–18531, 2026

2026

[8] [8]

Mittal, N

M. Mittal, N. Rudin, V . Klemm, A. Allshire, and M. Hutter. Symmetry considerations for learning task symmetric robot policies. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 7433–7439. IEEE, 2024

2024

[9] [9]

Z. Su, X. Huang, D. Ordo ˜nez-Apraez, Y . Li, Z. Li, Q. Liao, G. Turrisi, M. Pontil, C. Semini, Y . Wu, et al. Leveraging symmetry in rl-based legged locomotion control. In2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 6899–6906. IEEE, 2024

2024

[10] [10]

X. B. Peng, Z. Ma, P. Abbeel, S. Levine, and A. Kanazawa. Amp: Adversarial motion priors for stylized physics-based character control.ACM Transactions on Graphics (ToG), 40(4): 1–20, 2021

2021

[11] [11]

A. Tang, T. Hiraoka, N. Hiraoka, F. Shi, K. Kawaharazuka, K. Kojima, K. Okada, and M. Inaba. Humanmimic: Learning natural locomotion and transitions for humanoid robot via wasserstein adversarial imitation. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 13107–13114. IEEE, 2024

2024

[12] [12]

Zhang, P

Q. Zhang, P. Cui, D. Yan, J. Sun, Y . Duan, G. Han, W. Zhao, W. Zhang, Y . Guo, A. Zhang, et al. Whole-body humanoid robot locomotion with human reference. In2024 IEEE/RSJ In- ternational Conference on Intelligent Robots and Systems (IROS), pages 11225–11231. IEEE, 2024. 9

2024

[13] [13]

Cheng, K

X. Cheng, K. Shi, A. Agarwal, and D. Pathak. Extreme parkour with legged robots. In2024 IEEE International Conference on Robotics and Automation (ICRA), pages 11443–11450. IEEE, 2024

2024

[14] [14]

Zhang, W

C. Zhang, W. Xiao, T. He, and G. Shi. Wococo: Learning whole-body humanoid control with sequential contacts. InConference on Robot Learning (CoRL), 2024

2024

[15] [15]

V ollenweider, M

E. V ollenweider, M. Bjelonic, V . Klemm, N. Rudin, J. Lee, and M. Hutter. Advanced skills through multiple adversarial motion priors in reinforcement learning. In2023 IEEE Interna- tional Conference on Robotics and Automation (ICRA), pages 5120–5126. IEEE, 2023

2023

[16] [16]

H. Wang, Z. Wang, J. Ren, Q. Ben, T. Huang, W. Zhang, and J. Pang. Beamdojo: Learning agile humanoid locomotion on sparse footholds. InRobotics: Science and Systems (RSS), 2025

2025

[17] [17]

Z. Wu, X. Huang, L. Yang, Y . Zhang, K. Sreenath, X. Chen, P. Abbeel, R. Duan, A. Kanazawa, C. Sferrazza, et al. Perceptive humanoid parkour: Chaining dynamic human skills via motion matching.arXiv preprint arXiv:2602.15827, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[18] [18]

J. Long, J. Ren, M. Shi, Z. Wang, T. Huang, P. Luo, and J. Pang. Learning humanoid locomo- tion with perceptive internal model. In2025 IEEE International Conference on Robotics and Automation (ICRA), pages 9997–10003. IEEE, 2025

2025

[19] [19]

X. Cui, L. Feng, Y . Zhou, H. Han, Z. Liu, and H. Wang. Pilot: A perceptive integrated low-level controller for loco-manipulation over unstructured scenes.arXiv preprint arXiv:2601.17440, 2026

work page arXiv 2026

[20] [20]

S. Ma, H. Chen, Z. Xu, Y . Zhao, K. Wu, R. Yang, L. Zou, Z. Gan, and W. Ding. Cmoe: Contrastive mixture of experts for motion control and terrain adaptation of humanoid robots. arXiv preprint arXiv:2603.03067, 2026

[21]

J. He, C. Zhang, F. Jenelten, R. Grandia, M. Bächer, and M. Hutter. Attention-based map encoding for learning generalized legged locomotion. Science Robotics, 10(105):eadv3604, 2025.

[22]

D. Wang, X. Wang, X. Liu, J. Shi, Y. Zhao, C. Bai, and X. Li. More: Mixture of residual experts for humanoid lifelike gaits learning on complex terrains. arXiv preprint arXiv:2506.08840, 2025.

[23]

Z. Zhuang, S. Zhu, M. Zhao, and H. Zhao. Deep whole-body parkour. arXiv preprint arXiv:2601.07701, 2026.

[24]

Q. Zhang, G. Han, J. Sun, W. Zhao, C. Sun, J. Cao, J. Wang, Y. Guo, and R. Xu. Distillation-ppo: A novel two-stage reinforcement learning framework for humanoid robot perceptive locomotion. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 2916–2922. IEEE, 2025.

[25]

Y. Liu, T. Yu, H. Song, H. Zhu, N. Hu, Y. Hao, X. Yao, X. Zang, H. Chen, and J. Zhao. Faststair: Learning to run up stairs with humanoid robots. arXiv preprint arXiv:2601.10365, 2026.

[26]

N. Rudin, J. He, J. Aurand, and M. Hutter. Parkour in the wild: Learning a general and extensible agile locomotion policy using multi-expert distillation and rl fine-tuning. arXiv preprint arXiv:2505.11164, 2025.

[27]

C. Mastalli, I. Havoutis, M. Focchi, D. G. Caldwell, and C. Semini. Motion planning for quadrupedal locomotion: Coupled planning, terrain mapping, and whole-body control. IEEE Transactions on Robotics, 36(6):1635–1648, 2020.

[28]

A. Agrawal, S. Chen, A. Rai, and K. Sreenath. Vision-aided dynamic quadrupedal locomotion on discrete terrain using motion libraries. In 2022 International Conference on Robotics and Automation (ICRA), pages 4708–4714. IEEE, 2022.

[29]

S. Fahmi, V. Barasuol, D. Esteban, O. Villarreal, and C. Semini. Vital: Vision-based terrain-aware locomotion for legged robots. IEEE Transactions on Robotics, 39(2):885–904, 2022.

[30]

V. Tsounis, M. Alge, J. Lee, F. Farshidian, and M. Hutter. Deepgait: Planning and control of quadrupedal gaits using deep reinforcement learning. IEEE Robotics and Automation Letters, 5(2):3699–3706, 2020.

[31]

F. Jenelten, J. He, F. Farshidian, and M. Hutter. Dtc: Deep tracking control. Science Robotics, 9(86):eadh5401, 2024.

[32]

W. Yu, D. Jain, A. Escontrela, A. Iscen, P. Xu, E. Coumans, S. Ha, J. Tan, and T. Zhang. Visual-locomotion: Learning to walk on complex terrains with vision. In Conference on Robot Learning (CoRL), 2021.

[33]

S. Gangapurwala, M. Geisert, R. Orsolino, M. Fallon, and I. Havoutis. Rloc: Terrain-aware legged locomotion using reinforcement learning and optimal control. IEEE Transactions on Robotics, 38(5):2908–2927, 2022.

[34]

Z. Zhuang, Z. Fu, J. Wang, C. Atkeson, S. Schwertfeger, C. Finn, and H. Zhao. Robot parkour learning. In Conference on Robot Learning (CoRL), 2023.

[35]

R. S. Sutton. Temporal credit assignment in reinforcement learning. University of Massachusetts Amherst, 1984.

[36]

D. Zhu, C. Zhu, Z. Zhang, S. Xin, and Y. Liu. Learning safe locomotion for quadrupedal robots by derived-action optimization. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 6870–6876. IEEE, 2024.

[37]

C. Zhang, N. Rudin, D. Hoeller, and M. Hutter. Learning agile locomotion on risky terrains. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 11864–11871. IEEE, 2024.

[38]

F. Abdolhosseini, H. Y. Ling, Z. Xie, X. B. Peng, and M. Van de Panne. On learning symmetric locomotion. In Proceedings of the 12th ACM SIGGRAPH Conference on Motion, Interaction and Games, pages 1–10, 2019.

[39]

J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.

[40]

I. M. A. Nahrendra, B. Yu, and H. Myung. Dreamwaq: Learning robust quadrupedal locomotion with implicit terrain imagination via deep reinforcement learning. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 5078–5084. IEEE, 2023.

[41]

R. Huang, S. Zhu, Y. Du, and H. Zhao. Moe-loco: Mixture of experts for multitask locomotion. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 14218–14225. IEEE, 2025.

[42]

R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton. Adaptive mixtures of local experts. Neural computation, 3(1):79–87, 1991.

[43]

S. Luo, S. Li, R. Yu, Z. Wang, J. Wu, and Q. Zhu. Pie: Parkour with implicit-explicit learning framework for legged robots. IEEE Robotics and Automation Letters, 9(11):9986–9993, 2024.

[44]

P. Li, H. Li, Y. Ma, L. Chang, X. Yang, R. Yu, Y. Zhang, Y. Cao, Q. Zhu, and G. Sartoretti. Kivi: Kinesthetic-visuospatial integration for dynamic and safe egocentric legged locomotion. arXiv preprint arXiv:2509.23650, 2025.

[45]

R. Yu, Q. Wang, H. Li, Z. Jun, Z. Wang, J. Wu, and Q. Zhu. Start: Traversing sparse footholds with terrain reconstruction. IEEE Robotics and Automation Letters, 11(2):2194–2201, 2025.

[46]

F. Zargarbashi, J. Cheng, D. Kang, R. Sumner, and S. Coros. Robotkeyframing: Learning locomotion with high-level objectives via mixture of dense and sparse rewards. In Conference on Robot Learning (CoRL), 2024.

[47]

T. Huang, J. Ren, H. Wang, Z. Wang, Q. Ben, M. Wen, X. Chen, J. Li, and J. Pang. Learning humanoid standing-up control across diverse postures. In Robotics: Science and Systems (RSS), 2025.

[48]

G. Cesa, L. Lang, and M. Weiler. A program to build e(n)-equivariant steerable cnns. In International Conference on Learning Representations (ICLR), 2022.

[49]

V. Makoviychuk, L. Wawrzyniak, Y. Guo, M. Lu, K. Storey, M. Macklin, D. Hoeller, N. Rudin, A. Allshire, A. Handa, et al. Isaac gym: High performance gpu-based physics simulation for robot learning. arXiv preprint arXiv:2108.10470, 2021.

[50]

M. Macklin. Warp: A high-performance python framework for gpu simulation and graphics. In NVIDIA GPU Technology Conference (GTC), volume 3, 2022.

[51]

Z. Fu, A. Kumar, J. Malik, and D. Pathak. Minimizing energy consumption leads to the emergence of gaits in legged robots. In Conference on Robot Learning (CoRL), 2021.

[52]

B. R. Whittington and D. G. Thelen. A simple mass-spring model with roller feet can induce the ground reactions observed in human walking. Journal of biomechanical engineering, 131(1):011013, 2009.

[53]

N. Rudin, D. Hoeller, P. Reist, and M. Hutter. Learning to walk in minutes using massively parallel deep reinforcement learning. In Conference on Robot Learning (CoRL), 2021.

[54]

N. Mahmood, N. Ghorbani, N. F. Troje, G. Pons-Moll, and M. J. Black. Amass: Archive of motion capture as surface shapes. In Proceedings of the IEEE/CVF international conference on computer vision, pages 5442–5451, 2019.

[55]

W. Xie, J. Han, J. Zheng, H. Li, X. Liu, J. Shi, W. Zhang, C. Bai, and X. Li. Kungfubot: Physics-based humanoid whole-body control for learning highly-dynamic skills. In D. Belgrave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen, editors, Advances in Neural Information Processing Systems, volume 38, pages 62406–62433. Curran Associ...
