Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching
Pith reviewed 2026-05-15 21:25 UTC · model grok-4.3
The pith
A humanoid robot chains retargeted human parkour skills into one depth-driven policy that autonomously chooses and executes climbs, vaults, or rolls over obstacles.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Retargeted human kinematic trajectories are composed into long-horizon motions through nearest-neighbor search in feature space, tracked by expert RL policies, and distilled into a unified depth-based multi-skill policy. With only onboard depth sensing and discrete 2D velocity commands, the policy selects and executes context-appropriate skills such as stepping over, climbing onto, vaulting, or rolling off obstacles of varying geometries and heights, enabling autonomous long-horizon parkour that adapts to real-time perturbations.
What carries the argument
Motion matching formulated as nearest-neighbor search in feature space, which composes atomic human skills into continuous kinematic trajectories for subsequent RL tracking and distillation into a single depth-based policy.
Load-bearing premise
Retargeted human motion data can be tracked by RL policies and successfully distilled into one depth-only policy that handles real-world changes in obstacle geometry without further tuning or sensing.
What would settle it
Deploy the policy on an obstacle whose height or shape lies outside the range used in training and check whether the robot still selects and completes an appropriate skill without falling or requiring manual retuning.
Original abstract
While recent advances in humanoid locomotion have achieved stable walking on varied terrains, capturing the agility and adaptivity of highly dynamic human motions remains an open challenge. In particular, agile parkour in complex environments demands not only low-level robustness, but also human-like motion expressiveness, long-horizon skill composition, and perception-driven decision-making. In this paper, we present Perceptive Humanoid Parkour (PHP), a modular framework that enables humanoid robots to autonomously perform long-horizon, vision-based parkour across challenging obstacle courses. Our approach first leverages motion matching, formulated as nearest-neighbor search in a feature space, to compose retargeted atomic human skills into long-horizon kinematic trajectories. This framework enables the flexible composition and smooth transition of complex skill chains while preserving the elegance and fluidity of dynamic human motions. Next, we train motion-tracking reinforcement learning (RL) expert policies for these composed motions, and distill them into a single depth-based, multi-skill student policy, using a combination of DAgger and RL. Crucially, the combination of perception and skill composition enables autonomous, context-aware decision-making: using only onboard depth sensing and a discrete 2D velocity command, the robot selects and executes whether to step over, climb onto, vault or roll off obstacles of varying geometries and heights. We validate our framework with extensive real-world experiments on a Unitree G1 humanoid robot, demonstrating highly dynamic parkour skills such as climbing tall obstacles up to 1.25m (96% robot height), as well as long-horizon multi-obstacle traversal with closed-loop adaptation to real-time obstacle perturbations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Perceptive Humanoid Parkour (PHP), a modular framework for humanoid robots to perform long-horizon, vision-based parkour. It uses motion matching via nearest-neighbor search in feature space to compose retargeted human skills into kinematic trajectories, trains RL expert policies to track them, and distills the experts into a single depth-image-conditioned student policy via DAgger+RL. The student policy, given only onboard depth sensing and a discrete 2D velocity command, autonomously selects and executes skills such as step-over, climb, vault, or roll-off for varying obstacle geometries. The framework is validated through real-world experiments on a Unitree G1 humanoid, including climbs up to 1.25 m and closed-loop adaptation to perturbations.
Significance. If the central claims hold, the work would represent a meaningful advance in agile humanoid locomotion by demonstrating perception-driven composition of dynamic human skills without hand-crafted controllers or additional sensing. The combination of motion-matching composition with policy distillation offers a scalable path toward long-horizon behaviors, and successful real-world transfer on a commercial platform would strengthen evidence for sim-to-real methods in high-dynamics settings.
major comments (2)
- [Abstract] Abstract: the claim of 'extensive real-world experiments' demonstrating 'highly dynamic parkour skills' and 'closed-loop adaptation to real-time obstacle perturbations' is not accompanied by any quantitative metrics, success rates, failure modes, baselines, or ablation studies. This leaves the central claim of reliable autonomous skill selection under depth-only sensing only partially supported.
- [Method (distillation subsection)] The distillation step (DAgger+RL from multiple RL experts into a single depth-based student) is load-bearing for the autonomous decision-making claim, yet the manuscript provides no analysis of whether depth observations alone suffice to recover both geometry classification and precise timing for transitions without mode collapse or loss of dynamic fidelity on perturbed geometries.
minor comments (2)
- [Method] Notation for the motion-matching feature space and nearest-neighbor distance metric should be defined explicitly with equations rather than described in prose.
- [Experiments] The abstract mentions '96% robot height' for the 1.25 m climb; the corresponding robot height and exact obstacle dimensions should be stated consistently in the experiments section.
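As a quick consistency check on the two figures quoted in the abstract, the implied robot height is the obstacle height divided by the stated ratio. The sketch below uses only those two numbers. The function name is illustrative, and the result is an implied value, not a measured specification of the robot.

```python
def implied_robot_height(obstacle_height_m: float, ratio: float) -> float:
    """Return the robot height implied by an obstacle height and a height ratio."""
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    return obstacle_height_m / ratio


# Figures quoted in the abstract: a 1.25 m climb described as 96% of robot height.
height = implied_robot_height(1.25, 0.96)
print(f"implied robot height: {height:.3f} m")
```

Comparing this implied value with the robot height reported in the experiments section would show whether the two numbers are stated consistently.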
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below, clarifying the experimental support in the manuscript while committing to revisions that strengthen the presentation of quantitative evidence and analysis.
-
Referee: [Abstract] Abstract: the claim of 'extensive real-world experiments' demonstrating 'highly dynamic parkour skills' and 'closed-loop adaptation to real-time obstacle perturbations' is not accompanied by any quantitative metrics, success rates, failure modes, baselines, or ablation studies. This leaves the central claim of reliable autonomous skill selection under depth-only sensing only partially supported.
Authors: We agree that the abstract would be strengthened by explicit quantitative metrics. The manuscript body (Section 5 and supplementary material) presents results from extensive real-world trials on the Unitree G1, including success across multiple obstacle types up to 1.25 m, long-horizon traversals, and adaptation to perturbations, with comparisons to non-perceptive baselines. We will revise the abstract to incorporate key metrics such as overall success rates, adaptation performance, and references to the ablation studies and failure mode analysis already present in the main text. revision: yes
-
Referee: [Method (distillation subsection)] The distillation step (DAgger+RL from multiple RL experts into a single depth-based student) is load-bearing for the autonomous decision-making claim, yet the manuscript provides no analysis of whether depth observations alone suffice to recover both geometry classification and precise timing for transitions without mode collapse or loss of dynamic fidelity on perturbed geometries.
Authors: The distillation subsection describes the DAgger+RL procedure that produces a single depth-conditioned student policy from the expert set. Real-world experiments demonstrate that this policy achieves autonomous skill selection and closed-loop adaptation to perturbations on varied geometries without evident mode collapse or loss of dynamic fidelity. We acknowledge that an explicit analysis of depth sufficiency for geometry classification and transition timing would provide additional rigor. We will add a targeted discussion in the revised method section examining the policy's observed behavior under depth-only input, drawing on the experimental outcomes. revision: yes
Circularity Check
No significant circularity in the derivation chain
The paper describes a modular pipeline that first applies motion matching (nearest-neighbor search in feature space) to compose retargeted human skills into kinematic trajectories, then trains separate RL expert policies to track those trajectories, and finally distills the experts into one depth-conditioned student policy via DAgger plus RL. None of these steps reduce by construction to quantities defined inside the paper; motion matching is a standard external technique, the RL tracking objective is independent of the final student policy, and the distillation step is a conventional supervised transfer process whose success is measured by external real-world experiments on the Unitree G1 rather than by internal re-use of fitted parameters. No self-citations are invoked to establish uniqueness or to smuggle in ansatzes, and the central claim of perception-driven skill selection rests on the empirical behavior of the trained policy rather than on any definitional equivalence.
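The distillation step mentioned above can be illustrated with a toy DAgger loop. This is a minimal sketch under stated assumptions, not the paper's training code: the 1-D dynamics, the linear expert, and every function name are hypothetical, and the RL term that the paper combines with DAgger is omitted.

```python
import random


def expert_action(state: float) -> float:
    """Hypothetical privileged expert: a fixed linear controller."""
    return -2.0 * state


def fit_linear(states, actions) -> float:
    """Least-squares fit of action = w * state (no bias term)."""
    num = sum(s * a for s, a in zip(states, actions))
    den = sum(s * s for s in states)
    return num / den if den > 0.0 else 0.0


def rollout(weight: float, rng: random.Random, steps: int = 20):
    """Run the student policy in a toy 1-D system and return visited states."""
    state = rng.uniform(-1.0, 1.0)
    visited = []
    for _ in range(steps):
        visited.append(state)
        action = weight * state
        state = state + 0.1 * action + rng.gauss(0.0, 0.01)
    return visited


def dagger(iterations: int = 5, seed: int = 0) -> float:
    rng = random.Random(seed)
    states, actions = [], []
    weight = 0.0  # untrained student
    for _ in range(iterations):
        visited = rollout(weight, rng)                      # student picks the states
        states.extend(visited)                              # aggregate the dataset
        actions.extend(expert_action(s) for s in visited)   # expert relabels them
        weight = fit_linear(states, actions)                # supervised refit
    return weight


student_weight = dagger()
print(f"student weight: {student_weight:.3f}")
```

The point of the loop is that the student, not the expert, decides which states enter the dataset, so the expert's labels cover the states the student actually reaches.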
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Retargeted human motions preserve sufficient dynamic properties for stable robot execution
Forward citations
Cited by 1 Pith paper
-
Learning Versatile Humanoid Manipulation with Touch Dreaming
HTD, a multimodal transformer policy trained with behavioral cloning and touch dreaming to predict future tactile latents, achieves a 90.9% relative success rate improvement over baselines on five real-world contact-r...
Reference graph
Works this paper leans on
-
[1]
Ananye Agarwal, Ashish Kumar, Jitendra Malik, and Deepak Pathak. Legged locomotion in challenging terrains using egocentric vision. In Conference on robot learning, pages 403–415. PMLR, 2023
-
[2]
Qingwei Ben, Botian Xu, Kailin Li, Feiyu Jia, Wentao Zhang, Jingping Wang, Jingbo Wang, Dahua Lin, and Jiangmiao Pang. Gallant: Voxel grid-based humanoid locomotion and local-navigation across 3d constrained terrains, 2025. URL https://arxiv.org/abs/2511.14625
-
[3]
Kevin Bergamin, Simon Clavet, Daniel Holden, and James Richard Forbes. Drecon: data-driven responsive control of physics-based characters. ACM Transactions On Graphics (TOG), 38(6):1–11, 2019
-
[4]
David Bollo. Inertialization: High-performance animation transitions in Gears of War. Proc. of GDC, 2018
-
[5]
Michael Büttner and Simon Clavet. Motion matching - the road to next gen animation. Proc. of Nucl.ai, 2015
-
[6]
Ken Caluwaerts, Atil Iscen, J Chase Kew, Wenhao Yu, Tingnan Zhang, Daniel Freeman, Kuang-Huei Lee, Lisa Lee, Stefano Saliceti, Vincent Zhuang, et al. Barkour: Benchmarking animal-level agility with quadruped robots. arXiv preprint arXiv:2305.14654, 2023
-
[7]
Zixuan Chen, Mazeyu Ji, Xuxin Cheng, Xuanbin Peng, Xue Bin Peng, and Xiaolong Wang. Gmt: General motion tracking for humanoid whole-body control. arXiv preprint arXiv:2506.14770, 2025
-
[8]
Xuxin Cheng, Kexin Shi, Ananye Agarwal, and Deepak Pathak. Extreme parkour with legged robots. In 2024 IEEE International Conference on Robotics and Automation (ICRA), pages 11443–11450. IEEE, 2024
-
[9]
Simon Clavet. Motion matching and the road to next-gen animation. Proc. of GDC, 2016
-
[10]
Zipeng Fu, Qingqing Zhao, Qi Wu, Gordon Wetzstein, and Chelsea Finn. Humanplus: Humanoid shadowing and imitation from humans. arXiv preprint arXiv:2406.10454, 2024
-
[11]
Ruiyu Gou, Michiel van de Panne, and Daniel Holden. Control operators for interactive character animation. ACM Transactions on Graphics (TOG), 2025
-
[12]
Junzhe He, Chong Zhang, Fabian Jenelten, Ruben Grandia, Moritz Bächer, and Marco Hutter. Attention-based map encoding for learning generalized legged locomotion. Science Robotics, 10(105):eadv3604, 2025
-
[13]
David Hoeller, Nikita Rudin, Dhionis Sako, and Marco Hutter. Anymal parkour: Learning agile navigation for quadrupedal robots. Science Robotics, 9(88):eadi7566, 2024
-
[14]
Daniel Holden, Anas Kanoun, Michiel Bŭttner, Sofien Bouaziz, Sebastian Thrun, and Aaron Hertzmann. Learned motion matching. ACM Transactions on Graphics (TOG), 2020
-
[15]
Xiaoyu Huang, Takara Truong, Yunbo Zhang, Fangzhou Yu, Jean Pierre Sleiman, Jessica Hodgins, Koushil Sreenath, and Farbod Farshidian. Diffuse-cloc: Guided diffusion for physics-based character look-ahead control. arXiv preprint arXiv:2503.11801, 2025
-
[16]
Dvij Kalaria, Sudarshan S Harithas, Pushkal Katara, Sangkyung Kwak, Sarthak Bhagat, Shankar Sastry, Srinath Sridhar, Sai Vemprala, Ashish Kapoor, and Jonathan Chung-Kuan Huang. Dreamcontrol: Human-inspired whole-body humanoid control for scene interaction via guided diffusion. arXiv preprint arXiv:2509.14353, 2025
-
[17]
Dongho Kang, Simon Zimmermann, and Stelian Coros. Animal gaits on quadrupedal robots using motion matching and model-based control. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 8500–8507. IEEE, 2021
-
[18]
Ashish Kumar, Zipeng Fu, Deepak Pathak, and Jitendra Malik. Rma: Rapid motor adaptation for legged robots. arXiv preprint arXiv:2107.04034, 2021
-
[19]
Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun, and Marco Hutter. Learning quadrupedal locomotion over challenging terrain. Science robotics, 5(47):eabc5986, 2020
-
[20]
Qiayuan Liao, Takara E Truong, Xiaoyu Huang, Yuman Gao, Guy Tevet, Koushil Sreenath, and C Karen Liu. Beyondmimic: From motion tracking to versatile humanoid control via guided diffusion. arXiv preprint arXiv:2508.08241, 2025
-
[21]
Junfeng Long, Zirui Wang, Quanyi Li, Jiawei Gao, Liu Cao, and Jiangmiao Pang. Hybrid internal model: Learning agile legged locomotion with simulated robot response. arXiv preprint arXiv:2312.11460, 2023
-
[22]
Junfeng Long, Junli Ren, Moji Shi, Zirui Wang, Tao Huang, Ping Luo, and Jiangmiao Pang. Learning humanoid locomotion with perceptive internal model. In 2025 IEEE International Conference on Robotics and Automation (ICRA), pages 9997–10003. IEEE, 2025
-
[23]
Shixin Luo, Songbo Li, Ruiqi Yu, Zhicheng Wang, Jun Wu, and Qiuguo Zhu. Pie: Parkour with implicit-explicit learning framework for legged robots. IEEE Robotics and Automation Letters, 2024
-
[24]
Zhengyi Luo, Ye Yuan, Tingwu Wang, Chenran Li, Sirui Chen, Fernando Castañeda, Zi-Ang Cao, Jiefeng Li, David Minor, Qingwei Ben, et al. Sonic: Supersizing motion tracking for natural humanoid whole-body control. arXiv preprint arXiv:2511.07820, 2025
-
[25]
Miles Macklin. Warp: A high-performance python framework for gpu simulation and graphics. https://github.com/nvidia/warp, March 2022. NVIDIA GPU Technology Conference (GTC)
-
[26]
Takahiro Miki, Joonho Lee, Jemin Hwangbo, Lorenz Wellhausen, Vladlen Koltun, and Marco Hutter. Learning robust perceptive locomotion for quadrupedal robots in the wild. Science robotics, 7(62):eabk2822, 2022
-
[27]
Mayank Mittal, Pascal Roth, James Tigue, Antoine Richard, Octi Zhang, Peter Du, Antonio Serrano-Muñoz, Xinjie Yao, René Zurbrügg, Nikita Rudin, et al. Isaac lab: A gpu-accelerated simulation framework for multi-modal robot learning. arXiv preprint arXiv:2511.04831, 2025
-
[28]
I Nahrendra, Byeongho Yu, and Hyun Myung. Dreamwaq: Learning robust quadrupedal locomotion with implicit terrain imagination via deep reinforcement learning. arXiv preprint arXiv:2301.10602, 2023
-
[29]
Yixuan Pan, Ruoyi Qiao, Li Chen, Kashyap Chitta, Liang Pan, Haoguang Mai, Qingwen Bu, Hao Zhao, Cunyuan Zheng, Ping Luo, et al. Agility meets stability: Versatile humanoid control with heterogeneous data. arXiv preprint arXiv:2511.17373, 2025
-
[30]
Xue Bin Peng, Pieter Abbeel, Sergey Levine, and Michiel Van de Panne. Deepmimic: Example-guided deep reinforcement learning of physics-based character skills. ACM Transactions On Graphics (TOG), 37(4):1–14, 2018
-
[31]
Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, and Angjoo Kanazawa. Amp: Adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics (TOG), 2021
-
[32]
Stéphane Ross, Geoffrey Gordon, and Drew Bagnell. A reduction of imitation learning and structured prediction to no-regret online learning. In Proceedings of the fourteenth international conference on artificial intelligence and statistics, pages 627–635. JMLR Workshop and Conference Proceedings, 2011
-
[33]
Nikita Rudin, Junzhe He, Joshua Aurand, and Marco Hutter. Parkour in the wild: Learning a general and extensible agile locomotion policy using multi-expert distillation and rl fine-tuning. arXiv preprint arXiv:2505.11164, 2025
-
[34]
Salgadopk. Learn parkour - climb up tutorial. URL https://youtu.be/6U1sIgqgPFo?si=339TPTxlFB5lWGB1
-
[35]
Jingkai Sun, Gang Han, Pihai Sun, Wen Zhao, Jiahang Cao, Jiaxu Wang, Yijie Guo, and Qiang Zhang. Dpl: Depth-only perceptive humanoid locomotion via realistic depth synthesis and cross-attention terrain reconstruction. arXiv preprint arXiv:2510.07152, 2025
-
[36]
Guy Tevet, Sigal Raab, Brian Gordon, Yoni Shafir, Daniel Cohen-or, and Amit Haim Bermano. Human motion diffusion model. In ICLR, 2023
-
[37]
Huayi Wang, Zirui Wang, Junli Ren, Qingwei Ben, Tao Huang, Weinan Zhang, and Jiangmiao Pang. Beamdojo: Learning agile humanoid locomotion on sparse footholds. In Robotics: Science and Systems (RSS), 2025
-
[38]
Huayi Wang, Wentao Zhang, Runyi Yu, Tao Huang, Junli Ren, Feiyu Jia, Zirui Wang, Xiaojie Niu, Xiao Chen, Jiahe Chen, et al. Physhsi: Towards a real-world generalizable and natural humanoid-scene interaction system. arXiv preprint arXiv:2510.11072, 2025
-
[39]
Jinze Wu, Guiyang Xin, Chenkun Qi, and Yufei Xue. Learning robust and agile legged locomotion using adversarial motion priors. IEEE Robotics and Automation Letters, 8(8):4975–4982, 2023
-
[40]
Weiji Xie, Jinrui Han, Jiakun Zheng, Huanyu Li, Xinzhe Liu, Jiyuan Shi, Weinan Zhang, Chenjia Bai, and Xuelong Li. Kungfubot: Physics-based humanoid whole-body control for learning highly-dynamic skills. arXiv preprint arXiv:2506.12851, 2025
-
[41]
Michael Xu, Yi Shi, KangKang Yin, and Xue Bin Peng. Parc: Physics-based augmentation with reinforcement learning for character controllers. In ACM SIGGRAPH, 2025
-
[42]
Pei Xu, Zhen Wu, Ruocheng Wang, Vishnu Sarukkai, Kayvon Fatahalian, Ioannis Karamouzas, Victor Zordan, and C Karen Liu. Learning to ball: Composing policies for long-horizon basketball moves. ACM Transactions on Graphics (TOG), 44(6):1–14, 2025
-
[43]
Lujie Yang, Xiaoyu Huang, Zhen Wu, Angjoo Kanazawa, Pieter Abbeel, Carmelo Sferrazza, C Karen Liu, Rocky Duan, and Guanya Shi. Omniretarget: Interaction-preserving data generation for humanoid whole-body loco-manipulation and scene interaction. arXiv preprint arXiv:2509.26633, 2025
-
[44]
Ruihan Yang, Ge Yang, and Xiaolong Wang. Neural volumetric memory for visual locomotion control. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1430–1440, 2023
-
[45]
Alan Yu, Ge Yang, Ran Choi, Yajvan Ravan, John Leonard, and Phillip Isola. Learning visual parkour from generated images. In 8th Annual Conference on Robot Learning, 2024
-
[46]
Ruiqi Yu, Qianshi Wang, Yizhen Wang, Zhicheng Wang, Jun Wu, and Qiuguo Zhu. Walking with terrain reconstruction: Learning to traverse risky sparse footholds. arXiv preprint arXiv:2409.15692, 2024
-
[47]
Yanjie Ze, Siheng Zhao, Weizhuo Wang, Angjoo Kanazawa, Rocky Duan, Pieter Abbeel, Guanya Shi, Jiajun Wu, and C Karen Liu. Twist2: Scalable, portable, and holistic humanoid data collection system. arXiv preprint arXiv:2511.02832, 2025
-
[48]
Tong Zhang, Boyuan Zheng, Ruiqian Nai, Yingdong Hu, Yen-Jen Wang, Geng Chen, Fanqi Lin, Jiongye Li, Chuye Hong, Koushil Sreenath, et al. Hub: Learning extreme humanoid balance. CoRL, 2025
-
[49]
Ziyu Zhang, Sergey Bashkirov, Dun Yang, Michael Taylor, and Xue Bin Peng. Add: Physics-based motion imitation with adversarial differential discriminators. arXiv preprint arXiv:2505.04961, 2025
-
[50]
Shaoting Zhu, Ziwen Zhuang, Mengjie Zhao, Kun-Ying Lee, and Hang Zhao. Hiking in the wild: A scalable perceptive parkour framework for humanoids. arXiv preprint arXiv:2601.07718, 2026
-
[51]
Ziwen Zhuang, Zipeng Fu, Jianren Wang, Christopher Atkeson, Soeren Schwertfeger, Chelsea Finn, and Hang Zhao. Robot parkour learning. arXiv preprint arXiv:2309.05665, 2023
-
[52]
Ziwen Zhuang, Shenzhe Yao, and Hang Zhao. Humanoid parkour learning. arXiv preprint arXiv:2406.10759, 2024
APPENDIX
A. Motion Matching Implementation Details
This section provides implementation details for the motion matching procedure used to synthesize long-horizon parkour reference trajectories.
-
[53]
Motion Database and Feature Precomputation: All motion clips are first retargeted to a 29-DOF Unitree G1 humanoid using OmniRetarget [43] and represented as frame sequences. At each frame $i$, we store the robot configuration $q_i = (p_i, r_i, \theta_i)$, consisting of the root translation $p_i \in \mathbb{R}^3$, root quaternion $r_i \in \mathbb{R}^4$, and joint angles $\theta_i \in \mathbb{R}^{29}$. For each frame,...
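The per-frame configuration described above (root translation, root quaternion, and 29 joint angles) could be stored along the following lines. This is a minimal sketch: the class name, field names, quaternion ordering, and the example root height are all assumptions for illustration, and only the dimensions 3, 4, and 29 come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

NUM_JOINTS = 29  # joint count stated in the text


@dataclass
class FrameConfig:
    """Hypothetical container for one retargeted motion frame."""
    root_translation: Tuple[float, float, float]         # p_i in R^3
    root_quaternion: Tuple[float, float, float, float]   # r_i in R^4 (ordering assumed)
    joint_angles: List[float]                             # theta_i in R^29

    def __post_init__(self) -> None:
        if len(self.root_translation) != 3:
            raise ValueError("root translation must have 3 entries")
        if len(self.root_quaternion) != 4:
            raise ValueError("root quaternion must have 4 entries")
        if len(self.joint_angles) != NUM_JOINTS:
            raise ValueError(f"expected {NUM_JOINTS} joint angles")

    def as_vector(self) -> List[float]:
        """Flatten (p_i, r_i, theta_i) into one list of 3 + 4 + 29 = 36 numbers."""
        return [*self.root_translation, *self.root_quaternion, *self.joint_angles]


# Example frame with an identity rotation and all joints at zero (made-up values).
frame = FrameConfig((0.0, 0.0, 0.75), (1.0, 0.0, 0.0, 0.0), [0.0] * NUM_JOINTS)
vector = frame.as_vector()
print(len(vector))
```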
-
[54]
Query Feature Construction: At runtime, a query feature $\hat{x}_t$ is constructed from the current robot configuration $q_t$ and a 2D velocity command. We first extract the kinematic features from $q_t$ to form the pose-based part of the query, namely the local foot state $\hat{f}_t$ and the root velocity $\hat{h}_t$. We then compute the short-horizon future root trajectory from the 2...
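Once a query feature exists, motion matching reduces to finding the database frame whose precomputed feature is closest to it. The sketch below shows that lookup as a brute-force search. The weighted squared Euclidean cost, the per-dimension weights, and the toy 3-D features are assumptions; this excerpt does not specify the distance metric or the feature layout.

```python
from typing import Sequence


def nearest_frame(database: Sequence[Sequence[float]],
                  query: Sequence[float],
                  weights: Sequence[float]) -> int:
    """Return the index of the database feature closest to the query."""
    best_index, best_cost = -1, float("inf")
    for index, feature in enumerate(database):
        # Weighted squared Euclidean distance (an assumed metric).
        cost = sum(w * (f - q) ** 2 for w, f, q in zip(weights, feature, query))
        if cost < best_cost:
            best_index, best_cost = index, cost
    return best_index


# Toy 3-D features standing in for (foot state, root velocity, future trajectory).
database = [
    [0.0, 1.0, 0.0],
    [0.5, 2.0, 0.1],
    [1.0, 3.0, 0.4],
]
query = [0.4, 2.2, 0.0]
weights = [1.0, 1.0, 1.0]
best = nearest_frame(database, query, weights)
print(best)
```

A real database would likely need a spatial index or batched computation for speed; the linear scan is used here only because it makes the matching rule explicit.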
-
[55]
Transition Smoothing via Inertialization: To ensure smooth transitions when switching the playback index to a newly retrieved frame, we adopt inertialization [4]. The key idea is to compute an offset between the currently playing motion and the target motion at the transition instant, apply this offset after switching so the output remains continuous, and ...
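The offset idea described above can be sketched for a single scalar channel such as one joint angle. This is an illustration under stated assumptions: a simple exponential decay with a half-life is swapped in for the decay curve, which this excerpt does not specify and which may differ in the paper and in [4]. All function names and numbers are hypothetical.

```python
import math
from typing import List


def decayed_offset(offset: float, elapsed: float, half_life: float) -> float:
    """Exponentially decay an offset so it halves every `half_life` seconds."""
    return offset * math.exp(-math.log(2.0) * elapsed / half_life)


def blend_transition(current_value: float,
                     target_clip: List[float],
                     dt: float,
                     half_life: float) -> List[float]:
    """Play `target_clip` while fading out the jump from `current_value`."""
    offset = current_value - target_clip[0]  # discontinuity at the switch
    output = []
    for step, target_value in enumerate(target_clip):
        output.append(target_value + decayed_offset(offset, step * dt, half_life))
    return output


# One joint angle: the old clip was at 0.30 rad, the new clip starts at 0.10 rad.
target = [0.10, 0.12, 0.14, 0.16, 0.18]
blended = blend_transition(0.30, target, dt=0.02, half_life=0.02)
print([round(value, 4) for value in blended])
```

The first output sample equals the value that was playing before the switch, so the signal stays continuous, and later samples converge to the target clip as the offset fades.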
-
[56]
Skill List: Our motion library includes locomotion and a set of atomic parkour skills. Locomotion provides a shared transition manifold and includes standing, walking, and running motions spanning commanded speeds from 0.8 to 3.5 m/s. Most parkour skills are instantiated at 1.0 m/s and 2.0 m/s. We additionally include a single 3.0 m/s cat-vault skill to ...
-
[57]
Motion Tracking Details: Specific reward formulations and domain randomization settings used for expert policy learning from [20] are summarized in Table IV and Table V for reference.
-
[58]
Distillation Details: During student training, we relax the termination conditions relative to the expert to prevent premature termination of valid but mirrored executions. While this improves PPO stability, the student may visit states that are out-of-distribution for the expert policies, which were trained under the original termination thresholds and ma...
-
[59]
Training Hyperparameters: We include all hyperparameters for two-stage training in Table VI for reference.
C. Details for Baselines
-
[60]
Velocity Tracking Baseline: To show the importance of human reference motion in our framework, we include a standard reward-shaping velocity-tracking baseline that learns locomotion purely from handcrafted rewards and a terrain curriculum, without any motion imitation or human reference trajectories.
Skill                       Duration (s)
Locomotion
  Locomotion                495.5
Parkour skills @ 1.0 m/s
  Step (36 cm)              2.2
  Climb (58 cm)             12.1
  Climb (76 cm)             8.8
  Clim...
-
[61]
AMP Baseline: Since AMP [31] is a popular algorithm for chaining skills with human reference data, we also implemented an AMP baseline by following the MimicKit AMP implementation released by the original AMP authors. In our experiments, this baseline can walk stably and track the commanded velocity, but it does not perform well on obstacle traversal: i...