Long-Distance Real-World Navigation of the Legged-Wheeled Robot Go2-W Using Deep Reinforcement Learning

Kiyoshi Irie; Masahiro Tomono; Takaaki Matsuzawa; Taro Suzuki; Tomoaki Yoshida; Yoshitaka Hara

arxiv: 2606.21387 · v1 · pith:24GNHKIOnew · submitted 2026-06-19 · 💻 cs.RO

Takaaki Matsuzawa, Kiyoshi Irie, Tomoaki Yoshida, Taro Suzuki, Yoshitaka Hara, Masahiro Tomono

Pith reviewed 2026-06-26 13:55 UTC · model grok-4.3

classification 💻 cs.RO

keywords legged-wheeled robot · deep reinforcement learning · locomotion control · autonomous navigation · real-world deployment · load distribution · Go2-W

The pith

A proprioception-only deep reinforcement learning policy extended to the 16-DoF Go2-W robot with load-distribution training enables 2.8 km autonomous real-world navigation without overheating.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a locomotion controller first built for standard quadrupeds can be transferred to the legged-wheeled Go2-W by retraining it to spread motor load away from the hip joints. Wheeled motion otherwise concentrates heat at those joints and halts long runs; the added training objective removes that bottleneck. The resulting controller was run on a 2.8 km outdoor course containing sidewalks, a park, and stairs at the Tsukuba Challenge 2025 and completed the route without thermal shutdown. A reader would care because the result turns a hybrid platform into a practical long-range autonomous vehicle using only onboard sensing.
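The heat-concentration argument can be made concrete with a minimal sketch, assuming a lumped first-order thermal model in which heating scales with the square of joint torque. The model itself, the function name `simulate_temperature`, and every constant below are illustrative assumptions, not methods or values taken from the paper.

```python
def simulate_temperature(torques, dt=1.0, ambient=25.0,
                         heat_gain=0.002, cooling_rate=0.01):
    """Integrate a first-order thermal model with forward Euler.

    Assumed model (not from the paper):
        dT/dt = heat_gain * torque**2 - cooling_rate * (T - ambient)
    """
    temperature = ambient
    history = []
    for torque in torques:
        heating = heat_gain * torque ** 2
        cooling = cooling_rate * (temperature - ambient)
        temperature += dt * (heating - cooling)
        history.append(temperature)
    return history


# Hypothetical comparison: the same total torque (12 N*m) carried by one
# joint versus shared equally by three joints, over 600 one-second steps.
steps = 600
concentrated = simulate_temperature([12.0] * steps)
distributed = simulate_temperature([4.0] * steps)

print(f"concentrated peak: {max(concentrated):.1f} C")
print(f"distributed peak:  {max(distributed):.1f} C")
```

Under this assumed quadratic heating law, splitting the load three ways cuts each joint's temperature rise by a factor of nine, which is the intuition behind treating load distribution as a thermal remedy. Real joints differ in moment arms and cooling, so the equal split is a simplification.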

Core claim

The paper establishes that extending a proprioception-only DRL policy to the 16-DoF legged-wheeled Go2-W and training it to distribute load produces stable locomotion that suppresses hip-joint heat concentration, allowing sustained autonomous traversal of a 2.8 km real-world route that includes sidewalks, a park, and stairs without stopping due to overheating.

What carries the argument

The extended proprioception-only DRL policy augmented with load-distribution training on the Go2-W platform, which balances actuator effort to prevent localized overheating during wheeled segments.
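The abstract does not give the form of the load-distribution objective, so the following is only a sketch of what such a term might look like: a penalty on squared torque utilization, with hip joints weighted more heavily. The joint ordering, `HIP_INDICES`, `hip_weight`, `penalty_scale`, and both function names are hypothetical.

```python
# Hypothetical layout: 4 legs x [hip, thigh, calf, wheel] = 16 actuators.
HIP_INDICES = (0, 4, 8, 12)


def load_distribution_penalty(torques, torque_limits,
                              hip_indices=HIP_INDICES, hip_weight=2.0):
    """Return a non-negative penalty that grows when load piles onto few joints.

    Each torque is normalized by its limit; the penalty is the mean of the
    weighted squared utilizations, with hip joints weighted by hip_weight.
    """
    if len(torques) != len(torque_limits):
        raise ValueError("torques and torque_limits must have the same length")
    total = 0.0
    for i, (tau, limit) in enumerate(zip(torques, torque_limits)):
        utilization = tau / limit
        weight = hip_weight if i in hip_indices else 1.0
        total += weight * utilization ** 2
    return total / len(torques)


def shaped_reward(task_reward, torques, torque_limits, penalty_scale=1.0):
    """Subtract the scaled penalty from a task reward (illustrative shaping)."""
    return task_reward - penalty_scale * load_distribution_penalty(
        torques, torque_limits)


limits = [20.0] * 16
# Same summed torque (32 N*m), carried by hips only versus spread evenly.
hips_only = [8.0 if i in HIP_INDICES else 0.0 for i in range(16)]
spread = [2.0] * 16

print(load_distribution_penalty(hips_only, limits))
print(load_distribution_penalty(spread, limits))
```

Because the penalty is quadratic, the evenly spread torque vector scores lower than the hip-only one even though the summed torque is identical, so a policy trained against it is nudged toward sharing effort across actuators.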

If this is right

Hybrid legged-wheeled robots become capable of multi-kilometer autonomous missions on mixed outdoor terrain without external cooling or frequent stops.
Proprioception alone suffices for reliable long-duration control once load distribution is included in training.
The same policy transfer approach can be applied to other commercial legged-wheeled platforms to shorten development time for real-world navigation.
Thermal limits shift from hardware redesign to software objectives that can be optimized during training.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar load-balancing objectives could be added to policies for other hybrid robots to extend their operational range.
The method may allow legged-wheeled systems to replace wheeled robots on tasks that occasionally require stair climbing without sacrificing flat-ground efficiency.
Integration with global planners could turn the demonstrated local controller into a city-scale autonomous delivery or inspection system.

Load-bearing premise

That the proprioception-only policy developed for quadrupeds can be extended to the Go2-W and, when retrained for load distribution, will remain stable and thermally safe over multi-kilometer real-world distances.

What would settle it

A single continuous run on the same 2.8 km Tsukuba Challenge route in which the Go2-W stops from hip-joint overheating despite the load-distribution training.

Original abstract

Legged-wheeled robots have long been studied for their potential to combine the efficient flat-ground mobility of wheels with the rough-terrain capability of legs. However, examples of their application to long-range autonomous navigation in real environments remain limited. This paper reports our effort to build a deep reinforcement learning (DRL) based locomotion controller and an autonomous navigation system for the commercially available legged-wheeled robot Go2-W, and to apply them to long-range autonomous navigation in a real environment. For locomotion control, we extended a proprioception-only policy, which we had previously developed for quadruped robots, to the 16-DoF legged-wheeled robot. We also found that wheeled locomotion concentrates the load on the hip joints and causes heat concentration that hinders sustained travel, and obtained a policy that suppresses it by distributing the load. We evaluated the system at the Tsukuba Challenge 2025, demonstrating that it can autonomously traverse an approximately 2.8 km route including sidewalks, a park, and stairs without stopping due to overheating.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

A systems report of a 2.8 km real-world run on the Go2-W with DRL and load balancing for heat control.

The main thing to know is that this paper describes a successful 2.8 km autonomous traversal by the Go2-W legged-wheeled robot in a real outdoor setting, using deep reinforcement learning with an added heat management trick.

They extended their earlier proprioception-only DRL policy, originally for quadrupeds, to this 16-DoF platform. The addition is a training objective that distributes load to prevent heat concentration in the hip joints when using the wheels. The evaluation comes from the Tsukuba Challenge 2025, where the robot handled sidewalks, a park, and stairs without stopping for overheating.

This is solid as a systems demonstration. It shows the policy can transfer and that the load term addresses a real platform issue in practice.

The soft spots are the missing pieces that would let a reader judge the contribution more precisely. There are no quantitative results on locomotion quality, no ablations on the new objective, and no discussion of failure modes or how often the system needed intervention. The soundness rests on one long successful run.

This paper is for people in field robotics who are building or testing hybrid locomotion controllers for long-range tasks. It provides a concrete example of what works on commercial hardware in an actual competition setting.

I would recommend sending it to peer review. The real-world outcome is worth documenting, and the description is clear enough for others to follow the approach even if they want more data in a revision.

Referee Report

2 major / 0 minor

Summary. The paper reports the extension of a prior proprioception-only DRL locomotion policy from quadruped robots to the 16-DoF legged-wheeled Go2-W platform. It identifies hip-joint heat concentration during wheeled locomotion as a platform-specific issue and describes obtaining a policy via load-distribution training to mitigate it. The central empirical result is a successful autonomous 2.8 km traversal at the Tsukuba Challenge 2025 across sidewalks, a park, and stairs without overheating-related stops.

Significance. If the deployment result holds under scrutiny, the work provides a concrete systems-level demonstration of long-range real-world navigation with a commercial legged-wheeled robot using DRL. The explicit identification and training-based mitigation of thermal load concentration on a hybrid platform is a practical contribution. The use of a public competition as the evaluation venue supplies a falsifiable, high-stakes test of sustained operation.

major comments (2)

[Evaluation at Tsukuba Challenge 2025] The manuscript supplies no quantitative metrics (traversal time, average speed, joint-temperature time series, or power-consumption data), ablation results, or failure-mode analysis to support the claim of 2.8 km autonomous navigation without overheating stoppages. This information is load-bearing for the central empirical assertion.
[Locomotion control description] The load-distribution training procedure is described only in a single sentence; no modified reward terms, training hyperparameters, simulation setup, or comparison against the baseline policy are provided. These details are required to evaluate how the heat-mitigation claim was achieved and whether the extension to 16 DoF is reproducible.
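The metrics requested in the first comment could be derived from a simple run log. The sketch below assumes a hypothetical log format of timestamped per-joint temperatures; the function name `summarize_run`, the joint names, and all sample values are made up for illustration and are not data from the paper.

```python
def summarize_run(samples, route_length_m, temp_limit_c):
    """Summarize a run log.

    samples: list of (time_s, joint_temps_c) tuples in time order, where
             joint_temps_c maps a joint name to a temperature in Celsius.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    duration_s = samples[-1][0] - samples[0][0]
    if duration_s <= 0:
        raise ValueError("timestamps must increase")
    peak_joint, peak_temp = None, float("-inf")
    for _, temps in samples:
        for joint, temp in temps.items():
            if temp > peak_temp:
                peak_joint, peak_temp = joint, temp
    return {
        "duration_s": duration_s,
        "average_speed_mps": route_length_m / duration_s,
        "peak_joint": peak_joint,
        "peak_temp_c": peak_temp,
        "exceeded_limit": peak_temp > temp_limit_c,
    }


# Synthetic toy log (not measurements from the paper).
example_log = [
    (0.0, {"front_left_hip": 30.0, "front_left_wheel": 29.0}),
    (50.0, {"front_left_hip": 41.5, "front_left_wheel": 33.0}),
    (100.0, {"front_left_hip": 44.0, "front_left_wheel": 35.5}),
]
summary = summarize_run(example_log, route_length_m=100.0, temp_limit_c=80.0)
print(summary)
```

Reporting the peak joint alongside the peak temperature would let a reader check directly whether the hips remain the thermal bottleneck under the modified policy.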

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. The feedback identifies opportunities to strengthen the empirical support and methodological detail. We respond to each major comment below and will revise the manuscript accordingly.

Referee: [Evaluation at Tsukuba Challenge 2025] The manuscript supplies no quantitative metrics (traversal time, average speed, joint-temperature time series, or power-consumption data), ablation results, or failure-mode analysis to support the claim of 2.8 km autonomous navigation without overheating stoppages. This information is load-bearing for the central empirical assertion.

Authors: We agree that the current manuscript would be strengthened by additional quantitative metrics from the Tsukuba Challenge 2025 deployment. The primary claim rests on successful completion of the 2.8 km route without overheating-related stops. In revision we will incorporate logged data on traversal time, average speed, joint-temperature time series, and power consumption. We will also add a discussion of observed failure modes and any minor issues encountered during the event. Controlled ablations were outside the scope of the public competition evaluation, but we will include qualitative comparisons to the baseline policy where supporting simulation or field data exist.
Revision: yes

Referee: [Locomotion control description] The load-distribution training procedure is described only in a single sentence; no modified reward terms, training hyperparameters, simulation setup, or comparison against the baseline policy are provided. These details are required to evaluate how the heat-mitigation claim was achieved and whether the extension to 16 DoF is reproducible.

Authors: The load-distribution training procedure is presented concisely. We will expand the methods section in revision to specify the modified reward terms used to penalize hip-joint load concentration, the training hyperparameters, the simulation environment configuration for the 16-DoF Go2-W platform, and direct comparisons of joint-load and temperature metrics between the baseline and modified policies. These additions will improve reproducibility of the extension from prior quadruped work.
Revision: yes

Circularity Check

0 steps flagged

Empirical systems report with no derivation chain

The paper is a deployment report describing extension of a prior proprioception-only DRL policy to the Go2-W platform, addition of load-distribution training, and real-world evaluation over 2.8 km at Tsukuba Challenge 2025. No equations, predictions, or uniqueness theorems are invoked; the result is an empirical outcome rather than a derived claim. Self-citation of prior policy work is present but not load-bearing for any circular reduction. This matches the default non-circular case for systems papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract contains no mathematical derivations, fitted parameters, or postulated entities; the contribution is an engineering demonstration of an existing DRL method on new hardware.

Reference graph

Works this paper leans on

44 extracted references · 35 canonical work pages · 4 internal anchors

[1] Bjelonic, M., Klemm, V., Lee, J., Hutter, M.: A survey of Wheeled-Legged robots. In: Robotics in Natural Settings (CLAWAR 2022). Lecture Notes in Networks and Systems, vol. 530, pp. 83–94. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-15226-9_11
[2] Kimura, H., Nakano, E., Nonaka, Y.: Development of leg-wheel robot and cooperational motion of legs and wheels. Journal of the Robotics Society of Japan 10(4), 520–525 (1992) https://doi.org/10.7210/jrsj.10.520 (in Japanese)
[3] Endo, G., Hirose, S.: Study on Roller-Walker – improvement of locomotive efficiency of quadruped robots by passive wheels. Advanced Robotics 26(8-9), 969–988 (2012) https://doi.org/10.1163/156855312X633066
[4] Bjelonic, M., Grandia, R., Harley, O., Galliard, C., Zimmermann, S., Hutter, M.: Whole-body MPC and online gait sequence generation for wheeled-legged robots. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 8388–8395 (2021). https://doi.org/10.1109/IROS51168.2021.9636371
[5] Chamorro, S., Klemm, V., La Iglesia Valls, M., Pal, C., Siegwart, R.: Reinforcement learning for blind stair climbing with legged and wheeled-legged robots. In: Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 8081–8087 (2024). https://doi.org/10.1109/ICRA57147.2024.10610069
[6] Unitree Robotics: Unitree Go2-W Driving All Terrain. https://www.unitree.com/go2-w. Accessed: 2026-06-11 (2024)
[7] Ascento Robotics: Ascento – Secure Assets with Robotics and AI. https://www.ascento.ai. Accessed: 2026-06-11 (2024)
[8] Lee, J., Bjelonic, M., Reske, A., Wellhausen, L., Miki, T., Hutter, M.: Learning robust autonomous navigation and locomotion for wheeled-legged robots. Science Robotics 9(89), eadi9641 (2024) https://doi.org/10.1126/scirobotics.adi9641
[9] Irie, K., Yoshida, T., Matsuzawa, T., Suzuki, T., Hara, Y., Tomono, M.: Rough terrain navigation for a quadruped robot using deep reinforcement learning-based blind locomotion control and a stuck-escape strategy. Advanced Robotics 39(18), 1182–1198 (2025) https://doi.org/10.1080/01691864.2025.2561643
[10] Tan, J., Zhang, T., Coumans, E., Iscen, A., Bai, Y., Hafner, D., Bohez, S., Vanhoucke, V.: Sim-to-real: Learning agile locomotion for quadruped robots. In: Proceedings of Robotics: Science and Systems, Pittsburgh, Pennsylvania (2018). https://doi.org/10.15607/RSS.2018.XIV.010
[11] Hwangbo, J., Lee, J., Dosovitskiy, A., Bellicoso, D., Tsounis, V., Koltun, V., Hutter, M.: Learning agile and dynamic motor skills for legged robots. Science Robotics 4(26), eaau5872 (2019) https://doi.org/10.1126/scirobotics.aau5872
[12] Lee, J., Hwangbo, J., Wellhausen, L., Koltun, V., Hutter, M.: Learning quadrupedal locomotion over challenging terrain. Science Robotics 5(47), eabc5986 (2020) https://doi.org/10.1126/scirobotics.abc5986
[13] Kumar, A., Fu, Z., Pathak, D., Malik, J.: RMA: Rapid Motor Adaptation for Legged Robots. In: Proceedings of Robotics: Science and Systems, Virtual (2021). https://doi.org/10.15607/RSS.2021.XVII.011
[14] Miki, T., Lee, J., Hwangbo, J., Wellhausen, L., Koltun, V., Hutter, M.: Learning robust perceptive locomotion for quadrupedal robots in the wild. Science Robotics 7(62), eabk2822 (2022) https://doi.org/10.1126/scirobotics.abk2822
[15] Makoviychuk, V., Wawrzyniak, L., Guo, Y., Lu, M., Storey, K., Macklin, M., Hoeller, D., Rudin, N., Allshire, A., Handa, A., State, G.: Isaac gym: High performance GPU based physics simulation for robot learning. In: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, vol. 1 (2021). https://doi.org/10.48550/arXiv.2108.10470
[16] Rudin, N., Hoeller, D., Reist, P., Hutter, M.: Learning to walk in minutes using massively parallel deep reinforcement learning. In: Proceedings of the 5th Conference on Robot Learning, pp. 91–100 (2021). https://doi.org/10.48550/arXiv.2109.11978
[17] Zakka, K., Tabanpour, B., Liao, Q., Haiderbhai, M., Holt, S., Luo, J.Y., Allshire, A., Frey, E., Sreenath, K., Kahrs, L.A., Sferrazza, C., Tassa, Y., Abbeel, P.: MuJoCo Playground (2025). https://doi.org/10.48550/arXiv.2502.08844
[18] Xu, Z., Raj, A.H., Xiao, X., Stone, P.: Dexterous legged locomotion in confined 3D spaces with reinforcement learning. In: 2024 IEEE International Conference on Robotics and Automation (ICRA) (2024). https://doi.org/10.1109/ICRA57147.2024.10610668
[19] Luo, S., Li, S., Yu, R., Wang, Z., Wu, J., Zhu, Q.: PIE: Parkour with implicit-explicit learning framework for legged robots. IEEE Robotics and Automation Letters 9(11), 9986–9993 (2024) https://doi.org/10.1109/LRA.2024.3459797
[20] Xiao, E., Dong, Y., Lam, J., Lu, P.: Learning stable bipedal locomotion skills for quadrupedal robots on challenging terrains with automatic fall recovery. npj Robotics 3(22) (2025) https://doi.org/10.1038/s44182-025-00043-2
[21] Klemm, V., Morra, A., Salzmann, C., Tschopp, F., Bodie, K., Gulich, L., Küng, N., Mannhart, D., Pfister, C., Vierneisel, M., Weber, F., Deuber, R., Siegwart, R.: Ascento: A two-wheeled jumping robot. In: 2019 IEEE International Conference on Robotics and Automation (ICRA), pp. 7515–7521 (2019). https://doi.org/10.1109/ICRA.2019.8793792
[22] Klemm, V., Morra, A., Gulich, L., Mannhart, D., Rohr, D., Kamel, M., Viragh, Y., Siegwart, R.: LQR-assisted whole-body control of a wheeled bipedal robot with kinematic loops. IEEE Robotics and Automation Letters 5(2), 3745–3752 (2020) https://doi.org/10.1109/LRA.2020.2979625
[23] Bjelonic, M., Bellicoso, C.D., Viragh, Y., Sako, D., Tresoldi, F.D., Jenelten, F., Hutter, M.: Keep rollin’—whole-body motion control and planning for wheeled quadrupedal robots. IEEE Robotics and Automation Letters 4(2), 2116–2123 (2019) https://doi.org/10.1109/LRA.2019.2899750
[24] Takubo, T., Yoshioka, T., Arai, T., Mae, Y., Ohara, K.: Leg-wheel hybrid locomotion for multi-legged robot. Transactions of the Japan Society of Mechanical Engineers, Series C 75(759), 2996–3004 (2009) https://doi.org/10.1299/kikaic.75.2996 (in Japanese)
[25] Oda, K., Ida, Y., Ishikawa, J., Hiraoka, M., Hyon, S.-H.: Realization of whole-body torque-controlled hydraulic wheel-on-leg rover. Journal of the Robotics Society of Japan 40(5), 421–430 (2022) https://doi.org/10.7210/jrsj.40.421 (in Japanese)
[26] Besseron, G., Grand, C., Amar, F.B., Bidaud, P.: Decoupled control of the high mobility robot Hylos based on a dynamic stability margin. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, pp. 2435–2440 (2008). https://doi.org/10.1109/IROS.2008.4651092
[27] Pinto, L., Andrychowicz, M., Welinder, P., Zaremba, W., Abbeel, P.: Asymmetric actor critic for image-based robot learning. In: Proceedings of Robotics: Science and Systems, Pittsburgh, Pennsylvania (2018). https://doi.org/10.15607/RSS.2018.XIV.008
[28] Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal Policy Optimization Algorithms (2017). https://doi.org/10.48550/arXiv.1707.06347
[29] Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 23–30 (2017). https://doi.org/10.1109/IROS.2017.8202133
[30] Xu, W., Cai, Y., He, D., Lin, J., Zhang, F.: FAST-LIO2: Fast direct LiDAR-inertial odometry. IEEE Transactions on Robotics 38(4), 2053–2073 (2022) https://doi.org/10.1109/TRO.2022.3141876
[31] Koide, K.: small_gicp: Efficient and parallel algorithms for point cloud registration. Journal of Open Source Software 9(100), 6948 (2024) https://doi.org/10.21105/joss.06948
[32] Irie, K., Suzuki, T., Hara, Y., Yoshida, T., Tomono, M., Nishimura, K., Yamato, H., Shimizu, M.: Autonomous navigation of a legged robot using data-driven path following. In: Proceedings of the Japan Society of Mechanical Engineers Robotics and Mechatronics Conference (ROBOMECH2021) (2021). 1P1-L01 (in Japanese)
[33] Yuta, S.: Tsukuba Challenge: Open experiments for autonomous navigation of mobile robots in the city – activities and results of the first and second stages –. Journal of Robotics and Mechatronics 30(4), 504–512 (2018) https://doi.org/10.20965/jrm.2018.p0504
[34] Hara, Y., Tomizawa, T., Date, H., Kuroda, Y., Tsubouchi, T.: Tsukuba Challenge 2019: Task settings and experimental results. Journal of Robotics and Mechatronics 32(6), 1104–1111 (2020) https://doi.org/10.20965/jrm.2020.p1104
[35] Saiki, Y.M., Takeuchi, E., Carballo, A., Tokunaga, W., Kuniyoshi, H., Aburadani, A., Hirosawa, A., Nagasaka, Y., Suzuki, Y., Tsubouchi, T.: 1Km autonomous robot navigation on outdoor pedestrian paths “running the Tsukuba Challenge 2007”. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 219–225 (2008). https://doi.org/10.1109/IROS.2008.4650584
[36] Koide, K., Yokozuka, M., Oishi, S., Banno, A.: GLIM: 3D range-inertial localization and mapping with GPU-accelerated scan matching factors. Robotics and Autonomous Systems 179, 104750 (2024) https://doi.org/10.1016/j.robot.2024.104750
[37] Tsukuba Challenge Datasets: Real World Datasets for Autonomous Navigation. https://github.com/tsukubachallenge/tc-datasets. Accessed: 2026-06-11
[38] Wan, Y., Lin, W., Qian, L., Zou, Y., Wu, W., Wu, S., Zhao, C., Luo, X.: Learning to Balance Motor Thermal Safety and Quadrupedal Locomotion Performance with Residual Policy (2026). https://doi.org/10.48550/arXiv.2605.27046
[39] Zhang, Z., Fisac, J.F.: Safe Occlusion-Aware autonomous driving via Game-Theoretic active perception. In: Proceedings of Robotics: Science and Systems, Virtual (2021). https://doi.org/10.15607/RSS.2021.XVII.066
[40] Firoozi, R., Mir, A., Camps, G.S., Schwager, M.: OA-MPC: Occlusion-Aware MPC for guaranteed safe robot navigation with unseen dynamic obstacles. IEEE Transactions on Control Systems Technology 33(3), 940–951 (2025) https://doi.org/10.1109/TCST.2024.3520462
[41] Mattamala, M., Frey, J., Libera, P., Chebrolu, N., Martius, G., Cadena, C., Hutter, M., Fallon, M.: Wild visual navigation: Fast traversability learning via pre-trained models and online self-supervision. Autonomous Robots 49, 19 (2025) https://doi.org/10.1007/s10514-025-10202-x
[42] Kim, Y., Lee, J.H., Lee, C., Mun, J., Youm, D., Park, J., Hwangbo, J.: Learning semantic traversability with egocentric video and automated annotation strategy. IEEE Robotics and Automation Letters 9(11), 10423–10430 (2024) https://doi.org/10.1109/LRA.2024.3474548
[43] Lu, F., Milios, E.: Globally consistent range scan alignment for environment mapping. Autonomous Robots 4, 333–349 (1997) https://doi.org/10.1023/A:1008854305733
[44] Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 1861–1870 (2018)

[1] [1]

In: Robotics in Natural Settings (CLAWAR 2022)

Bjelonic, M., Klemm, V., Lee, J., Hutter, M.: A survey of Wheeled-Legged robots. In: Robotics in Natural Settings (CLAWAR 2022). Lecture Notes in Networks and Systems, vol. 530, pp. 83–94. Springer, Cham (2022). https://doi.org/10.1007/ 978-3-031-15226-9 11 28

2022

[2] [2]

Journal of the Robotics Society of Japan 10(4), 520–525 (1992) https://doi.org/10.7210/jrsj.10.520

Kimura, H., Nakano, E., Nonaka, Y.: Development of leg-wheel robot and coop- erational motion of legs and wheels. Journal of the Robotics Society of Japan 10(4), 520–525 (1992) https://doi.org/10.7210/jrsj.10.520 . (in Japanese)

work page doi:10.7210/jrsj.10.520 1992

[3] [3]

Advanced Robotics26(8-9), 969–988 (2012) https://doi.org/10.1163/156855312X633066

Endo, G., Hirose, S.: Study on Roller-Walker – improvement of locomotive efficiency of quadruped robots by passive wheels. Advanced Robotics26(8-9), 969–988 (2012) https://doi.org/10.1163/156855312X633066

work page doi:10.1163/156855312x633066 2012

[4] [4]

In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Bjelonic, M., Grandia, R., Harley, O., Galliard, C., Zimmermann, S., Hutter, M.: Whole-body MPC and online gait sequence generation for wheeled-legged robots. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Sys- tems (IROS), pp. 8388–8395 (2021). https://doi.org/10.1109/IROS51168.2021. 9636371

work page doi:10.1109/iros51168.2021 2021

[5] [5]

In: IEEE International Conference on Robotics and Automation (ICRA 2024), Yokohama, Japan, May 13–17, 2024

Chamorro, S., Klemm, V., La Iglesia Valls, M., Pal, C., Siegwart, R.: Reinforce- ment learning for blind stair climbing with legged and wheeled-legged robots. In: Proceedings of the 2024 IEEE International Conference on Robotics and Automa- tion (ICRA), pp. 8081–8087 (2024). https://doi.org/10.1109/ICRA57147.2024. 10610069

work page doi:10.1109/icra57147.2024 2024

[6] [6]

https://www.unitree.com/ go2-w

Unitree Robotics: Unitree Go2-W Driving All Terrain. https://www.unitree.com/ go2-w. Accessed: 2026-06-11 (2024)

2026

[7] [7]

https://www

Ascento Robotics: Ascento – Secure Assets with Robotics and AI. https://www. ascento.ai. Accessed: 2026-06-11 (2024)

2026

[8] [8]

Science Robotics9(89), eadi9641 (2024) https://doi.org/10.1126/scirobotics.adi9641

Lee, J., Bjelonic, M., Reske, A., Wellhausen, L., Miki, T., Hutter, M.: Learning robust autonomous navigation and locomotion for wheeled-legged robots. Science Robotics9(89), eadi9641 (2024) https://doi.org/10.1126/scirobotics.adi9641

work page doi:10.1126/scirobotics.adi9641 2024

[9] [9]

Text-driven affordance learning from egocentric vision.Adv

Irie, K., Yoshida, T., Matsuzawa, T., Suzuki, T., Hara, Y., Tomono, M.: Rough terrain navigation for a quadruped robot using deep reinforcement learning-based blind locomotion control and a stuck-escape strategy. Advanced Robotics39(18), 1182–1198 (2025) https://doi.org/10.1080/01691864.2025.2561643

work page doi:10.1080/01691864.2025.2561643 2025

[10] [10]

In: Proceedings of Robotics: Science and Systems, Pittsburgh, Pennsylvania (2018)

Tan, J., Zhang, T., Coumans, E., Iscen, A., Bai, Y., Hafner, D., Bohez, S., Vanhoucke, V.: Sim-to-real: Learning agile locomotion for quadruped robots. In: Proceedings of Robotics: Science and Systems, Pittsburgh, Pennsylvania (2018). https://doi.org/10.15607/RSS.2018.XIV.010

work page doi:10.15607/rss.2018.xiv.010 2018

[11] [11]

Hwangbo, J

Hwangbo, J., Lee, J., Dosovitskiy, A., Bellicoso, D., Tsounis, V., Koltun, V., Hutter, M.: Learning agile and dynamic motor skills for legged robots. Science Robotics4(26), eaau5872 (2019) https://doi.org/10.1126/scirobotics.aau5872

work page doi:10.1126/scirobotics.aau5872 2019

[12] Lee, J., Hwangbo, J., Wellhausen, L., Koltun, V., Hutter, M.: Learning quadrupedal locomotion over challenging terrain. Science Robotics 5(47), eabc5986 (2020) https://doi.org/10.1126/scirobotics.abc5986

[13] Kumar, A., Fu, Z., Pathak, D., Malik, J.: RMA: Rapid Motor Adaptation for Legged Robots. In: Proceedings of Robotics: Science and Systems, Virtual (2021). https://doi.org/10.15607/RSS.2021.XVII.011

[14] Miki, T., Lee, J., Hwangbo, J., Wellhausen, L., Koltun, V., Hutter, M.: Learning robust perceptive locomotion for quadrupedal robots in the wild. Science Robotics 7(62), eabk2822 (2022) https://doi.org/10.1126/scirobotics.abk2822

[15] Makoviychuk, V., Wawrzyniak, L., Guo, Y., Lu, M., Storey, K., Macklin, M., Hoeller, D., Rudin, N., Allshire, A., Handa, A., State, G.: Isaac gym: High performance GPU based physics simulation for robot learning. In: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, vol. 1 (2021). https://doi.org/10.48550/arXiv.2108.10470

[16] Rudin, N., Hoeller, D., Reist, P., Hutter, M.: Learning to walk in minutes using massively parallel deep reinforcement learning. In: Proceedings of the 5th Conference on Robot Learning, pp. 91–100 (2021). https://doi.org/10.48550/arXiv.2109.11978

[17] Zakka, K., Tabanpour, B., Liao, Q., Haiderbhai, M., Holt, S., Luo, J.Y., Allshire, A., Frey, E., Sreenath, K., Kahrs, L.A., Sferrazza, C., Tassa, Y., Abbeel, P.: MuJoCo Playground (2025). https://doi.org/10.48550/arXiv.2502.08844

[18] Xu, Z., Raj, A.H., Xiao, X., Stone, P.: Dexterous legged locomotion in confined 3D spaces with reinforcement learning. In: 2024 IEEE International Conference on Robotics and Automation (ICRA) (2024). https://doi.org/10.1109/ICRA57147.2024.10610668

[19] Luo, S., Li, S., Yu, R., Wang, Z., Wu, J., Zhu, Q.: PIE: Parkour with implicit-explicit learning framework for legged robots. IEEE Robotics and Automation Letters 9(11), 9986–9993 (2024) https://doi.org/10.1109/LRA.2024.3459797

[20] Xiao, E., Dong, Y., Lam, J., Lu, P.: Learning stable bipedal locomotion skills for quadrupedal robots on challenging terrains with automatic fall recovery. npj Robotics 3(22) (2025) https://doi.org/10.1038/s44182-025-00043-2

[21] Klemm, V., Morra, A., Salzmann, C., Tschopp, F., Bodie, K., Gulich, L., Küng, N., Mannhart, D., Pfister, C., Vierneisel, M., Weber, F., Deuber, R., Siegwart, R.: Ascento: A two-wheeled jumping robot. In: 2019 IEEE International Conference on Robotics and Automation (ICRA), pp. 7515–7521 (2019). https://doi.org/10.1109/ICRA.2019.8793792

[22] Klemm, V., Morra, A., Gulich, L., Mannhart, D., Rohr, D., Kamel, M., Viragh, Y., Siegwart, R.: LQR-assisted whole-body control of a wheeled bipedal robot with kinematic loops. IEEE Robotics and Automation Letters 5(2), 3745–3752 (2020) https://doi.org/10.1109/LRA.2020.2979625

[23] Bjelonic, M., Bellicoso, C.D., Viragh, Y., Sako, D., Tresoldi, F.D., Jenelten, F., Hutter, M.: Keep rollin’—whole-body motion control and planning for wheeled quadrupedal robots. IEEE Robotics and Automation Letters 4(2), 2116–2123 (2019) https://doi.org/10.1109/LRA.2019.2899750

[24] Takubo, T., Yoshioka, T., Arai, T., Mae, Y., Ohara, K.: Leg-wheel hybrid locomotion for multi-legged robot. Transactions of the Japan Society of Mechanical Engineers, Series C 75(759), 2996–3004 (2009) https://doi.org/10.1299/kikaic.75.2996 (in Japanese)

[25] Oda, K., Ida, Y., Ishikawa, J., Hiraoka, M., Hyon, S.-H.: Realization of whole-body torque-controlled hydraulic wheel-on-leg rover. Journal of the Robotics Society of Japan 40(5), 421–430 (2022) https://doi.org/10.7210/jrsj.40.421 (in Japanese)

[26] Besseron, G., Grand, C., Amar, F.B., Bidaud, P.: Decoupled control of the high mobility robot Hylos based on a dynamic stability margin. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, pp. 2435–2440 (2008). https://doi.org/10.1109/IROS.2008.4651092

[27] Pinto, L., Andrychowicz, M., Welinder, P., Zaremba, W., Abbeel, P.: Asymmetric actor critic for image-based robot learning. In: Proceedings of Robotics: Science and Systems, Pittsburgh, Pennsylvania (2018). https://doi.org/10.15607/RSS.2018.XIV.008

[28] Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal Policy Optimization Algorithms (2017). https://doi.org/10.48550/arXiv.1707.06347

[29] Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 23–30 (2017). https://doi.org/10.1109/IROS.2017.8202133

[30] Xu, W., Cai, Y., He, D., Lin, J., Zhang, F.: FAST-LIO2: Fast direct LiDAR-inertial odometry. IEEE Transactions on Robotics 38(4), 2053–2073 (2022) https://doi.org/10.1109/TRO.2022.3141876

[31] Koide, K.: small_gicp: Efficient and parallel algorithms for point cloud registration. Journal of Open Source Software 9(100), 6948 (2024) https://doi.org/10.21105/joss.06948

[32] Irie, K., Suzuki, T., Hara, Y., Yoshida, T., Tomono, M., Nishimura, K., Yamato, H., Shimizu, M.: Autonomous navigation of a legged robot using data-driven path following. In: Proceedings of the Japan Society of Mechanical Engineers Robotics and Mechatronics Conference (ROBOMECH2021) (2021). (in Japanese) 1P1-L01

[33] Yuta, S.: Tsukuba Challenge: Open experiments for autonomous navigation of mobile robots in the city – activities and results of the first and second stages –. Journal of Robotics and Mechatronics 30(4), 504–512 (2018) https://doi.org/10.20965/jrm.2018.p0504

[34] Hara, Y., Tomizawa, T., Date, H., Kuroda, Y., Tsubouchi, T.: Tsukuba Challenge 2019: Task settings and experimental results. Journal of Robotics and Mechatronics 32(6), 1104–1111 (2020) https://doi.org/10.20965/jrm.2020.p1104

[35] Saiki, Y.M., Takeuchi, E., Carballo, A., Tokunaga, W., Kuniyoshi, H., Aburadani, A., Hirosawa, A., Nagasaka, Y., Suzuki, Y., Tsubouchi, T.: 1Km autonomous robot navigation on outdoor pedestrian paths “running the Tsukuba Challenge 2007”. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 219–225 (2008). https://doi.org/10.1109/IROS.2008.4650584

[36] Koide, K., Yokozuka, M., Oishi, S., Banno, A.: GLIM: 3D range-inertial localization and mapping with GPU-accelerated scan matching factors. Robotics and Autonomous Systems 179, 104750 (2024) https://doi.org/10.1016/j.robot.2024.104750

[37] Tsukuba Challenge Datasets: Real World Datasets for Autonomous Navigation. https://github.com/tsukubachallenge/tc-datasets. Accessed: 2026-06-11

[38] Wan, Y., Lin, W., Qian, L., Zou, Y., Wu, W., Wu, S., Zhao, C., Luo, X.: Learning to Balance Motor Thermal Safety and Quadrupedal Locomotion Performance with Residual Policy (2026). https://doi.org/10.48550/arXiv.2605.27046

[39] Zhang, Z., Fisac, J.F.: Safe Occlusion-Aware autonomous driving via Game-Theoretic active perception. In: Proceedings of Robotics: Science and Systems, Virtual (2021). https://doi.org/10.15607/RSS.2021.XVII.066

[40] Firoozi, R., Mir, A., Camps, G.S., Schwager, M.: OA-MPC: Occlusion-Aware MPC for guaranteed safe robot navigation with unseen dynamic obstacles. IEEE Transactions on Control Systems Technology 33(3), 940–951 (2025) https://doi.org/10.1109/TCST.2024.3520462

[41] Mattamala, M., Frey, J., Libera, P., Chebrolu, N., Martius, G., Cadena, C., Hutter, M., Fallon, M.: Wild visual navigation: Fast traversability learning via pre-trained models and online self-supervision. Autonomous Robots 49, 19 (2025) https://doi.org/10.1007/s10514-025-10202-x

[42] Kim, Y., Lee, J.H., Lee, C., Mun, J., Youm, D., Park, J., Hwangbo, J.: Learning semantic traversability with egocentric video and automated annotation strategy. IEEE Robotics and Automation Letters 9(11), 10423–10430 (2024) https://doi.org/10.1109/LRA.2024.3474548

[43] Lu, F., Milios, E.: Globally consistent range scan alignment for environment mapping. Autonomous Robots 4, 333–349 (1997) https://doi.org/10.1023/A:1008854305733

[44] Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In: Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 1861–1870 (2018)