arxiv: 2606.21387 · v1 · pith:24GNHKIOnew · submitted 2026-06-19 · 💻 cs.RO

Long-Distance Real-World Navigation of the Legged-Wheeled Robot Go2-W Using Deep Reinforcement Learning

Pith reviewed 2026-06-26 13:55 UTC · model grok-4.3

classification 💻 cs.RO
keywords legged-wheeled robot, deep reinforcement learning, locomotion control, autonomous navigation, real-world deployment, load distribution, Go2-W

The pith

A proprioception-only deep reinforcement learning policy extended to the 16-DoF Go2-W robot with load-distribution training enables 2.8 km autonomous real-world navigation without overheating.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a locomotion controller first built for standard quadrupeds can be transferred to the legged-wheeled Go2-W by retraining it to spread motor load away from the hip joints. Wheeled motion otherwise concentrates heat at those joints and halts long runs; the added training objective removes that bottleneck. The resulting controller was run on a 2.8 km outdoor course containing sidewalks, a park, and stairs at the Tsukuba Challenge 2025 and completed the route without thermal shutdown. A reader would care because the result turns a hybrid platform into a practical long-range autonomous vehicle using only onboard sensing.

Core claim

The paper establishes that extending a proprioception-only DRL policy to the 16-DoF legged-wheeled Go2-W and training it to distribute load produces stable locomotion that suppresses hip-joint heat concentration, allowing sustained autonomous traversal of a 2.8 km real-world route that includes sidewalks, a park, and stairs without stopping due to overheating.

What carries the argument

The extended proprioception-only DRL policy augmented with load-distribution training on the Go2-W platform, which balances actuator effort to prevent localized overheating during wheeled segments.
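
The paper does not publish its reward terms, so the following is only a minimal sketch of what a load-distribution objective could look like. It assumes a hypothetical joint ordering (four legs, each with hip, thigh, calf, and wheel, for 16 DoF) and a hypothetical weighted squared-torque penalty in which hip joints are weighted more heavily; the names `hip_indices`, `load_distribution_penalty`, and the weights are illustrative and not taken from the paper.

```python
from typing import List, Sequence

# Assumed layout for illustration only: 4 legs x (hip, thigh, calf, wheel) = 16 DoF.
JOINTS_PER_LEG = 4
HIP_OFFSET = 0  # assumed position of the hip joint inside each leg's block


def hip_indices(num_legs: int = 4) -> List[int]:
    """Return the indices of the hip joints under the assumed ordering."""
    return [leg * JOINTS_PER_LEG + HIP_OFFSET for leg in range(num_legs)]


def load_distribution_penalty(
    torques: Sequence[float],
    hip_weight: float = 4.0,   # hypothetical: hips cost more than other joints
    base_weight: float = 1.0,  # hypothetical weight for all other joints
) -> float:
    """Negative weighted sum of squared joint torques (a reward term, <= 0).

    Squaring makes one large torque cost more than several small ones, and
    the extra hip weight pushes the policy to move effort off the hips.
    """
    hips = set(hip_indices())
    cost = 0.0
    for i, tau in enumerate(torques):
        weight = hip_weight if i in hips else base_weight
        cost += weight * tau * tau
    return -cost


# Same total absolute torque (32 N*m), distributed differently.
concentrated = [8.0, 0.0, 0.0, 0.0] * 4  # all effort on the four hips
distributed = [2.0] * 16                 # effort spread over every joint

print(load_distribution_penalty(concentrated))
print(load_distribution_penalty(distributed))
```

Under these assumptions the distributed torque pattern receives the smaller penalty, which is the qualitative behavior the load-distribution training is described as encouraging.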

If this is right

  • Hybrid legged-wheeled robots become capable of multi-kilometer autonomous missions on mixed outdoor terrain without external cooling or frequent stops.
  • Proprioception alone suffices for reliable long-duration control once load distribution is included in training.
  • The same policy transfer approach can be applied to other commercial legged-wheeled platforms to shorten development time for real-world navigation.
  • Thermal limits shift from hardware redesign to software objectives that can be optimized during training.
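
To illustrate why a software objective can affect thermal limits at all, here is a rough sketch using a first-order lumped thermal model of a single motor. It assumes resistive heating proportional to torque squared (a common approximation, since torque scales with current and copper loss with current squared). Every parameter value below is hypothetical and not measured on the Go2-W.

```python
def simulate_joint_temperature(
    torque: float,
    duration_s: float,
    dt: float = 1.0,
    ambient_c: float = 25.0,
    heat_gain: float = 0.05,          # hypothetical W per (N*m)^2
    thermal_resistance: float = 2.0,  # hypothetical K per W
    thermal_capacity: float = 200.0,  # hypothetical J per K
) -> float:
    """Euler-integrate C * dT/dt = heat_gain * torque^2 - (T - T_ambient) / R."""
    temp = ambient_c
    steps = int(duration_s / dt)
    for _ in range(steps):
        heating_w = heat_gain * torque ** 2
        cooling_w = (temp - ambient_c) / thermal_resistance
        temp += dt * (heating_w - cooling_w) / thermal_capacity
    return temp


def steady_state_temperature(
    torque: float,
    ambient_c: float = 25.0,
    heat_gain: float = 0.05,
    thermal_resistance: float = 2.0,
) -> float:
    """Temperature at which heating and cooling balance in the model above."""
    return ambient_c + thermal_resistance * heat_gain * torque ** 2


# Halving the sustained torque on a joint quarters its heating term.
hot = simulate_joint_temperature(torque=20.0, duration_s=3600.0)
cool = simulate_joint_temperature(torque=10.0, duration_s=3600.0)
print(hot, cool)
```

In this toy model, moving part of a sustained load from one joint to others lowers that joint's equilibrium temperature quadratically in the torque reduction. Whether the real actuators behave this way is exactly what the requested temperature logs would show.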

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar load-balancing objectives could be added to policies for other hybrid robots to extend their operational range.
  • The method may allow legged-wheeled systems to replace wheeled robots on tasks that occasionally require stair climbing without sacrificing flat-ground efficiency.
  • Integration with global planners could turn the demonstrated local controller into a city-scale autonomous delivery or inspection system.

Load-bearing premise

That the proprioception-only policy developed for quadrupeds can be extended to the Go2-W and, when retrained for load distribution, will remain stable and thermally safe over multi-kilometer real-world distances.

What would settle it

A single continuous run on the same 2.8 km Tsukuba Challenge route in which the Go2-W stops from hip-joint overheating despite the load-distribution training.

Original abstract

Legged-wheeled robots have long been studied for their potential to combine the efficient flat-ground mobility of wheels with the rough-terrain capability of legs. However, examples of their application to long-range autonomous navigation in real environments remain limited. This paper reports our effort to build a deep reinforcement learning (DRL) based locomotion controller and an autonomous navigation system for the commercially available legged-wheeled robot Go2-W, and to apply them to long-range autonomous navigation in a real environment. For locomotion control, we extended a proprioception-only policy, which we had previously developed for quadruped robots, to the 16-DoF legged-wheeled robot. We also found that wheeled locomotion concentrates the load on the hip joints and causes heat concentration that hinders sustained travel, and obtained a policy that suppresses it by distributing the load. We evaluated the system at the Tsukuba Challenge 2025, demonstrating that it can autonomously traverse an approximately 2.8 km route including sidewalks, a park, and stairs without stopping due to overheating.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper reports the extension of a prior proprioception-only DRL locomotion policy from quadruped robots to the 16-DoF legged-wheeled Go2-W platform. It identifies hip-joint heat concentration during wheeled locomotion as a platform-specific issue and describes obtaining a policy via load-distribution training to mitigate it. The central empirical result is a successful autonomous 2.8 km traversal at the Tsukuba Challenge 2025 across sidewalks, a park, and stairs without overheating-related stops.

Significance. If the deployment result holds under scrutiny, the work provides a concrete systems-level demonstration of long-range real-world navigation with a commercial legged-wheeled robot using DRL. The explicit identification and training-based mitigation of thermal load concentration on a hybrid platform is a practical contribution. The use of a public competition as the evaluation venue supplies a falsifiable, high-stakes test of sustained operation.

major comments (2)
  1. [Evaluation at Tsukuba Challenge 2025] The manuscript supplies no quantitative metrics (traversal time, average speed, joint-temperature time series, or power-consumption data), ablation results, or failure-mode analysis to support the claim of 2.8 km autonomous navigation without overheating stoppages. This information is load-bearing for the central empirical assertion.
  2. [Locomotion control description] The load-distribution training procedure is described only in a single sentence; no modified reward terms, training hyperparameters, simulation setup, or comparison against the baseline policy are provided. These details are required to evaluate how the heat-mitigation claim was achieved and whether the extension to 16 DoF is reproducible.
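
The metrics requested in the first comment could be derived from a simple run log. The sketch below assumes a hypothetical log format, a list of (elapsed seconds, cumulative metres, joint temperatures in degrees Celsius) tuples, and a user-supplied temperature limit. The sample values are made up for illustration and are not data from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# One log sample: (elapsed time in s, cumulative distance in m, joint temps in deg C).
Sample = Tuple[float, float, Sequence[float]]


@dataclass
class RunSummary:
    traversal_time_s: float
    average_speed_mps: float
    peak_joint_temp_c: float
    exceeded_limit: bool


def summarize_run(samples: List[Sample], temp_limit_c: float) -> RunSummary:
    """Reduce a time-ordered run log to the headline metrics."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    start_t, start_d, _ = samples[0]
    end_t, end_d, _ = samples[-1]
    elapsed = end_t - start_t
    if elapsed <= 0:
        raise ValueError("samples must be time-ordered")
    # Hottest reading over all joints and all samples.
    peak = max(max(temps) for _, _, temps in samples)
    return RunSummary(
        traversal_time_s=elapsed,
        average_speed_mps=(end_d - start_d) / elapsed,
        peak_joint_temp_c=peak,
        exceeded_limit=peak > temp_limit_c,
    )


# Made-up illustrative log with two joints; not measurements from the paper.
demo_log: List[Sample] = [
    (0.0, 0.0, [30.0, 31.0]),
    (100.0, 120.0, [40.0, 45.0]),
    (200.0, 250.0, [42.0, 50.0]),
]
summary = summarize_run(demo_log, temp_limit_c=70.0)
print(summary)
```

Running the same summary on logs from the baseline and the load-distribution policy would give the side-by-side comparison that the second comment asks for.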

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. The feedback identifies opportunities to strengthen the empirical support and methodological detail. We respond to each major comment below and will revise the manuscript accordingly.

  1. Referee: [Evaluation at Tsukuba Challenge 2025] The manuscript supplies no quantitative metrics (traversal time, average speed, joint-temperature time series, or power-consumption data), ablation results, or failure-mode analysis to support the claim of 2.8 km autonomous navigation without overheating stoppages. This information is load-bearing for the central empirical assertion.

    Authors: We agree that the current manuscript would be strengthened by additional quantitative metrics from the Tsukuba Challenge 2025 deployment. The primary claim rests on successful completion of the 2.8 km route without overheating-related stops. In revision we will incorporate logged data on traversal time, average speed, joint-temperature time series, and power consumption. We will also add a discussion of observed failure modes and any minor issues encountered during the event. Controlled ablations were outside the scope of the public competition evaluation, but we will include qualitative comparisons to the baseline policy where supporting simulation or field data exist.

    Revision: yes

  2. Referee: [Locomotion control description] The load-distribution training procedure is described only in a single sentence; no modified reward terms, training hyperparameters, simulation setup, or comparison against the baseline policy are provided. These details are required to evaluate how the heat-mitigation claim was achieved and whether the extension to 16 DoF is reproducible.

    Authors: The load-distribution training procedure is presented concisely. We will expand the methods section in revision to specify the modified reward terms used to penalize hip-joint load concentration, the training hyperparameters, the simulation environment configuration for the 16-DoF Go2-W platform, and direct comparisons of joint-load and temperature metrics between the baseline and modified policies. These additions will improve reproducibility of the extension from prior quadruped work.

    Revision: yes

Circularity Check

0 steps flagged

Empirical systems report with no derivation chain

The paper is a deployment report describing extension of a prior proprioception-only DRL policy to the Go2-W platform, addition of load-distribution training, and real-world evaluation over 2.8 km at Tsukuba Challenge 2025. No equations, predictions, or uniqueness theorems are invoked; the result is an empirical outcome rather than a derived claim. Self-citation of prior policy work is present but not load-bearing for any circular reduction. This matches the default non-circular case for systems papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract contains no mathematical derivations, fitted parameters, or postulated entities; the contribution is an engineering demonstration of an existing DRL method on new hardware.
