asRoBallet: Closing the Sim2Real Gap via Friction-Aware Reinforcement Learning for Underactuated Spherical Dynamics
Pith reviewed 2026-05-08 03:23 UTC · model grok-4.3
The pith
Friction-aware RL policy achieves zero-shot transfer to real humanoid ballbot hardware
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce asRoBallet, to the best of our knowledge, the first end-to-end reinforcement learning (RL) locomotion policy deployed on a humanoid ballbot hardware platform. A high-fidelity MuJoCo simulation explicitly models the discrete roller mechanics of ETH-type omni-wheels to capture parasitic vibrations and contact discontinuities. A Friction-Aware Reinforcement Learning framework masters the coupled rolling, lateral, and torsional friction channels at the wheel-ball and ball-floor interfaces to achieve zero-shot Sim2Real transfer.
What carries the argument
High-fidelity MuJoCo simulation of discrete omni-wheel roller mechanics combined with a Friction-Aware Reinforcement Learning framework that masters coupled friction channels
If this is right
- Zero-shot deployment of the RL policy on real hardware without additional training
- Accurate capture of previously ignored vibrations and discontinuous contacts in omni-wheel systems
- Low-cost ballbot platform constructed by repurposing quadruped robot parts
- Intuitive single-operator control through a generalized iOS ecosystem for expressive maneuvers
Where Pith is reading between the lines
- The friction-modeling approach could extend to other nonholonomic platforms that rely on rolling contact, such as certain wheeled manipulators.
- Adding sensor noise or terrain variation directly into the same simulation might further improve policy robustness without changing the training pipeline.
- The subtractive hardware design process could be applied to other robot morphologies to reduce development cost while preserving research utility.
Load-bearing premise
The MuJoCo simulation accurately reproduces the real-world parasitic vibrations, contact discontinuities, and coupled friction effects at the wheel-ball and ball-floor interfaces.
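One way this premise could be probed is to compare the dominant vibration frequency of a simulated signal against a hardware recording. The following is a minimal sketch under stated assumptions, not the paper's procedure: the two signals are synthetic stand-ins, and `dominant_frequency`, the 200 Hz sample rate, and the 24 Hz and 25 Hz tones are all hypothetical.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset first
    best_bin, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):  # scan bins up to the Nyquist frequency
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate_hz / n


# Hypothetical stand-ins for one second of IMU data sampled at 200 Hz.
rate = 200.0
n = 200
sim = [math.sin(2 * math.pi * 24.0 * i / rate) for i in range(n)]
real = [
    0.8 * math.sin(2 * math.pi * 25.0 * i / rate + 0.3)
    + 0.1 * math.sin(2 * math.pi * 60.0 * i / rate)  # weaker secondary tone
    for i in range(n)
]

f_sim = dominant_frequency(sim, rate)
f_real = dominant_frequency(real, rate)
mismatch_hz = abs(f_sim - f_real)
print(f"sim peak {f_sim} Hz, hardware peak {f_real} Hz, mismatch {mismatch_hz} Hz")
```

A small mismatch in peak frequency would be consistent with the premise but would not prove it, since amplitude, damping, and the friction channels themselves would still need separate checks.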
What would settle it
Deploy the trained policy on the physical ballbot and observe whether it produces stable locomotion and balancing without any hardware-specific tuning or retraining; repeated failure would show the modeled friction channels do not close the gap.
Original abstract
We introduce asRoBallet, to the best of our knowledge, the first end-to-end reinforcement learning (RL) locomotion policy deployed on a humanoid ballbot hardware platform. Historically, ballbots have served as a canonical benchmark for underactuated and nonholonomic control, which are characterized by a reality gap in complex friction models for wheel-ball-floor interactions. While current literature demonstrates successful handling of 3D balancing with LQR and MPC, transitioning to actual hardware for a humanoid ballbot using RL is currently hindered by critical gaps in contact modeling, actuator latency & jitter, and safe hardware exploration. This study proposes a high-fidelity MuJoCo simulation that explicitly models the discrete roller mechanics of ETH-type omni-wheels, thereby capturing parasitic vibrations and contact discontinuities that have previously been ignored. We also developed a Friction-Aware Reinforcement Learning framework that achieves zero-shot Sim2Real transfer by mastering the coupled rolling, lateral, and torsional friction channels at the wheel-ball and ball-floor interfaces. We designed asRoBallet through subtractive reconfiguration, repurposing key components from an overconstrained quadruped and integrating them into a newly designed structural frame to achieve a robust research platform at low cost. We also developed a generalized iOS ecosystem that transforms consumer electronics into a low-latency interface, enabling a single operator to orchestrate expressive humanoid maneuvers via intuitive natural motion.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces asRoBallet as the first end-to-end RL locomotion policy deployed on humanoid ballbot hardware. It proposes a high-fidelity MuJoCo simulation explicitly modeling discrete roller mechanics of ETH-type omni-wheels to capture parasitic vibrations and contact discontinuities previously ignored. A Friction-Aware RL framework is claimed to enable zero-shot Sim2Real transfer by mastering coupled rolling, lateral, and torsional friction channels at wheel-ball and ball-floor interfaces. The platform is constructed via subtractive reconfiguration from an overconstrained quadruped, with an iOS-based low-latency control interface.
Significance. If the zero-shot hardware transfer is quantitatively validated, the work would advance RL application to underactuated nonholonomic systems by demonstrating that explicit friction modeling in simulation can close the reality gap for ballbot locomotion without real-world fine-tuning. This could inform contact-rich control for other spherical or wheeled platforms where friction discontinuities dominate dynamics.
major comments (2)
- [§3] §3 (Simulation and Contact Modeling): The assertion that the MuJoCo model accurately captures coupled rolling/lateral/torsional friction and parasitic vibrations at wheel-ball and ball-floor interfaces is not supported by any quantitative validation (e.g., force-torque sensor comparisons, frequency-domain vibration matching, or ablation on contact parameters). This undermines the central claim that friction-aware RL, rather than policy robustness to mismatch, enables zero-shot transfer.
- [Results] Results and Experiments: No error metrics, success rates, ablation studies, or baseline comparisons (e.g., vs. LQR/MPC or non-friction-aware RL) are reported for the hardware deployment. The abstract and methods describe successful deployment but supply no data to evaluate the zero-shot claim or the contribution of the friction modeling.
minor comments (2)
- [Abstract] The abstract states 'to the best of our knowledge' for the first RL deployment on humanoid ballbot; a brief literature comparison table would strengthen this novelty claim.
- [§3] Notation for friction channels (rolling, lateral, torsional) is introduced without explicit equations or parameter values in the provided description; adding these in §3 would improve reproducibility.
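For readers unfamiliar with how MuJoCo exposes friction, the sketch below shows the general convention: each geom carries a three-number `friction` attribute (sliding, torsional, rolling), and `condim="6"` enables all three in a contact. This illustrates MuJoCo's MJCF format in general, not the paper's model. The geom name, size, and helper function are hypothetical, the values are simply MuJoCo's documented defaults, and mapping the paper's "lateral" channel onto MuJoCo's sliding coefficient is an assumption.

```python
import xml.etree.ElementTree as ET


def geom_with_friction(name, sliding, torsional, rolling):
    """Build an MJCF <geom> element with all three friction coefficients set."""
    return ET.Element(
        "geom",
        name=name,
        type="sphere",
        size="0.1",   # hypothetical radius in metres
        condim="6",   # 6-D contact: sliding + torsional + rolling
        friction=f"{sliding} {torsional} {rolling}",
    )


# Illustrative values only (MuJoCo's defaults), not the paper's parameters.
ball = geom_with_friction("ball", 1.0, 0.005, 0.0001)
mjcf_snippet = ET.tostring(ball, encoding="unicode")

# Read the coefficients back by name to make the ordering explicit.
channels = dict(
    zip(("sliding", "torsional", "rolling"), map(float, ball.get("friction").split()))
)
print(mjcf_snippet)
print(channels)
```

Reporting the per-geom triple and `condim` for the wheel rollers, ball, and floor would be one compact way to supply the parameter values this comment asks for.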
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and outline the revisions we will make to strengthen the quantitative support for our claims.
Point-by-point responses
- Referee: [§3] §3 (Simulation and Contact Modeling): The assertion that the MuJoCo model accurately captures coupled rolling/lateral/torsional friction and parasitic vibrations at wheel-ball and ball-floor interfaces is not supported by any quantitative validation (e.g., force-torque sensor comparisons, frequency-domain vibration matching, or ablation on contact parameters). This undermines the central claim that friction-aware RL, rather than policy robustness to mismatch, enables zero-shot transfer.
  Authors: We agree that the manuscript would benefit from explicit quantitative validation of the contact model. The simulation parameters were selected based on manufacturer data for the ETH-type omni-wheels and iterative tuning to reproduce observed hardware dynamics, including parasitic vibrations. However, direct comparisons such as force-torque measurements or frequency-domain matching were not included in the original submission. In the revised version, we will add an ablation study varying friction coefficients across the three channels and include spectral analysis of simulated versus hardware vibrations to better substantiate the modeling fidelity and isolate the contribution of friction awareness.
  Revision: yes
- Referee: [Results] Results and Experiments: No error metrics, success rates, ablation studies, or baseline comparisons (e.g., vs. LQR/MPC or non-friction-aware RL) are reported for the hardware deployment. The abstract and methods describe successful deployment but supply no data to evaluate the zero-shot claim or the contribution of the friction modeling.
  Authors: We acknowledge that the hardware results section currently relies on qualitative demonstration of successful zero-shot transfer without accompanying quantitative metrics. The manuscript prioritizes the novelty of the first end-to-end RL deployment on this platform and the simulation framework, but this leaves the zero-shot performance and the specific role of friction modeling insufficiently quantified. We will revise the results to include success rates over multiple trials, trajectory tracking error metrics, and ablations comparing the friction-aware policy against a non-friction-aware baseline, thereby providing clearer evidence for the contribution of the proposed approach.
  Revision: yes
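The metrics promised in the rebuttal could be computed along the following lines. This is a minimal sketch with made-up numbers, not data from the paper: the trial outcomes, the trajectories, and the function names are all hypothetical, and the Wilson score interval is just one reasonable choice for a small number of hardware trials.

```python
import math


def success_rate(outcomes):
    """Fraction of trials marked successful."""
    return sum(1 for ok in outcomes if ok) / len(outcomes)


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


def tracking_rmse(reference, measured):
    """Root-mean-square error between reference and measured trajectories."""
    squared = [(r - m) ** 2 for r, m in zip(reference, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical data: 9 of 10 balancing trials succeed.
trials = [True] * 9 + [False]
rate = success_rate(trials)
low, high = wilson_interval(sum(trials), len(trials))

# Hypothetical ball-position samples in metres.
rmse = tracking_rmse([0.0, 0.1, 0.2, 0.3], [0.0, 0.12, 0.18, 0.33])
print(f"success rate {rate:.2f}, 95% interval ({low:.3f}, {high:.3f}), RMSE {rmse:.4f} m")
```

Running the same functions on the friction-aware policy and on a non-friction-aware baseline would give the side-by-side comparison the referee requests.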
Circularity Check
No circularity: empirical hardware result with no self-referential derivations
The paper's core claim is an experimental outcome: successful zero-shot deployment of an end-to-end RL policy on physical ballbot hardware after training in a custom MuJoCo simulator. The abstract and provided context describe modeling choices (discrete roller geometry, friction channels) and an RL framework, but contain no equations, parameter-fitting procedures, or predictions that reduce to their own inputs by construction. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing steps. The result is presented as a hardware validation rather than a closed mathematical derivation, so the claim rests on an external test and not on its own inputs.