AhaRobot: A Low-Cost Open-Source Bimanual Mobile Manipulator for Embodied AI
Pith reviewed 2026-05-22 23:58 UTC · model grok-4.3
The pith
A $1,000 open-source bimanual robot achieves 0.7 mm repeatability for embodied AI data collection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The hardware-control co-design of AhaRobot delivers 0.7 mm repeatability at a total hardware cost of only $1,000. The 26-faced handle reduces tracking error by 80% over a 6-faced baseline and improves data-collection efficiency by 30%, while handling singularities and supporting long-horizon tasks in fully remote teleoperation.
What carries the argument
SCARA-like dual-arm hardware with dual-motor backlash mitigation, dithering for friction compensation, and the 26-faced marker handle in the RoboPilot teleoperation interface.
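To make the friction-compensation idea concrete, here is a minimal sketch of dithering: a small zero-mean oscillation is added to the motor command so the joint is less likely to settle into static friction. The function names, the sinusoidal waveform, and the amplitude and frequency values are illustrative assumptions; the page does not state the waveform or parameters AhaRobot uses.

```python
import math


def dithered_command(u, t, amplitude=0.02, freq_hz=50.0):
    """Add a zero-mean sinusoidal dither to motor command u at time t (seconds).

    amplitude and freq_hz are hypothetical values, not taken from the paper.
    """
    return u + amplitude * math.sin(2.0 * math.pi * freq_hz * t)


def mean_command(u, amplitude=0.02, freq_hz=50.0, steps=1000):
    """Average the dithered command over one full dither period."""
    period = 1.0 / freq_hz
    total = sum(
        dithered_command(u, k * period / steps, amplitude, freq_hz)
        for k in range(steps)
    )
    return total / steps
```

Because the dither averages to zero over a full period, the mean command is unchanged; only the instantaneous command oscillates around the setpoint.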
If this is right
- Enables large-scale collection of diverse manipulation data for training Vision-Language-Action models.
- Supports imitation learning of complex household behaviors involving bimanual coordination, upper-body mobility, and contact-rich interaction.
- Maintains data quality comparable to VR-based collection at far lower cost.
- Allows fully remote long-horizon teleoperation without singularity issues.
Where Pith is reading between the lines
- The open-source release could let other labs adapt the arm geometry or handle for non-household tasks.
- The marker design might transfer to improve precision in existing commercial tracking setups.
- Long-term wear tests by multiple groups would be needed to confirm sustained 0.7 mm performance.
Load-bearing premise
The reported repeatability, error reduction, and data quality will hold when independent groups replicate the system under varied real-world lighting, operator skill levels, and long-term hardware wear.
What would settle it
An independent build and test that measures repeatability worse than 2 mm or tracking error reduction below 50 percent under standard indoor conditions would falsify the performance claims.
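The falsification criterion above can be written as a small check. This is a sketch under stated assumptions: repeatability is taken here as the largest distance from any repeated end-effector measurement to their centroid, which may differ from the definition used in the paper, and all function names are hypothetical. The 2 mm and 50 percent thresholds come from the sentence above.

```python
import math


def repeatability_mm(points):
    """Largest distance (mm) from any measured 3D point to the centroid.

    Assumed definition for illustration; the paper's exact metric is not given here.
    """
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return max(math.dist(p, centroid) for p in points)


def error_reduction(baseline_err, new_err):
    """Fractional reduction in tracking error relative to the baseline."""
    return 1.0 - new_err / baseline_err


def claims_falsified(points, baseline_err, new_err,
                     max_repeat_mm=2.0, min_reduction=0.5):
    """True if either threshold from the falsification criterion is violated."""
    return (repeatability_mm(points) > max_repeat_mm
            or error_reduction(baseline_err, new_err) < min_reduction)
```

A replication would feed in its own repeated position measurements and the tracking errors measured with the 6-faced and 26-faced handles.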
Original abstract
Scaling Vision-Language-Action models for embodied manipulation demands large volumes of diverse manipulation data, yet the high cost of commercial mobile manipulators and teleoperation interfaces that are difficult to deploy at scale remain key bottlenecks. We present AhaRobot, a low-cost, fully open-source bimanual mobile manipulator tailored for Embodied-AI. The system contributes: (1) a SCARA-like dual-arm hardware design that reduces motor torque demands while maintaining a large vertical reachable workspace, (2) an optimized control stack that improves precision via dual-motor backlash mitigation and static-friction compensation through dithering, and (3) RoboPilot, a teleoperation interface featuring a novel 26-faced marker handle for precise, long-horizon remote data collection. Experimental results show that our hardware-control co-design achieves 0.7 mm repeatability at a total hardware cost of only $1,000. The proposed 26-faced handle reduces tracking error by 80% over a 6-faced baseline and improves data-collection efficiency by 30%, while robustly handling singularities and supporting extremely long-horizon tasks in fully remote settings. Despite its low cost, AhaRobot enables imitation learning of complex household behaviors involving bimanual coordination, upper-body mobility, and contact-rich interaction, with data quality comparable to VR-based collection. All software, CAD files, and documentation are available at https://aha-robot.github.io.
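The abstract does not say how the dual-motor backlash mitigation is implemented. One common generic scheme, sketched below as an assumption rather than as the authors' method, drives a joint with two motors and adds opposing bias torques so both gear trains stay preloaded. The function name and the bias value are hypothetical.

```python
def split_torque(tau_cmd, tau_bias=0.1):
    """Split a joint torque command across two motors with an opposing preload.

    The two outputs sum to tau_cmd and differ by 2 * tau_bias, so the gear
    trains are pressed in opposite directions. tau_bias is an illustrative
    value, not a parameter from the paper.
    """
    tau_a = tau_cmd / 2.0 + tau_bias
    tau_b = tau_cmd / 2.0 - tau_bias
    return tau_a, tau_b
```

With a zero net command the motors still push against each other, which is what takes up the gear play in this kind of scheme.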
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces AhaRobot, a fully open-source bimanual mobile manipulator costing $1,000, designed to address data collection bottlenecks for vision-language-action models. It contributes a SCARA-like dual-arm hardware architecture, a control stack incorporating dual-motor backlash mitigation and dithering-based friction compensation, and the RoboPilot teleoperation system featuring a novel 26-faced marker handle. Reported results include 0.7 mm repeatability, an 80% reduction in tracking error versus a 6-faced baseline, a 30% gain in data-collection efficiency, robust singularity handling for long-horizon tasks, and imitation-learning data quality comparable to VR systems, with all CAD, code, and documentation released publicly.
Significance. If the experimental results replicate, the work has clear significance for embodied AI by substantially lowering the cost of high-quality bimanual and mobile manipulation data collection. The explicit bill-of-materials, control equations, marker geometry, and experimental protocols constitute a reproducible contribution. The open-source release of hardware designs and software is a particular strength that directly supports community adoption and extension beyond the authors' lab.
minor comments (2)
- [Experimental Results] The repeatability trials and tracking-error comparisons would benefit from an explicit statement of trial count, operator count, and lighting conditions to strengthen the replication claim.
- [Experimental Results] The comparison of data quality to VR systems is stated qualitatively; a short quantitative table (e.g., success rates or trajectory smoothness metrics) would make the claim more precise without altering scope.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation of AhaRobot and the recommendation for minor revision. The assessment correctly identifies the core contributions in hardware design, control, and the open-source teleoperation interface. No major comments were raised in the report.
Circularity Check
No significant circularity detected
The manuscript presents a hardware design, control stack (backlash mitigation and dithering), and 26-faced marker handle whose performance claims rest on direct experimental measurements (repeatability trials, tracking-error comparisons to a 6-faced baseline, data-collection efficiency) and a bill-of-materials. No equations, fitted parameters, or self-citations are invoked in a load-bearing way that reduces a claimed result to its own inputs by construction. The derivation chain consists of physical construction followed by empirical validation, which is self-contained against external benchmarks.
Forward citations
Cited by 1 Pith paper
- Nori Bot: A Sub-$1,000 Floor-to-Counter Mobile Manipulator. Nori Bot is a 17-DoF dual-arm mobile manipulator costing $947 with a 600 mm Z-axis lift, Raspberry Pi proactive control, and current-based servo protection.