{"paper":{"title":"Open-H-Embodiment: A Large-Scale Dataset for Enabling Foundation Models in Medical Robotics","license":"http://creativecommons.org/licenses/by/4.0/","headline":"A large open dataset of medical robot videos and motions from 49 institutions enables the first foundation model to complete suturing tasks end-to-end.","cross_cats":["cs.AI"],"primary_cat":"cs.RO","authors_text":"Aditya Amit Godbole, Alaa Eldin Abdelaal, Alan Kuntz, Alberto Arezzo, Alexander Dimitrakakis, Allison M. Okamura, Andrew Howe, Anqing Duan, Anton Deguet, Antony Goldenberg, Ariel Rodriguez Jimenez, Asmitha Sathya, Axel Krieger, Ayberk Acar, Benjamin Calmé, Brett Marinelli, Britton Jordan, Camilo Correa-Gallego, Carlo Alberto Ammirati, Carlos Vives, Chandra Kuchi, Chang Liu, Changwei Chen, Chelsea Finn, Chenhao Yu, Chetan Reddy Narayanaswamy, Chi Kit Ng, Chim Ho Yin, Christopher Nguan, Chung-Pang Wang, Cyrus Neary, Daniel Donoho, David Navarro-Alarcon, David Noonan, Dianye Huang, Diego Granero Marana, Doan Xuan Viet Pham, Dominic Jones, Erqi Wang, Eszter Lukács, Ethan Kilmer, Evan Hailey, Fabio Carrillo, Farong Wang, Farshid Alambeigi, Fausto Kang, Federica Barontini, Federico Lavagno, Fei Liu, Ferdinando Rodriguez y Baena, Filip Binkiewicz, Filippo Filicori, Francesco Marzola, Frederic Giraud, Giovanni Distefano, Giulio Dagnino, Guankun Wang, Hang Li, Han Zhang, Hao-Chih Lee, Hao Ding, Haoxin Chen, Hao Yang, Haoying Zhou, Hongjun Wu, Hongliang Ren, Howard Ji, Idris Sunmola, Jacob Delgado, Jad Fayad, Jeffrey Jopling, Jesse Haworth, Jianhao Su, Jianmin Ji, Jiaqi Shao, Jiawei Ge, Jie Ying Wu, Jinsong Lin, Ji Woong Kim, Jonathan C. 
DeLong, Jonathon Hawkins, Jordan Thompson, Joyce Zhang, Junlei Hu, Junlin Wu, Junyi Wang, Juo-Tung Chen, Justin Opfermann, Kaixuan Wu, Kaizhong Deng, Kengo Hayashi, Ken Goldberg, Ki Hwan Oh, Kristóf Takács, Kush Hari, Lalithkumar Seenivasan, Leonardo Borgioli, Lidian Wang, Lorenzo Mazza, Lucy Xiaoyang Shi, Luigi Muratore, Lukas Zbinden, Luohong Wu, Luoyao Kang, Mahdi Azizian, Marco Esposito, Maria Clara Morais, Mario Ferradosa, Martin Wagner, Mateusz Wójcikowski, Mathias Unberath, Matteo Pescio, Mattia Ballo, Mehmet K. Turkcan, Meiqing Cheng, Menglong Ye, Michael Kam, Michael Yip, Michał Naskręt, Miles Mannas, Milos Zefran, Min Cheng, Mingwu Su, Mohammad Rafiee Javazm, Nan Xiao, Nassir Navab, Nicola Cavalcanti, Nikola Budjak, Ning Zhong, Nithesh Kumar, Noah Barnes, Nural Yilmaz, Open-H-Embodiment Consortium: Nigel Nelson, Ortrun Hellig, Pablo David Aranda Rodriguez, Pascal Hansen, Patrick Thornycroft, Pei Liu, Peng Zhou, Peter Black, Peter Kazanzides, Philipp Fürnstahl, Pietro Valdastri, Preethi Satish, Przemysław Korzeniowski, Qihan Chen, Qingpeng Ding, Quan Vuong, Ran Ju, Rayan Younis, Ria Jain, Rui Ji, Ryan S. Yeung, Sabina Martyniak, Sareena Mann, Sayem Nazmuz Zaman, S. Duke Herrell, Sean D. Huver, Sebastian Bodenstedt, Septimiu E. 
Salcudean, Shane Farritor, Shelby Haworth, Shing Shin Cheng, Siddhartha Kapuria, Sihang Chen, Sonika Kiehler, Soofiyan Atar, Stamatia Giannarou, Stefanie Speidel, Tamás Haidegger, Tanner Watts, Tianqi Yang, Tito Porras, Tom Christian Olesch, Victoria Wu, Wanli Liuchen, Wei Wang, Wenxuan Xie, Wolfgang Wein, Xavier Giralt Ludevid, Xiangyu Chu, Xiao Liang, Xiaoqing Guo, Xinhao Chen, Xinxin Lin, Xiuli Zuo, Xueyan Mei, Xuyang Zhang, Yameng Zhang, Yanyong Zhang, Yidong Zhang, Yimeng Wu, Yinuo Yang, Yiqing Shen, Yu Chung Lee, Yuelin Zhang, Yun-hui Liu, Yunke Ao, Yunxi Tang, Yunye Xiao, Yu Sheng, Yu Tian, Zahi Fayad, Zhaoyang Jacopo Hu, Zhen Li, Zhongliang Jiang, Zhongyu Chen, Zhouyang Hong, Zih-Yun Sarah Chiu, Zijian Wu, Ziyang Chen, Ziyi Hao, Ziyi Wang, Zoe Soulé","submitted_at":"2026-04-22T19:05:17Z","abstract_excerpt":"Autonomous medical robots hold promise to improve patient outcomes, reduce provider workload, democratize access to care, and enable superhuman precision. However, autonomous medical robotics has been limited by a fundamental data problem: existing medical robotic datasets are small, single-embodiment, and rarely shared openly, restricting the development of foundation models that the field needs to advance. We introduce Open-H-Embodiment, the largest open dataset of medical robotic video with synchronized kinematics to date, spanning more than 49 institutions and multiple robotic platforms in"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We introduce Open-H-Embodiment, the largest open dataset of medical robotic video with synchronized kinematics to date, spanning more than 49 institutions and multiple robotic platforms... GR00T-H is the first open foundation vision-language-action model for medical robotics, which is the only evaluated model to achieve full end-to-end task completion on a structured suturing benchmark (25% of trials vs. 
0% for all others).","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The collected videos and kinematics from heterogeneous institutions and platforms are sufficiently standardized, free of systematic biases, and representative of real clinical variability to support training of generalizable foundation models that transfer to unseen robots and patients.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Open-H-Embodiment is the largest open multi-embodiment medical robotics dataset, used to train GR00T-H, the first open vision-language-action model that achieves end-to-end suturing completion where prior models fail.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A large open dataset of medical robot videos and motions from 49 institutions enables the first foundation model to complete suturing tasks end-to-end.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"03584d62b79760e06ba393c35d5dcc33bbfedd8307460da4e2120c516634b112"},"source":{"id":"2604.21017","kind":"arxiv","version":3},"verdict":{"id":"a7e54ea6-5aa4-40d7-a4b1-70e5d8140c8d","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-09T23:48:45.691752Z","strongest_claim":"We introduce Open-H-Embodiment, the largest open dataset of medical robotic video with synchronized kinematics to date, spanning more than 49 institutions and multiple robotic platforms... GR00T-H is the first open foundation vision-language-action model for medical robotics, which is the only evaluated model to achieve full end-to-end task completion on a structured suturing benchmark (25% of trials vs. 
0% for all others).","one_line_summary":"Open-H-Embodiment is the largest open multi-embodiment medical robotics dataset, used to train GR00T-H, the first open vision-language-action model that achieves end-to-end suturing completion where prior models fail.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The collected videos and kinematics from heterogeneous institutions and platforms are sufficiently standardized, free of systematic biases, and representative of real clinical variability to support training of generalizable foundation models that transfer to unseen robots and patients.","pith_extraction_headline":"A large open dataset of medical robot videos and motions from 49 institutions enables the first foundation model to complete suturing tasks end-to-end."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2604.21017/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"ai_meta_artifact","ran_at":"2026-05-21T13:38:59.376773Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-20T01:27:02.042032Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"f4837ec7c1d1d4026b6216353ea905485844bab030bac0329842d89ce3c4dff2"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}