{"work":{"id":"86320616-05fa-4c08-8bc0-4377e53bdba5","openalex_id":null,"doi":null,"arxiv_id":"2412.13877","raw_key":null,"title":"RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation","authors":null,"authors_text":"Kun Wu, Chengkai Hou, Jiaming Liu, Zhengping Che, Xiaozhu Ju, Zhuqin Yang","year":2024,"venue":"cs.RO","abstract":"In this paper, we introduce RoboMIND (Multi-embodiment Intelligence Normative Data for Robot Manipulation), a dataset containing 107k demonstration trajectories across 479 diverse tasks involving 96 object classes. RoboMIND is collected through human teleoperation and encompasses comprehensive robotic-related information, including multi-view observations, proprioceptive robot state information, and linguistic task descriptions. To ensure data consistency and reliability for imitation learning, RoboMIND is built on a unified data collection platform and a standardized protocol, covering four distinct robotic embodiments: the Franka Emika Panda, the UR5e, the AgileX dual-arm robot, and a humanoid robot with dual dexterous hands. Our dataset also includes 5k real-world failure demonstrations, each accompanied by detailed causes, enabling failure reflection and correction during policy learning. Additionally, we created a digital twin environment in the Isaac Sim simulator, replicating the real-world tasks and assets, which facilitates the low-cost collection of additional training data and enables efficient evaluation. To demonstrate the quality and diversity of our dataset, we conducted extensive experiments using various imitation learning methods for single-task settings and state-of-the-art Vision-Language-Action (VLA) models for multi-task scenarios. By leveraging RoboMIND, the VLA models achieved high manipulation success rates and demonstrated strong generalization capabilities. To the best of our knowledge, RoboMIND is the largest multi-embodiment teleoperation dataset collected on a unified platform, providing large-scale and high-quality robotic training data. Our project is at https://x-humanoid-robomind.github.io/.","external_url":"https://arxiv.org/abs/2412.13877","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-05-21T03:39:29.629675+00:00","pith_arxiv_id":"2412.13877","created_at":"2026-05-10T10:49:56.034927+00:00","updated_at":"2026-06-05T21:23:00.469572+00:00","title_quality_ok":true,"display_title":"RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation","render_title":"RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation"},"hub":{"state":{"work_id":"86320616-05fa-4c08-8bc0-4377e53bdba5","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external citations","pith_inbound_count":27,"external_cited_by_count":null,"distinct_field_count":4,"first_pith_cited_at":"2025-03-09T15:40:29+00:00","last_pith_cited_at":"2026-05-20T17:10:31+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-06-06T11:30:41.633598+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"dataset","n":8},{"context_role":"background","n":7}],"polarity_counts":[{"context_polarity":"background","n":7},{"context_polarity":"use_dataset","n":7},{"context_polarity":"unclear","n":1}],"runs":{},"summary":{},"graph":{},"authors":[]}}