Recognition: no theorem link
BiCoord: A Bimanual Manipulation Benchmark towards Long-Horizon Spatial-Temporal Coordination
Pith reviewed 2026-05-10 18:52 UTC · model grok-4.3
The pith
Bimanual robot policies fail on tasks requiring sustained tight coordination between two arms over long sequences.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BiCoord is a benchmark for long-horizon tightly coordinated bimanual manipulation that includes diverse tasks requiring continuous inter-arm dependency and dynamic role exchange across multiple sub-goals, together with quantitative metrics that evaluate coordination from temporal, spatial, and spatial-temporal perspectives. Experiments show that representative policies such as DP, RDT, Pi0, and OpenVLA-OFT struggle with the long-duration and highly coupled tasks.
What carries the argument
The BiCoord benchmark, built from tasks that enforce continuous arm-to-arm dependency and role exchange, paired with a metric suite that separately quantifies timing, spatial alignment, and their interaction.
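The paper's metric formulas are not reproduced on this page. As a rough sketch only, metrics along the three axes might look like the following, where every function name and formula is an assumption of this review rather than BiCoord's actual definition:

```python
import numpy as np

def temporal_metric(left_times, right_times):
    # Mean absolute gap between the moments the two arms complete
    # matched sub-goals (lower = tighter timing).
    return float(np.mean(np.abs(np.asarray(left_times) - np.asarray(right_times))))

def spatial_metric(left_pos, right_pos, target_offset):
    # Mean deviation of the inter-arm offset from a desired relative
    # position, over trajectories of shape (T, 3) (lower = tighter alignment).
    rel = np.asarray(right_pos) - np.asarray(left_pos)
    return float(np.mean(np.linalg.norm(rel - np.asarray(target_offset), axis=1)))

def spatial_temporal_metric(left_vel, right_vel):
    # Mean cosine similarity of per-step end-effector velocities, which
    # couples where *and* when the arms move (higher = more coupled).
    l, r = np.asarray(left_vel), np.asarray(right_vel)
    num = np.sum(l * r, axis=1)
    den = np.linalg.norm(l, axis=1) * np.linalg.norm(r, axis=1) + 1e-8
    return float(np.mean(num / den))
```

The point of separating the three is that a policy can score well on any one axis (right timing, or right relative pose, or correlated motion) while still failing the others.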
If this is right
- Methods for bimanual control must add explicit handling of long-term arm interdependencies rather than treating arms independently.
- The new metrics provide a concrete way to measure and compare progress toward better coordination.
- Long-horizon tasks with role exchange become a standard test for whether learned policies can sustain cooperative behavior across changing goals.
Where Pith is reading between the lines
- Improved performance on BiCoord tasks could support more reliable two-arm systems for assembly or household tasks that currently require human-level timing.
- The benchmark format could be extended to include contact-rich actions or sensor noise to test whether coordination gains survive real-world conditions.
Load-bearing premise
The chosen tasks and metrics capture the essential spatial-temporal coupling present in real-world bimanual actions.
What would settle it
A standard policy, trained without special coordination modules, that nevertheless scores well on all BiCoord tasks and metrics would show that the claimed fundamental challenges do not hold.
read the original abstract
Bimanual manipulation, i.e., the coordinated use of two robotic arms to complete tasks, is essential for achieving human-level dexterity in robotics. Recent simulation benchmarks, e.g., RoboTwin and RLBench2, have advanced data-driven learning for bimanual manipulation. However, existing tasks are short-horizon and only loosely coordinated, failing to capture the spatial-temporal coupling inherent in real-world bimanual behaviors. To address this gap, we introduce BiCoord, a benchmark for long-horizon and tightly coordinated bimanual manipulation. Specifically, BiCoord comprises diverse tasks that require continuous inter-arm dependency and dynamic role exchange across multiple sub-goals. Also, we propose a suite of quantitative metrics that evaluate coordination from temporal, spatial, and spatial-temporal perspectives, enabling systematic measurement of bimanual cooperation. Experimental results show that representative manipulation policies, e.g., DP, RDT, Pi0, and OpenVLA-OFT, struggle with long-duration and highly coupled tasks, revealing fundamental challenges in achieving long-horizon and tight coordination tasks. We hope BiCoord can serve as a foundation for studying long-horizon cooperative manipulation and inspire future research on coordination-aware robotic learning. All datasets, codes and supplements could be found at https://buaa-colalab.github.io/BiCoord/.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces BiCoord, a new benchmark for long-horizon bimanual manipulation consisting of tasks that require continuous inter-arm dependency and dynamic role exchange across multiple sub-goals. It defines a suite of metrics to quantify coordination along temporal, spatial, and spatial-temporal axes. Experiments on representative policies (DP, RDT, Pi0, OpenVLA-OFT) show poor performance on these tasks, which the authors interpret as evidence of fundamental challenges in long-horizon tight coordination.
Significance. If the BiCoord tasks and metrics demonstrably isolate spatial-temporal coupling beyond generic long-horizon difficulty, and if the reported performance gaps are robust, the benchmark would fill a genuine gap left by short-horizon suites such as RoboTwin and RLBench2. The public release of datasets, code, and supplements is a clear strength that supports reproducibility and future coordination-aware learning research.
major comments (2)
- [Experimental results and task design] The central claim that policy failures reveal specific challenges in 'long-horizon and tight coordination' (abstract and conclusion) rests on the assumption that BiCoord tasks impose continuous inter-arm dependency that cannot be reduced to extended horizon or increased subgoal count. No control tasks or ablations are described that preserve duration and subgoal structure while relaxing simultaneous dependency (e.g., sequential independent-arm execution). Without such isolation, the attribution of struggles to coordination demands rather than known long-horizon planning limitations remains unverified.
- [Metrics section] The proposed temporal/spatial/spatial-temporal metrics are introduced to measure bimanual cooperation, yet the manuscript provides no quantitative validation or baseline comparisons showing that these metrics distinguish tight coupling from loose coordination on the BiCoord tasks themselves.
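The control condition the first comment calls for can be sketched as a simple schedule transform. The task representation and helper below are hypothetical, assumed for illustration rather than taken from the paper:

```python
def sequentialize(schedule):
    """Turn a tightly coupled schedule of (left_action, right_action, duration)
    triples into a matched loose control: the same sub-goals and the same total
    duration, but the arms act one at a time instead of simultaneously."""
    control = []
    for left, right, duration in schedule:
        control.append((left, None, duration / 2))   # left arm acts, right idles
        control.append((None, right, duration / 2))  # right arm acts, left idles
    return control

# Hypothetical task: one arm steadies an object while the other acts on it.
coupled = [("hold_bowl", "pour_cup", 4.0), ("rotate_lid", "steady_jar", 2.0)]
loose = sequentialize(coupled)  # 4 steps, same 6.0 s total
```

A policy that fails the coupled schedule but passes its sequentialized twin would localize the difficulty in simultaneity rather than horizon length.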
minor comments (1)
- [Abstract and introduction] The abstract states that 'all datasets, codes and supplements could be found at https://buaa-colalab.github.io/BiCoord/'; the main text should include a concise description of benchmark usage, task parameterization, and metric computation formulas to reduce reliance on the external site.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We agree that the manuscript would be strengthened by additional experiments isolating coordination demands from long-horizon effects and by explicit validation of the proposed metrics. We address each major comment below and indicate the corresponding revisions.
read point-by-point responses
- Referee: [Experimental results and task design] The central claim that policy failures reveal specific challenges in 'long-horizon and tight coordination' (abstract and conclusion) rests on the assumption that BiCoord tasks impose continuous inter-arm dependency that cannot be reduced to extended horizon or increased subgoal count. No control tasks or ablations are described that preserve duration and subgoal structure while relaxing simultaneous dependency (e.g., sequential independent-arm execution). Without such isolation, the attribution of struggles to coordination demands rather than known long-horizon planning limitations remains unverified.
Authors: We acknowledge that the current submission lacks explicit control experiments to separate the effects of tight inter-arm coupling from general long-horizon planning difficulties. Although the BiCoord tasks are explicitly designed around continuous dependency and dynamic role exchange (as described in the task definitions), we did not report sequential variants. In the revised manuscript we will add such ablations: for each task we will include a matched sequential version in which the arms execute sub-goals independently while preserving total duration, number of sub-goals, and overall workspace constraints. Direct performance comparisons between the original tightly coupled tasks and these sequential controls will be reported to better attribute the observed policy failures. revision: yes
- Referee: [Metrics section] The proposed temporal/spatial/spatial-temporal metrics are introduced to measure bimanual cooperation, yet the manuscript provides no quantitative validation or baseline comparisons showing that these metrics distinguish tight coupling from loose coordination on the BiCoord tasks themselves.
Authors: We agree that quantitative validation is necessary to confirm the metrics capture tight versus loose coordination. In the revision we will compute the temporal, spatial, and spatial-temporal metrics on both the original BiCoord tasks and on relaxed variants that reduce coupling (e.g., by relaxing synchronization constraints while keeping the same sub-goal sequence). We will additionally report metric values obtained from human teleoperation demonstrations (high coordination) and from random or single-arm policies (low coordination) to demonstrate differentiation. These results will be included in an expanded metrics section. revision: yes
Circularity Check
Independent benchmark with empirical evaluation; no derivation chain present
full rationale
The paper introduces a new benchmark (BiCoord) consisting of tasks and metrics for bimanual coordination, then reports empirical performance of existing policies on those tasks. No equations, fitted parameters, predictions, or first-principles derivations are claimed. The central statements (existing benchmarks are short-horizon/loose; new tasks require continuous inter-arm dependency; policies struggle) are definitional descriptions of the benchmark plus experimental observations, not reductions of outputs to inputs by construction. Self-citations, if any, are not load-bearing for any result. This is a standard benchmark paper whose content is self-contained against external evaluation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Simulation environments can adequately model the spatial-temporal coupling of real-world bimanual behaviors.
invented entities (1)
- BiCoord benchmark tasks and coordination metrics (no independent evidence)
Reference graph
Works this paper leans on
- [1]
- [2] Kevin Black, Noah Brown, Danny Driess, Adnan Esmail, Michael Equi, Chelsea Finn, Niccolo Fusai, Lachy Groom, Karol Hausman, Brian Ichter, et al. 2024. π0: A Vision-Language-Action Flow Model for General Robot Control. arXiv preprint arXiv:2410.24164.
- [3] Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, et al. 2023. RT-2: Vision-language-action models transfer web knowledge to robotic control. arXiv preprint arXiv:2307.15818.
- [4] Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, et al. 2022. RT-1: Robotics Transformer for real-world control at scale. arXiv preprint arXiv:2212.06817.
- [5] Qingwen Bu, Jisong Cai, Li Chen, Xiuqi Cui, Yan Ding, Siyuan Feng, Shenyuan Gao, Xindong He, Xu Huang, Shu Jiang, et al. 2025. AgiBot World Colosseo: A large-scale manipulation platform for scalable and intelligent embodied systems. arXiv preprint arXiv:2503.06669.
- [6] Konstantinos Chatzilygeroudis, Bernardo Fichera, Ilaria Lauzana, Fanjun Bu, Kunpeng Yao, Farshad Khadivar, and Aude Billard. 2020. Benchmark for bimanual robotic manipulation of semi-deformable objects. IEEE Robotics and Automation Letters 5, 2 (2020), 2443–2450.
- [7] Tianxing Chen, Zanxin Chen, Baijun Chen, Zijian Cai, Yibin Liu, Zixuan Li, Qiwei Liang, Xianliang Lin, Yiheng Ge, Zhenyu Gu, et al. 2025. RoboTwin 2.0: A scalable data generator and benchmark with strong domain randomization for robust bimanual robotic manipulation. arXiv preprint arXiv:2506.18088.
- [8] Tianxing Chen, Yao Mu, Zhixuan Liang, Zanxin Chen, Shijia Peng, Qiangyu Chen, Mingkun Xu, Ruizhen Hu, Hongyuan Zhang, Xuelong Li, et al. 2025. G3Flow: Generative 3D semantic flow for pose-aware and generalizable object manipulation. In Proceedings of the Computer Vision and Pattern Recognition Conference. 1735–1744.
- [9] Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, and Shuran Song. 2023. Diffusion Policy: Visuomotor policy learning via action diffusion. The International Journal of Robotics Research (2023), 02783649241273668.
- [10] Sudeep Dasari, Jianren Wang, Joyce Hong, Shikhar Bahl, Yixin Lin, Austin S Wang, Abitha Thankaraj, Karanbir Singh Chahal, Berk Calli, Saurabh Gupta, et al. 2021. RB2: Robotic Manipulation Benchmarking with a Twist. NeurIPS 2021 Datasets and Benchmarks Track.
- [12] Frederik Ebert, Yanlai Yang, Karl Schmeckpeper, Bernadette Bucher, Georgios Georgakis, Kostas Daniilidis, Chelsea Finn, and Sergey Levine. 2021. Bridge Data: Boosting generalization of robotic skills with cross-domain datasets. arXiv preprint arXiv:2109.13396.
- [13] Zicong Fan, Omid Taheri, Dimitrios Tzionas, Muhammed Kocabas, Manuel Kaufmann, Michael J Black, and Otmar Hilliges. 2023. ARCTIC: A dataset for dexterous bimanual hand-object manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 12943–12954.
- [14] Zipeng Fu, Tony Z Zhao, and Chelsea Finn. 2024. Mobile ALOHA: Learning bimanual mobile manipulation with low-cost whole-body teleoperation. arXiv preprint arXiv:2401.02117.
- [15] Jianfeng Gao, Xiaoshu Jin, Franziska Krebs, Noémie Jaquier, and Tamim Asfour. 2024. Bi-KVIL: Keypoints-based visual imitation learning of bimanual manipulation tasks. In 2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 16850–16857.
- [17] Haoran Geng, Feishi Wang, Songlin Wei, Yuyang Li, Bangjun Wang, Boshi An, Charlie Tianyue Cheng, Haozhe Lou, Peihao Li, Yen-Jen Wang, et al. 2025. RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning. arXiv preprint arXiv:2504.18904.
- [18] Jennifer Grannen, Yilin Wu, Brandon Vu, and Dorsa Sadigh. 2023. Stabilize to Act: Learning to coordinate for bimanual manipulation. In Conference on Robot Learning. PMLR, 563–576.
- [19] Markus Grotz, Mohit Shridhar, Yu-Wei Chao, Tamim Asfour, and Dieter Fox. 2024. PerAct2: Benchmarking and learning for robotic bimanual manipulation tasks. In CoRL 2024 Workshop on Whole-body Control and Bimanual Manipulation: Applications in Humanoids and Beyond.
- [21]
- [22]
- [23] Stephen James, Zicong Ma, David Rovick Arrojo, and Andrew J Davison. 2020. RLBench: The robot learning benchmark & learning environment. IEEE Robotics and Automation Letters 5, 2 (2020), 3019–3026.
- [24] Jian-Jian Jiang, Xiao-Ming Wu, Yi-Xiang He, Ling-An Zeng, Yi-Lin Wei, Dandan Zhang, and Wei-Shi Zheng. 2025. Rethinking bimanual robotic manipulation: Learning with decoupled interaction framework. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 12427–12437.
- [25] Tsung-Wei Ke, Nikolaos Gkanatsios, and Katerina Fragkiadaki. 2025. 3D Diffuser Actor: Policy Diffusion with 3D Scene Representations. In Conference on Robot Learning. PMLR, 1949–1974.
- [26] Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, et al. 2024. DROID: A large-scale in-the-wild robot manipulation dataset. In Robotics: Science and Systems.
- [27] Moo Jin Kim, Chelsea Finn, and Percy Liang. 2025. Fine-tuning vision-language-action models: Optimizing speed and success. arXiv preprint arXiv:2502.19645.
- [28] Moo Jin Kim, Karl Pertsch, Siddharth Karamcheti, Ted Xiao, Ashwin Balakrishna, Suraj Nair, Rafael Rafailov, Ethan P Foster, Pannag R Sanketi, Quan Vuong, et al. [n. d.]. OpenVLA: An Open-Source Vision-Language-Action Model. In 8th Annual Conference on Robot Learning.
- [29]
- [30] Qixiu Li, Yaobo Liang, Zeyu Wang, Lin Luo, Xi Chen, Mozheng Liao, Fangyun Wei, Yu Deng, Sicheng Xu, Yizhong Zhang, et al. 2024. CogACT: A foundational vision-language-action model for synergizing cognition and action in robotic manipulation. arXiv preprint arXiv:2411.19650.
- [31]
- [32] Yunfei Li, Chaoyi Pan, Huazhe Xu, Xiaolong Wang, and Yi Wu. 2023. Efficient bimanual handover and rearrangement via symmetry-aware actor-critic learning. In 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 3867–3874.
- [33] Zhixuan Liang, Yao Mu, Mingyu Ding, Fei Ni, Masayoshi Tomizuka, and Ping Luo. 2023. AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners. In International Conference on Machine Learning. PMLR, 20725–20745.
- [34] Zhixuan Liang, Yao Mu, Hengbo Ma, Masayoshi Tomizuka, Mingyu Ding, and Ping Luo. 2024. SkillDiffuser: Interpretable hierarchical planning via skill abstractions in diffusion-based task execution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 16467–16476.
- [35] Zhixuan Liang, Yao Mu, Yixiao Wang, Tianxing Chen, Wenqi Shao, Wei Zhan, Masayoshi Tomizuka, Ping Luo, and Mingyu Ding. 2025. DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation. In Proceedings of the Computer Vision and Pattern Recognition Conference. 1745–1755.
- [36] Bo Liu, Yifeng Zhu, Chongkai Gao, Yihao Feng, Qiang Liu, Yuke Zhu, and Peter Stone. 2023. LIBERO: Benchmarking knowledge transfer for lifelong robot learning. Advances in Neural Information Processing Systems 36 (2023), 44776–44791.
- [37] I-Chun Arthur Liu, Jason Chen, Gaurav S Sukhatme, and Daniel Seita. 2025. D-CODA: Diffusion for Coordinated Dual-Arm Data Augmentation. In Conference on Robot Learning. PMLR, 3569–3588.
- [38] I-Chun Arthur Liu, Sicheng He, Daniel Seita, and Gaurav S Sukhatme. 2025. VoxAct-B: Voxel-Based Acting and Stabilizing Policy for Bimanual Manipulation. In Conference on Robot Learning. PMLR, 4354–4370.
- [39] Junjia Liu, Yiting Chen, Zhipeng Dong, Shixiong Wang, Sylvain Calinon, Miao Li, and Fei Chen. 2022. Robot cooking with stir-fry: Bimanual non-prehensile manipulation of semi-fluid objects. IEEE Robotics and Automation Letters 7, 2 (2022), 5159–5166.
- [40] Songming Liu, Lingxuan Wu, Bangguo Li, Hengkai Tan, Huayu Chen, Zhengyi Wang, Ke Xu, Hang Su, and Jun Zhu. 2024. RDT-1B: A diffusion foundation model for bimanual manipulation. arXiv preprint arXiv:2410.07864.
- [41] Guanxing Lu, Tengbo Yu, Haoyuan Deng, Season Si Chen, Yansong Tang, and Ziwei Wang. 2025. AnyBimanual: Transferring unimanual policy for general bimanual manipulation. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 13662–13672.
- [42] Qi Lv, Hao Li, Xiang Deng, Rui Shao, Yinchuan Li, Jianye Hao, Longxiang Gao, Michael Yu Wang, and Liqiang Nie. 2025. Spatial-temporal graph diffusion policy with kinematic modeling for bimanual robotic manipulation. In Proceedings of the Computer Vision and Pattern Recognition Conference. 17394–17404.
- [43] Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller, Nikita Rudin, Arthur Allshire, Ankur Handa, et al. 2021. Isaac Gym: High Performance GPU Based Physics Simulation For Robot Learning. In NeurIPS Datasets and Benchmarks.
- [45] Oier Mees, Lukas Hermann, Erick Rosete-Beas, and Wolfram Burgard. 2022. CALVIN: A benchmark for language-conditioned policy learning for long-horizon robot manipulation tasks. IEEE Robotics and Automation Letters 7, 3 (2022), 7327–7334.
- [46] Mayank Mittal, Calvin Yu, Qinxi Yu, Jingzhou Liu, Nikita Rudin, David Hoeller, Jia Lin Yuan, Ritvik Singh, Yunrong Guo, Hammad Mazhar, et al. 2023. Orbit: A unified simulation framework for interactive robot learning environments. IEEE Robotics and Automation Letters 8, 6 (2023), 3740–3747.
- [47] Yao Mu, Tianxing Chen, Zanxin Chen, Shijia Peng, Zhiqian Lan, Zeyu Gao, Zhixuan Liang, Qiaojun Yu, Yude Zou, Mingkun Xu, et al. 2025. RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2025).
- [48] Soroush Nasiriany, Abhiram Maddukuri, Lance Zhang, Adeet Parikh, Aaron Lo, Abhishek Joshi, Ajay Mandlekar, and Yuke Zhu. 2024. RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots. In Robotics: Science and Systems (RSS).
- [49] Abby O’Neill, Abdul Rehman, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, et al. 2024. Open X-Embodiment: Robotic learning datasets and RT-X models. In 2024 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 6892–6903.
- [50]
- [51] Christian Smith, Yiannis Karayiannidis, Lazaros Nalpantidis, Xavi Gratal, Peng Qi, Dimos V Dimarogonas, and Danica Kragic. 2012. Dual arm manipulation—A survey. Robotics and Autonomous Systems 60, 10 (2012), 1340–1353.
- [52] Stone Tao, Fanbo Xiang, Arth Shukla, Yuzhe Qin, Xander Hinrichsen, Xiaodi Yuan, Chen Bao, Xinsong Lin, Yulin Liu, Tse-kai Chan, Yuan Gao, Xuanlin Li, Tongzhou Mu, Nan Xiao, Arnav Gurha, Zhiao Huang, Roberto Calandra, Rui Chen, Shan Luo, and Hao Su. 2024. ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI. arXiv preprint.
- [53] Octo Model Team, Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu, et al. 2024. Octo: An open-source generalist robot policy. arXiv preprint arXiv:2405.12213.
- [54] Emanuel Todorov, Tom Erez, and Yuval Tassa. 2012. MuJoCo: A physics engine for model-based control. In IEEE/RSJ International Conference on Intelligent Robots and Systems. 5026–5033.
- [55] Homer Rich Walke, Kevin Black, Tony Z Zhao, Quan Vuong, Chongyi Zheng, Philippe Hansen-Estruch, Andre Wang He, Vivek Myers, Moo Jin Kim, Max Du, et al. 2023. BridgeData V2: A dataset for robot learning at scale. In Conference on Robot Learning. PMLR, 1723–1736.
- [56] Chenxi Wang, Hongjie Fang, Hao-Shu Fang, and Cewu Lu. 2024. RISE: 3D perception makes real-world robot imitation simple and effective. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2870–2877.
- [57] Dian Wang, Colin Kohler, Xupeng Zhu, Mingxi Jia, and Robert Platt. 2022. BulletArm: An open-source robotic manipulation benchmark and learning framework. In The International Symposium of Robotics Research. Springer, 335–350.
- [58] Junjie Wen, Yichen Zhu, Jinming Li, Zhibin Tang, Chaomin Shen, and Feifei Feng. 2025. DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control. In Conference on Robot Learning. PMLR, 3094–3114.
- [59] Junjie Wen, Yichen Zhu, Jinming Li, Minjie Zhu, Zhibin Tang, Kun Wu, Zhiyuan Xu, Ning Liu, Ran Cheng, Chaomin Shen, Yaxin Peng, Feifei Feng, and Jian Tang. 2025. TinyVLA: Toward Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation. IEEE Robotics and Automation Letters 10, 4 (2025), 3988–3995. doi:10.1109/LRA.2025.3544909.
- [61]
- [62] Fanbo Xiang, He Wang, Yuzhe Qin, Austin Wang, Hejia Zhang, Yikuan Xia, Binbin Lin, Yuzhe Wu, Chengcheng Tang, Yixin Zhu, Li Yi, Leonidas J. Guibas, and Hao Su. 2020. SAPIEN: A SimulAted Part-based Interactive ENvironment. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020).
- [63]
- [64] Seonghyeon Ye, Joel Jang, Byeongguk Jeon, Se June Joo, Jianwei Yang, Baolin Peng, Ajay Mandlekar, Reuben Tan, Yu-Wei Chao, Bill Yuchen Lin, et al. [n. d.]. Latent Action Pretraining From Videos. In CoRL 2024 Workshop on Whole-body Control and Bimanual Manipulation: Applications in Humanoids and Beyond.
- [65] Tengbo Yu, Guanxing Lu, Zaijia Yang, Haoyuan Deng, Season Si Chen, Jiwen Lu, Wenbo Ding, Guoqiang Hu, Yansong Tang, and Ziwei Wang. 2025. ManiGaussian++: General robotic bimanual manipulation with hierarchical Gaussian world model. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 12232–12239.
- [66] Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Karol Hausman, Chelsea Finn, and Sergey Levine. 2020. Meta-World: A benchmark and evaluation for multi-task and meta reinforcement learning. In Conference on Robot Learning. PMLR, 1094–1100.
- [67] Kevin Zakka, Philipp Wu, Laura Smith, Nimrod Gileadi, Taylor Howell, Xue Bin Peng, Sumeet Singh, Yuval Tassa, Pete Florence, Andy Zeng, et al. 2023. RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning. In Conference on Robot Learning. PMLR, 2975–2994.
- [68] Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, and Huazhe Xu. 2024. 3D Diffusion Policy. arXiv preprint arXiv:2403.03954.
- [69] Tony Z Zhao, Vikash Kumar, Sergey Levine, and Chelsea Finn. 2023. Learning fine-grained bimanual manipulation with low-cost hardware. arXiv preprint arXiv:2304.13705.
- [70]
- [71] Yuke Zhu, Josiah Wong, Ajay Mandlekar, and Roberto Martín-Martín. 2020. robosuite: A Modular Simulation Framework and Benchmark for Robot Learning. arXiv preprint arXiv:2009.12293.