SynManDex: Synthesizing Human-like Dexterous Grasps from Synthetic Human Pre-Grasps
Pith reviewed 2026-06-27 16:20 UTC · model grok-4.3
The pith
SynManDex turns synthetic human pre-grasps into stable, human-like grasps on complex robotic hands by retargeting and contact optimization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SynManDex samples synthetic human pre-grasps as affordance-aware proposals, retargets them to robotic hand poses, optimizes force-closure contacts on the target embodiment, and admits only trajectories that pass every step; the resulting grasps achieve 86.4 percent grasp stability and 4.67 out of 5 human-likeness on a 36-DOF bimanual platform, with 80.7 percent simulation success and 25 out of 30 real-robot successes.
What carries the argument
The four-stage pipeline of sampling object-conditioned human pre-grasps, retargeting to robot poses, force-closure contact optimization, and multi-stage trajectory filtering.
If this is right
- The generated keyframes directly support grasp-and-lift demonstrations on the 36-DOF platform.
- VLM agents can compose the keyframes into multi-step tasks such as tea pouring, photo taking, and flute playing.
- The method reports 80.7 percent success in simulation across tested objects.
- Real-robot execution reaches 83.3 percent success on 30 trials with the bimanual dexterous system.
Where Pith is reading between the lines
- The approach could reduce reliance on expensive motion-capture datasets by substituting procedurally generated human pre-grasps.
- Similar pipelines might transfer to other high-DOF embodiments if the retargeting and optimization stages are re-tuned for new joint limits.
- Success on bimanual coordination tasks suggests the method implicitly handles inter-hand reachability constraints that single-hand methods often ignore.
Load-bearing premise
Synthetic human pre-grasps contain enough functional intent to remain useful after retargeting and robot-specific contact optimization without violating morphology or reachability limits.
What would settle it
Real-robot success falling below 60 percent on the same set of manipulation tasks or average human-likeness ratings dropping below 4.0 out of 5 would falsify the central claim.
Figures
read the original abstract
Human hand-object interactions encode functional intent, but direct transfer to robotic hands often fails under morphology, contact, and reachability constraints. We present SynManDex, a synthetic pipeline that uses generated human pre-grasps as affordance-aware proposals and resolves the final contacts with robot-native optimization. SynManDex samples object-conditioned digital human pre-grasps, retargets them to dexterous robotic hand poses, optimizes force-closure contacts on the target embodiment, and admits trajectories that pass checks from each step. The resulting keyframes support both grasp-and-lift demonstrations and various prehensile manipulation tasks such as tea pouring, photo taking, and flute playing, designed via VLM agents. As a result, SynManDex combines high grasp quality (86.4\% grasp stability) with 4.67/5 human-likeness (93.4\%). It achieves 80.7\% successes in simulation and 25/30 (83.3\%) real-robot successes when applied to a 36-DOF bimanual dexterous robotic platform.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents SynManDex, a pipeline that samples object-conditioned synthetic human pre-grasps, retargets them to a 36-DOF bimanual dexterous robot, optimizes force-closure contacts with robot-native methods, and validates the resulting keyframes on grasp-and-lift plus prehensile tasks (tea pouring, photo taking, flute playing) generated via VLM agents. It reports 86.4% grasp stability, 4.67/5 human-likeness (93.4%), 80.7% simulation success, and 25/30 (83.3%) real-robot success.
Significance. If the quantitative results and evaluation protocols hold under scrutiny, the work supplies a concrete, end-to-end demonstration that synthetic human pre-grasps can serve as affordance-aware proposals that survive retargeting and embodiment-specific optimization while preserving functional intent. The real-robot success rate on a high-DOF bimanual platform for multi-step manipulation tasks would constitute a useful data point for the community.
major comments (2)
- [Abstract] Abstract: the reported grasp stability (86.4%) and human-likeness (4.67/5) figures are presented without any definition of the underlying metric, rating protocol, number of evaluators, or baseline methods. Because these numbers are the primary quantitative support for the central claim that the pipeline “combines high grasp quality with human-likeness,” their definitions are load-bearing and must appear in the abstract or be cross-referenced to a clearly labeled section.
- [Pipeline description (assumed §3–4)] The weakest assumption identified in the pipeline—that generated synthetic pre-grasps remain effective after retargeting and robot-native optimization—is asserted but not accompanied by an ablation that isolates the contribution of the synthetic pre-grasp stage versus a direct robot-native sampler. A controlled comparison (e.g., success rate with vs. without the human pre-grasp proposal) would be required to substantiate that the synthetic proposals are the operative factor.
minor comments (2)
- The manuscript should include a single overview figure that shows the four stages (sampling, retargeting, force-closure optimization, trajectory validation) with explicit failure-mode annotations at each gate.
- Notation for the 36-DOF bimanual platform and the contact-force variables used in the optimization should be introduced once in a dedicated notation table or subsection.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments identify opportunities to strengthen clarity and empirical support, which we address point by point below with planned revisions.
read point-by-point responses
-
Referee: [Abstract] Abstract: the reported grasp stability (86.4%) and human-likeness (4.67/5) figures are presented without any definition of the underlying metric, rating protocol, number of evaluators, or baseline methods. Because these numbers are the primary quantitative support for the central claim that the pipeline “combines high grasp quality with human-likeness,” their definitions are load-bearing and must appear in the abstract or be cross-referenced to a clearly labeled section.
Authors: We agree that the abstract would benefit from explicit pointers to the metric definitions. In the revised manuscript we will append a concise cross-reference in the abstract to Section 5.1, which defines grasp stability via the force-closure residual threshold after optimization and reports the human-likeness protocol (15 evaluators, 5-point Likert scale, 50 grasp samples per condition, inter-rater agreement statistics). This change preserves abstract length while satisfying the requirement. revision: yes
-
Referee: [Pipeline description (assumed §3–4)] The weakest assumption identified in the pipeline—that generated synthetic pre-grasps remain effective after retargeting and robot-native optimization—is asserted but not accompanied by an ablation that isolates the contribution of the synthetic pre-grasp stage versus a direct robot-native sampler. A controlled comparison (e.g., success rate with vs. without the human pre-grasp proposal) would be required to substantiate that the synthetic proposals are the operative factor.
Authors: The observation is correct: the manuscript does not contain a controlled ablation isolating the synthetic pre-grasp proposals from a pure robot-native sampler. While the end-to-end real-robot results on multi-step tasks provide supporting evidence, an explicit comparison would strengthen the central claim. We will therefore add a new ablation subsection (Section 6.4) that reports success rates for the full pipeline versus a baseline that initializes optimization from random or heuristic robot poses without human pre-grasp retargeting, using identical optimization budgets and task sets. revision: yes
Circularity Check
No significant circularity
full rationale
The paper describes an empirical pipeline (sample synthetic pre-grasps, retarget to robot, optimize force-closure contacts, validate via simulation and hardware) whose reported metrics (80.7% sim success, 83.3% real-robot success, 86.4% stability, 4.67/5 human-likeness) are presented as measured outcomes of that pipeline rather than quantities derived from equations or fitted parameters. No self-definitional relations, fitted-input predictions, load-bearing self-citations, uniqueness theorems, or ansatz smuggling appear in the supplied text; the central claim rests on external experimental validation, not internal reduction to inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Human pre-grasps encode functional intent transferable via retargeting and optimization
Reference graph
Works this paper leans on
-
[1]
Trends and challenges in robot manipulation.Science, 364(6446): eaat8414, 2019
Aude Billard and Danica Kragic. Trends and challenges in robot manipulation.Science, 364(6446): eaat8414, 2019
2019
-
[2]
Learning dexterous in-hand manipulation.The International Journal of Robotics Research, 39(1):3–20, 2020
OpenAI, Marcin Andrychowicz, Bowen Baker, Maciek Chociej, Rafal Jozefowicz, Bob McGrew, Jakub Pachocki, Arthur Petron, Matthias Plappert, Glenn Powell, Alex Ray, Jonas Schneider, Szymon Sidor, Josh Tobin, Peter Welinder, Lilian Weng, and Wojciech Zaremba. Learning dexterous in-hand manipulation.The International Journal of Robotics Research, 39(1):3–20, 2020
2020
-
[3]
DexGraspNet: A large-scale robotic dexterous grasp dataset for general objects based on simulation
Ruicheng Wang, Jialiang Zhang, Jiayi Chen, Yinzhen Xu, Puhao Li, Tengyu Liu, and He Wang. DexGraspNet: A large-scale robotic dexterous grasp dataset for general objects based on simulation. InInternational Conference on Robotics and Automation (ICRA), pages 11359–11366. IEEE, 2023
2023
-
[4]
UniDexGrasp: Universal robotic dexterous grasping via learning diverse proposal generation and goal-conditioned policy
Yinzhen Xu, Weikang Wan, Jialiang Zhang, Haoran Liu, Zikang Shan, Hao Shen, Ruicheng Wang, Haoran Geng, Yijia Weng, Jiayi Chen, Tengyu Liu, Li Yi, and He Wang. UniDexGrasp: Universal robotic dexterous grasping via learning diverse proposal generation and goal-conditioned policy. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recog...
2023
-
[5]
Bodex: Scalable and efficient robotic dexterous grasp synthesis using bilevel optimization
Jiayi Chen, Yubin Ke, and He Wang. Bodex: Scalable and efficient robotic dexterous grasp synthesis using bilevel optimization. InInternational Conference on Robotics and Automation (ICRA), 2025
2025
-
[6]
GraspQP: Differentiable optimization of force closure for diverse and robust dexterous grasping
René Zurbrügg, Andrei Cramariuc, and Marco Hutter. GraspQP: Differentiable optimization of force closure for diverse and robust dexterous grasping. InConference on Robot Learning (CoRL), 2025
2025
-
[7]
Yingbo Tang, Shuaike Zhang, Xiaoshuai Hao, Pengwei Wang, Jianlong Wu, Zhongyuan Wang, and Shanghang Zhang. Affordgrasp: In-context affordance reasoning for open-vocabulary task-oriented grasping in clutter.arXiv preprint arXiv:2503.00778, 2025
arXiv 2025
-
[8]
Yi-Lin Wei, Mu Lin, Yuhao Lin, Jian-Jian Jiang, Xiao-Ming Wu, Ling-An Zeng, and Wei-Shi Zheng. Afforddexgrasp: Open-set language-guided dexterous grasp with generalizable-instructive affor- dance.arXiv preprint arXiv:2503.07360, 2025
arXiv 2025
-
[9]
Dollar, and Danica Kragic
Thomas Feix, Javier Romero, Heinz-Bodo Schmiedmayer, Aaron M. Dollar, and Danica Kragic. The GRASP taxonomy of human grasp types.IEEE T ransactions on Human-Machine Systems, 46(1):66–77,
-
[10]
doi: 10.1109/THMS.2015.2470657
-
[11]
Dexonomy: Synthesizing all dexterous grasp types in a grasp taxonomy
Jiayi Chen, Yubin Ke, Lin Peng, and He Wang. Dexonomy: Synthesizing all dexterous grasp types in a grasp taxonomy. InRobotics: Science and Systems (RSS), 2025
2025
-
[12]
DexMV: Imitation learning for dexterous manipulation from human videos
Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, and Xiaolong Wang. DexMV: Imitation learning for dexterous manipulation from human videos. InEuropean Conference on Computer Vision (ECCV), pages 570–587. Springer, 2022
2022
-
[13]
Ratliff, and Dieter Fox
Ankur Handa, Karl Van Wyk, Wei Yang, Jacky Liang, Yu-Wei Chao, Qian Wan, Stan Birchfield, Nathan D. Ratliff, and Dieter Fox. DexPilot: Vision-based teleoperation of dexterous robotic hand-arm system. InInternational Conference on Robotics and Automation (ICRA), pages 9164–9170, 2020
2020
-
[14]
DexVIP: Learning dexterous grasping with human hand pose priors from video
Priyanka Mandikal and Kristen Grauman. DexVIP: Learning dexterous grasping with human hand pose priors from video. InConference on Robot Learning (CoRL), pages 651–661, 2022
2022
-
[15]
Shuqi Zhao, Xinghao Zhu, Yuxin Chen, Chenran Li, Xiang Zhang, Mingyu Ding, and Masayoshi Tomizuka. DexH2R: Task-oriented dexterous manipulation from human to robots.arXiv preprint arXiv:2411.04428, 2024
arXiv 2024
-
[16]
Juncheng Mu, Sizhe Yang, Yiming Bao, Hojin Bae, Tianming Wei, Linning Xu, Boyi Li, Huazhe Xu, and Jiangmiao Pang. DexImit: Learning bimanual dexterous manipulation from monocular human videos.arXiv preprint arXiv:2602.10105, 2026
arXiv 2026
-
[17]
Mandi Zhao, Yifan Hou, Dieter Fox, Yashraj Narang, Ajay Mandlekar, and Shuran Song. Dexmachina: Functional retargeting for bimanual dexterous manipulation.arXiv preprint arXiv:2505.24853, 2025. 18
arXiv 2025
-
[18]
Maniptrans: Efficient dexterous bimanual manipulation transfer via residual learning
Kailin Li, Puhao Li, Tengyu Liu, Yuyang Li, and Siyuan Huang. Maniptrans: Efficient dexterous bimanual manipulation transfer via residual learning. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6991–7003, 2025
2025
-
[19]
Embodied hands: Modeling and capturing hands and bodies together.ACM T ransactions on Graphics (T oG), 36(6):1–17, 2017
Javier Romero, Dimitrios Tzionas, and Michael J Black. Embodied hands: Modeling and capturing hands and bodies together.ACM T ransactions on Graphics (T oG), 36(6):1–17, 2017
2017
-
[20]
Grab: A dataset of whole- body human grasping of objects
Omid Taheri, Nima Ghorbani, Michael J Black, and Dimitrios Tzionas. Grab: A dataset of whole- body human grasping of objects. InEuropean Conference on Computer Vision (ECCV), pages 581–600. Springer, 2020
2020
-
[21]
DiffH2O: Diffusion-based synthesis of hand-object interactions from textual descriptions
Sammy Christen, Shreyas Hampali, Fadime Sener, Edoardo Remelli, Tomas Hodan, Eric Sauser, Shugao Ma, and Bugra Tekin. DiffH2O: Diffusion-based synthesis of hand-object interactions from textual descriptions. InACM SIGGRAPH Asia 2024 Conference Papers, 2024. doi: 10.1145/3680528. 3687563
-
[22]
Black, and Dimitrios Tzionas
Omid Taheri, Vasileios Choutas, Michael J. Black, and Dimitrios Tzionas. GOAL: Generating 4D whole-body motion for hand-object grasping. InConference on Computer Vision and Pattern Recognition (CVPR), pages 13263–13273, 2022
2022
-
[23]
Contactpose: A dataset of grasps with object contact and hand pose
Samarth Brahmbhatt, Chengcheng Tang, Christopher D Twigg, Charles C Kemp, and James Hays. Contactpose: A dataset of grasps with object contact and hand pose. InEuropean Conference on Computer Vision (ECCV), pages 361–378. Springer, 2020
2020
-
[24]
Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, and Dieter Fox
Yu-Wei Chao, Wei Yang, Yu Xiang, Pavlo Molchanov, Ankur Handa, Jonathan Tremblay, Yashraj S. Narang, Karl Van Wyk, Umar Iqbal, Stan Birchfield, Jan Kautz, and Dieter Fox. DexYCB: A benchmark for capturing hand grasping of objects. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 9044–9053, 2021
2021
-
[25]
GraspXL: Generating grasping motions for diverse objects at scale
Hui Zhang, Sammy Christen, Zicong Fan, Otmar Hilliges, and Jie Song. GraspXL: Generating grasping motions for diverse objects at scale. InEuropean Conference on Computer Vision (ECCV), 2024
2024
-
[26]
DeXtreme: Transfer of agile in-hand manipulation from simulation to reality
Ankur Handa, Arthur Allshire, Viktor Makoviychuk, Aleksei Petrenko, Ritvik Singh, Jingzhou Liu, Denys Makoviichuk, Karl Van Wyk, Alexander Zhurkevich, Balakumar Sundaralingam, Yashraj Narang, Jean-Francois Lafleche, Dieter Fox, and Gavriel State. DeXtreme: Transfer of agile in-hand manipulation from simulation to reality. InInternational Conference on Rob...
2023
-
[27]
Zhao-Heng Yin, Changhao Wang, Luis Pineda, Krishna Bodduluri, Tingfan Wu, Pieter Abbeel, and Mustafa Mukadam. Geometric retargeting: A principled, ultrafast neural hand retargeting algorithm.arXiv preprint arXiv:2503.07541, 2025
arXiv 2025
-
[28]
UltraDexGrasp: Learning universal dexterous grasping for bimanual robots with synthetic data
Sizhe Yang, Yiman Xie, Zhixuan Liang, Yang Tian, Jia Zeng, Dahua Lin, and Jiangmiao Pang. UltraDexGrasp: Learning universal dexterous grasping for bimanual robots with synthetic data. arXiv preprint arXiv:2603.05312, 2026
arXiv 2026
-
[29]
Gen- DexGrasp: Generalizable dexterous grasping
Puhao Li, Tengyu Liu, Yuyang Li, Yiran Geng, Yixin Zhu, Yaodong Yang, and Siyuan Huang. Gen- DexGrasp: Generalizable dexterous grasping. InInternational Conference on Robotics and Automation (ICRA), pages 8068–8074, 2023
2023
-
[30]
DexGraspNet 2.0: Learning generative dexterous grasping in large-scale synthetic cluttered scenes
Jialiang Zhang, Haoran Liu, Danshi Li, Xinqiang Yu, Haoran Geng, Yufei Ding, Jiayi Chen, and He Wang. DexGraspNet 2.0: Learning generative dexterous grasping in large-scale synthetic cluttered scenes. InConference on Robot Learning (CoRL), 2024
2024
-
[31]
Dexterous grasp transformer
Guo-Hao Xu, Yi-Lin Wei, Dian Zheng, Xiao-Ming Wu, and Wei-Shi Zheng. Dexterous grasp transformer. InConference on Computer Vision and Pattern Recognition (CVPR), pages 17933–17942, 2024
2024
-
[32]
Hand-object contact consistency reasoning for human grasps generation
Hanwen Jiang, Shaowei Liu, Jiashun Wang, and Xiaolong Wang. Hand-object contact consistency reasoning for human grasps generation. InInternational Conference on Computer Vision (ICCV), pages 11107–11116, 2021. 19
2021
-
[33]
Quanzhou Li, Zhonghua Wu, Jingbo Wang, Chen Change Loy, and Bo Dai. Dhagrasp: Synthesizing affordance-aware dual-hand grasps with text instructions.arXiv preprint arXiv:2509.22175, 2025
arXiv 2025
-
[34]
Bimanual grasp synthesis for dexterous robot hands.IEEE Robotics and Automation Letters, 9(12):11377–11384, 2024
Yanming Shao and Chenxi Xiao. Bimanual grasp synthesis for dexterous robot hands.IEEE Robotics and Automation Letters, 9(12):11377–11384, 2024
2024
-
[35]
Mu Lin, Yi-Lin Wei, Jiaxuan Chen, Yuhao Lin, Shuoyu Chen, Jiangran Lyu, Jiayi Chen, Yansong Tang, He Wang, and Wei-Shi Zheng. Bidexgrasp: Coordinated bimanual dexterous grasps across object geometries and sizes.arXiv preprint arXiv:2604.06589, 2026
Pith/arXiv arXiv 2026
-
[36]
On computing three-finger force-closure grasps of 2-d and 3-d objects.IEEE T ransactions on Robotics and Automation, 19(1):155–161, 2003
Jia-Wei Li, Hong Liu, and He-Gao Cai. On computing three-finger force-closure grasps of 2-d and 3-d objects.IEEE T ransactions on Robotics and Automation, 19(1):155–161, 2003
2003
-
[37]
CRC press, 1994
Richard M Murray, Zexiang Li, and S Shankar Sastry.A mathematical introduction to robotic manipula- tion. CRC press, 1994
1994
-
[38]
Mayank Mittal, Pascal Roth, James Tigue, Antoine Richard, Octi Zhang, Peter Du, Antonio Serrano- Munoz, Xinjie Yao, René Zurbrügg, Nikita Rudin, et al. Isaac lab: A gpu-accelerated simulation framework for multi-modal robot learning.arXiv preprint arXiv:2511.04831, 2025
Pith/arXiv arXiv 2025
-
[39]
Balakumar Sundaralingam, Siva Kumar Sastry Hari, Adam Fishman, Caelan Garrett, Karl Van Wyk, Valts Blukis, Alexander Millane, Helen Oleynikova, Ankur Handa, Fabio Ramos, Nathan Ratliff, and Dieter Fox. cuRobo: Parallelized collision-free minimum-jerk robot motion generation.arXiv preprint arXiv:2310.17274, 2023
arXiv 2023
-
[40]
Chang, Leonidas J
Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, and Hao Su. SAPIEN: A simulated part-based interactive environment. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11097–11107, 2020
2020
-
[41]
Qi, Li Yi, Hao Su, and Leonidas J
Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas. PointNet++: Deep hierarchical feature learning on point sets in a metric space. InAdvances in Neural Information Processing Systems, 2017
2017
-
[42]
Denoising diffusion probabilistic models
Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. InAdvances in Neural Information Processing Systems (NeurIPS), volume 33, pages 6840–6851, 2020
2020
-
[43]
Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Daniel Cohen-Or, and Amit H. Bermano. Human motion diffusion model. InInternational Conference on Learning Representations (ICLR), 2023
2023
-
[44]
Executing your commands via motion diffusion in latent space
Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Jingyi Yu, and Gang Yu. Executing your commands via motion diffusion in latent space. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 18000–18010, 2023
2023
-
[45]
Yifan Han, Zhongxi Chen, Yuxuan Zhao, Congsheng Xu, Yanming Shao, Yichuan Peng, Yao Mu, and Wenzhao Lian. DexHiL: A human-in-the-loop framework for vision-language-action model post-training in dexterous manipulation.arXiv preprint arXiv:2603.09121, 2026
arXiv 2026
-
[46]
Learning to transfer human hand skills for robot manipulations.arXiv preprint arXiv:2501.04169, 2025
Sungjae Park, Seungho Lee, Mingi Choi, Jiye Lee, Jeonghwan Kim, Jisoo Kim, and Hanbyul Joo. Learning to transfer human hand skills for robot manipulations.arXiv preprint arXiv:2501.04169, 2025
arXiv 2025
-
[47]
A system for general in-hand object re-orientation
Tao Chen, Jie Xu, and Pulkit Agrawal. A system for general in-hand object re-orientation. In Conference on Robot Learning (CoRL), pages 297–307, 2022
2022
-
[48]
Rotating without seeing: Towards in-hand dexterity through touch
Zhao-Heng Yin, Binghao Huang, Yuzhe Qin, Qifeng Chen, and Xiaolong Wang. Rotating without seeing: Towards in-hand dexterity through touch. InRobotics: Science and Systems (RSS), 2023
2023
-
[49]
Towards human-level bimanual dexterous manipulation with reinforcement learning
Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuan Jiang, Zongqing Lu, Stephen McAleer, Hao Dong, Song-Chun Zhu, and Yaodong Yang. Towards human-level bimanual dexterous manipulation with reinforcement learning. InAdvances in Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks T rack, 2022
2022
-
[50]
Dexart: Benchmarking generalizable dexterous manipulation with articulated objects
Chen Bao, Helin Xu, Yuzhe Qin, and Xiaolong Wang. Dexart: Benchmarking generalizable dexterous manipulation with articulated objects. InConference on Computer Vision and Pattern Recognition (CVPR), pages 21190–21200, 2023. 20
2023
-
[51]
CyberDemo: Augmenting simulated human demonstration for real-world dexterous manipulation
Jun Wang, Yuzhe Qin, Kaiming Kuang, Yigit Korkmaz, Akhilan Gurumoorthy, Hao Su, and Xiao- long Wang. CyberDemo: Augmenting simulated human demonstration for real-world dexterous manipulation. InConference on Computer Vision and Pattern Recognition (CVPR), 2024
2024
-
[52]
Aravind Rajeswaran, Vikash Kumar, Abhishek Gupta, Giulia Vezzani, John Schulman, Emanuel Todorov, and Sergey Levine. Learning complex dexterous manipulation with deep reinforcement learning and demonstrations. InProceedings of Robotics: Science and Systems (RSS), 2018. doi: 10.15607/RSS.2018.XIV .049
-
[53]
What matters in learning from offline human demonstrations for robot manipulation
Ajay Mandlekar, Danfei Xu, Josiah Wong, Soroush Nasiriany, Chen Wang, Rohun Kulkarni, Li Fei- Fei, Silvio Savarese, Yuke Zhu, and Roberto Martín-Martín. What matters in learning from offline human demonstrations for robot manipulation. InConference on Robot Learning (CoRL), 2021
2021
-
[54]
Zhao, Vikash Kumar, Sergey Levine, and Chelsea Finn
Tony Z. Zhao, Vikash Kumar, Sergey Levine, and Chelsea Finn. Learning fine-grained bimanual manipulation with low-cost hardware. InRobotics: Science and Systems (RSS), 2023
2023
-
[55]
Diffusion policy: Visuomotor policy learning via action diffusion
Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, and Shuran Song. Diffusion policy: Visuomotor policy learning via action diffusion. volume 44, pages 1684–1704. Sage Publications Sage UK: London, England, 2025
2025
-
[56]
Chain-of-action: Trajectory autoregressive modeling for robotic manipulation
Wenbo Zhang, Tianrun Hu, Yanyuan Qiao, Hanbo Zhang, Yuchu Qin, Yang Li, Jiajun Liu, Tao Kong, Lingqiao Liu, and Xiao Ma. Chain-of-action: Trajectory autoregressive modeling for robotic manipulation. InAdvances in Neural Information Processing Systems (NeurIPS), 2025
2025
-
[57]
Dextrack: Towards generaliz- able neural tracking control for dexterous manipulation from human references
Xueyi Liu, Jianibieke Adalibieke, Qianwei Han, Yuzhe Qin, and Li Yi. Dextrack: Towards generaliz- able neural tracking control for dexterous manipulation from human references. InInternational Conference on Learning Representations (ICLR), 2025
2025
-
[58]
Bohan Zhou, Haoqi Yuan, Yuhui Fu, and Zongqing Lu. Learning diverse bimanual dexterous manipulation skills from human demonstrations.arXiv preprint arXiv:2410.02477, 2024
arXiv 2024
-
[59]
UniDex- Grasp++: Improving dexterous grasping policy learning via geometry-aware curriculum and iterative generalist-specialist learning
Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, and He Wang. UniDex- Grasp++: Improving dexterous grasping policy learning via geometry-aware curriculum and iterative generalist-specialist learning. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 3891–3902, 2023
2023
-
[60]
6-dof graspnet: Variational grasp generation for object manipulation
Arsalan Mousavian, Clemens Eppner, and Dieter Fox. 6-dof graspnet: Variational grasp generation for object manipulation. InInternational Conference on Computer Vision (ICCV), pages 2901–2910, 2019
2019
-
[61]
GANHand: Predicting human grasp affordances in multi-object scenes
Enric Corona, Albert Pumarola, Guillem Alenyà, Francesc Moreno-Noguer, and Gregory Rogez. GANHand: Predicting human grasp affordances in multi-object scenes. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5031–5041, 2020
2020
-
[62]
Black, Krikamol Muandet, and Siyu Tang
Korrawe Karunratanakul, Jinlong Yang, Yan Zhang, Michael J. Black, Krikamol Muandet, and Siyu Tang. Grasping field: Learning implicit representations for human grasps. InInternational Conference on 3D Vision (3DV), pages 333–344, 2020
2020
-
[63]
Twigg, Minh Vo, Samarth Brahmbhatt, and Charles C
Patrick Grady, Chengcheng Tang, Christopher D. Twigg, Minh Vo, Samarth Brahmbhatt, and Charles C. Kemp. ContactOpt: Optimizing contact to improve grasps. InConference on Computer Vision and Pattern Recognition (CVPR), pages 1471–1481, 2021
2021
-
[64]
Grasping a handful: Sequential multi-object dexterous grasp generation.IEEE Robotics and Automation Letters, 2025
Haofei Lu, Yifei Dong, Zehang Weng, Florian Pokorny, Jens Lundell, and Danica Kragic. Grasping a handful: Sequential multi-object dexterous grasp generation.IEEE Robotics and Automation Letters, 2025
2025
-
[65]
Qwen2.5-VL technical report.arXiv preprint arXiv:2502.13923, 2025
Shuai Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, Sibo Song, Kai Dang, Peng Wang, Shijie Wang, Jun Tang, et al. Qwen2.5-VL technical report.arXiv preprint arXiv:2502.13923, 2025
Pith/arXiv arXiv 2025
-
[66]
Zhiyu Wu, Xiaokang Chen, Zizheng Pan, Xingchao Liu, Wen Liu, Damai Dai, Huazuo Gao, Yiyang Ma, Chengyue Wu, Bingxuan Wang, et al. DeepSeek-VL2: Mixture-of-experts vision-language models for advanced multimodal understanding.arXiv preprint arXiv:2412.10302, 2024. 21 Algorithm 1SynManDex as proposal, refinement, and executable filtering. Require:Object mesh...
Pith/arXiv arXiv 2024
-
[67]
Human grasp priors are therefore used as initialization rather than as executable labels
(14) Here h0 is a generated MANO pre-grasp, Rψ maps it to a robot seed, and the final optimization is performed in robot configuration space. Human grasp priors are therefore used as initialization rather than as executable labels. Compared with random initialization, a human-prior seed starts in a functional region of Qbi, after which contact, collision,...
2048
-
[68]
Use the grasp keyframe as the initial possession state
-
[69]
Assign explicit roles to the left and right hands
-
[70]
Propose only allowed primitives from the primitive library
-
[71]
Express motion as object-relative waypoints or bounded deltas, not as joint torques or raw robot commands
-
[72]
Preserve possession unless the release condition is explicitly satisfied
-
[73]
The executor will check IK, collision, possession, force-closure, and terminal task success
Do not assume feasibility. The executor will check IK, collision, possession, force-closure, and terminal task success
-
[74]
keyframe_id
If a task would require unmodeled fluid, buttons, articulation, or tactile sensing, phrase the goal as a geometric proxy, e.g., ’tilt the teapot by 35 degrees while maintaining possession’ rather than ’pour liquid’. K.4 User Prompt Template You are given one SynManDex validated grasp keyframe. [VISUAL INPUT] - Multi-view images: front, left, right, top, w...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.