pith. sign in

arxiv: 2606.07389 · v1 · pith:Z3I5VI3Wnew · submitted 2026-06-05 · 💻 cs.RO

Simulation-Driven Imitation Learning for Biosignals-Free Shared-Autonomy Prosthetic Grasping

Pith reviewed 2026-06-27 21:29 UTC · model grok-4.3

classification 💻 cs.RO
keywords prosthetic graspingimitation learningsim-to-real transfershared autonomybiosignals-free controlreach-to-graspsimulation frameworkupper-limb prosthetics
0
0 comments X

The pith

Simulation generates diverse demonstrations that let imitation learning policies for biosignals-free prosthetic grasping reach over 90 percent success after real-world transfer.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to show that a simulation framework can automatically create large numbers of varied reach-to-grasp demonstrations from a virtual wrist camera, replacing the need for costly real human data collection. These demonstrations train imitation learning policies that transfer to actual prosthetic hands. If the claim holds, shared-autonomy control becomes more practical because training scales without depending on physiological signals or extensive user recordings. A sympathetic reader would care because the work targets the data bottleneck that currently limits natural, low-effort prosthetic manipulation.

Core claim

The authors present a simulation framework that combines physically feasible grasp synthesis, retargeted natural reaching trajectories, and reach-grasp-lift execution inside procedurally generated indoor scenes. Wrist-view images, proprioception, and actions are recorded to form a large demonstration dataset. Policies trained on this data achieve over 90 percent grasp success in three realistic settings, exceed baseline imitation learning methods, and show improved generalization to new objects and scenes.

What carries the argument

The simulation framework that automatically produces diverse reach-to-grasp demonstrations from a wrist-mounted virtual camera using grasp synthesis and trajectory retargeting.

If this is right

  • Prosthetic grasping policies can be trained at scale without collecting large volumes of real-world human demonstrations.
  • The resulting policies achieve high success rates when transferred to physical upper-limb prosthetics in realistic conditions.
  • Imitation learning methods gain stronger object and scene generalization from the consistent simulated data than from limited real demonstrations.
  • Biosignals-free shared-autonomy control becomes feasible for reach-to-grasp tasks using only wrist-view observations and proprioception.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same simulation approach might lower the cost of adapting control policies to different prosthetic hardware designs.
  • Similar automated demonstration generation could support training for other manipulation tasks where real data is difficult to obtain.
  • Longer-term testing with users who have varying levels of amputation could clarify how well the policies feel natural in daily use.

Load-bearing premise

The simulated demonstrations are rich enough and consistent enough for imitation learning policies to transfer to real prosthetic hands without large performance losses from domain differences.

What would settle it

Deploying the trained policy on a physical prosthetic hand and measuring grasp success rates well below 90 percent across varied real objects and indoor scenes would show the simulation data is insufficient.

Figures

Figures reproduced from arXiv: 2606.07389 by Hanli Zhao, Huiling Chen, Kaijie Shi, Ting Zou, Vinicius Prado da Fonseca, Wanglong Lu, Xianta Jiang.

Figure 1
Figure 1. Figure 1: Overview of the proposed simulated data collection framework. Given an object and a hand model, we first synthesize feasible grasps (a). A human wrist trajectory for reaching and grasping (b) is then sampled and retargeted to the synthesize grasp before being executed in the simulation environment (c). During this process, wrist-view images, proprioceptive measurements and action targets are recorded to bu… view at source ↗
Figure 2
Figure 2. Figure 2: 200 objects used in this paper. sequence in simulation (Section III-D), while recording wrist￾camera images, proprioception, and joint actions to construct the dataset for training autonomous grasping models. A. Objects and Prosthetic hand Objects. As shown in [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Wide-angle corner views of ten indoor rooms were generated with Infinigen Indoors [59] and used in our experiments. to other multi-DOF prosthetic hands with minimal changes, primarily in action-space definition, kinematic retargeting, and sensor calibration, provided that a suitable 3D model of the target hand is available for simulation. System Identification. We calibrate joint dynamics in simulation via… view at source ↗
Figure 4
Figure 4. Figure 4: Generalization performance for models trained on diverse number of objects, and rooms. ’SR’ means success rate, ’CR’ means close before lifting rate, and ’OR’ means open during reaching rate. (42.40% vs. 44.20%), while VTM-VAE maintains markedly higher CR (94.50% vs. 85.96%) and similar OR (92.70% vs. 93.90%). In contrast, the prosthetic-specific diffusion baseline HannesImitation underperforms across all … view at source ↗
Figure 5
Figure 5. Figure 5: Objects, wearable prosthetic hand, and three scenes used in the realistic-setting experiments. A. Objects, prosthetic hand, and scenes Objects [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗
read the original abstract

Biosignals-free shared-autonomy control of upper-limb prosthetic hands aims to enable natural and low-effort manipulation without relying on EMG or other physiological signals. Recent imitation-learning-based approaches have shown promising results, but their scalability is limited by the cost and variability of collecting large amounts of real-world human demonstration data. In this work, we present a scalable simulation framework that automatically generates diverse reach-to-grasp demonstrations from a wrist-mounted virtual camera. The framework combines physically feasible grasp synthesis, natural reaching trajectories retargeting, and reach--grasp--lift execution in procedurally generated indoor environments. It records wrist-view observations, proprioception, and actions to build a large-scale demonstration dataset for imitation learning. Through extensive simulation benchmarks, we evaluate object and scene generalization and compare several representative state-of-the-art imitation learning methods. Results show that the simulated demonstrations are sufficiently rich and consistent for effective policy learning. In three realistic settings, the learned sim-to-real policy achieves over 90\% grasp success, surpasses baseline methods, and exhibits stronger generalization, highlighting the promise of simulation-driven training for biosignals-free shared-autonomy prosthetic grasping. The demonstrations are available at \href{https://sites.google.com/view/sim-prosthetic-grasp/home}{https://sites.google.com/view/sim-prosthetic-grasp/home}.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces a scalable simulation framework for automatically generating diverse reach-to-grasp demonstrations using physically feasible grasp synthesis, natural trajectory retargeting, and procedural indoor environments with wrist-mounted camera observations. These demonstrations train imitation learning policies for biosignals-free shared-autonomy prosthetic grasping. Simulation benchmarks compare state-of-the-art IL methods on object and scene generalization, and real-world experiments in three settings report over 90% grasp success for the sim-to-real policy, outperforming baselines with stronger generalization. Demonstrations are made publicly available.

Significance. If the reported sim-to-real transfer holds, the work provides a practical path to scale imitation learning for prosthetic control by replacing costly real-world human demonstrations with procedurally generated simulation data. This could lower barriers to developing natural, low-effort shared-autonomy systems. The public release of the demonstration dataset supports reproducibility and further research in the area.

major comments (2)
  1. [Abstract, §4] Abstract and §4 (Results): The central claim that simulated demonstrations are 'sufficiently rich and consistent' for >90% real-world grasp success and superior generalization rests on the three realistic settings; however, without explicit quantification of domain gap (e.g., distribution of object poses, lighting variance, or wrist-camera calibration differences between sim and real), it is difficult to determine whether the success generalizes beyond the tested cases or reflects post-hoc environment selection.
  2. [§3, §5] §3 (Framework) and §5 (Experiments): The pipeline combines grasp synthesis, retargeting, and IL training, but the manuscript does not report an ablation isolating the contribution of procedural environment generation versus grasp synthesis alone; this leaves open whether the reported gains over baselines are driven primarily by data volume or by the specific procedural diversity.
minor comments (2)
  1. [Abstract] The abstract states results for 'several representative state-of-the-art imitation learning methods' but does not name them; adding the specific algorithms (e.g., BC, GAIL, or others) in the abstract would improve clarity.
  2. [§4] Figure captions and §4 should explicitly state the number of trials per setting and whether success is measured over consecutive lifts or single grasps to allow direct comparison with prior prosthetic grasping literature.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and recommendation for minor revision. We address each major comment below and will update the manuscript accordingly.

read point-by-point responses
  1. Referee: [Abstract, §4] Abstract and §4 (Results): The central claim that simulated demonstrations are 'sufficiently rich and consistent' for >90% real-world grasp success and superior generalization rests on the three realistic settings; however, without explicit quantification of domain gap (e.g., distribution of object poses, lighting variance, or wrist-camera calibration differences between sim and real), it is difficult to determine whether the success generalizes beyond the tested cases or reflects post-hoc environment selection.

    Authors: We agree that explicit quantification of the domain gap would provide stronger support for the generalization claims. The three settings were selected as representative indoor scenarios with varied objects and layouts rather than post-hoc choices, and the consistent >90% success rates across them indicate effective sim-to-real transfer. In the revised manuscript, we will add quantitative analysis of key domain-gap factors, including statistics on object pose distributions, lighting variance, and wrist-camera calibration differences between simulation and real setups. revision: yes

  2. Referee: [§3, §5] §3 (Framework) and §5 (Experiments): The pipeline combines grasp synthesis, retargeting, and IL training, but the manuscript does not report an ablation isolating the contribution of procedural environment generation versus grasp synthesis alone; this leaves open whether the reported gains over baselines are driven primarily by data volume or by the specific procedural diversity.

    Authors: We concur that an ablation isolating the role of procedural environment generation would clarify the source of performance gains. The reported results evaluate the integrated pipeline against baselines but do not separate the contributions of procedural diversity from grasp synthesis. In the revision, we will add an ablation comparing policies trained on grasp-synthesis-only demonstrations versus the full procedurally generated dataset to quantify the impact of the procedural component. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is an empirical simulation study that generates demonstration data procedurally and evaluates imitation learning policies via benchmarks and real-world transfer experiments. No equations, derivations, or parameter-fitting steps are present in the abstract or described pipeline, so no claimed prediction reduces to its inputs by construction. The central success claims rest on experimental outcomes rather than self-referential definitions or self-citation chains that would force the result.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only access prevents identification of specific free parameters, axioms, or invented entities; no mathematical derivations or modeling assumptions are detailed.

pith-pipeline@v0.9.1-grok · 5788 in / 992 out tokens · 15381 ms · 2026-06-27T21:29:18.726406+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

64 extracted references · 7 canonical work pages · 3 internal anchors

  1. [1]

    Proprioceptive sonomyographic control: A novel method for intuitive and proportional control of multiple degrees-of-freedom for individuals with upper extremity limb loss,

    A. S. Dhawan, B. Mukherjee, S. Patwardhan, N. Akhlaghi, G. Diao, G. Levay, R. Holley, W. M. Joiner, M. Harris-Love, and S. Sikdar, “Proprioceptive sonomyographic control: A novel method for intuitive and proportional control of multiple degrees-of-freedom for individuals with upper extremity limb loss,”Scientific reports, vol. 9, no. 1, p. 9499, 2019

  2. [2]

    The extraction of neural information from the surface emg for the control of upper-limb prostheses: emerging avenues and challenges,

    D. Farina, N. Jiang, H. Rehbaum, A. Holobar, B. Graimann, H. Dietl, and O. C. Aszmann, “The extraction of neural information from the surface emg for the control of upper-limb prostheses: emerging avenues and challenges,”IEEE transactions on neural systems and rehabilitation engineering, vol. 22, no. 4, pp. 797–809, 2014

  3. [3]

    Online myoelectric control of a dexterous hand prosthesis by transradial amputees,

    C. Cipriani, C. Antfolk, M. Controzzi, G. Lundborg, B. Ros ´en, M. C. Carrozza, and F. Sebelius, “Online myoelectric control of a dexterous hand prosthesis by transradial amputees,”IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 19, no. 3, pp. 260–270, 2011

  4. [4]

    Prosthetic myoelectric control strategies: a clinical perspective,

    A. D. Roche, H. Rehbaum, D. Farina, and O. C. Aszmann, “Prosthetic myoelectric control strategies: a clinical perspective,”Current Surgery Reports, vol. 2, pp. 1–11, 2014. 10

  5. [5]

    Musclenet: mapping electromyography to kinematic and dynamic biomechanical variables by machine learning,

    A. Nasr, S. Bell, J. He, R. L. Whittaker, N. Jiang, C. R. Dickerson, and J. McPhee, “Musclenet: mapping electromyography to kinematic and dynamic biomechanical variables by machine learning,”Journal of Neural Engineering, vol. 18, no. 4, p. 0460d3, 2021

  6. [6]

    Evaluation of the forearm emg signal features for the control of a prosthetic hand,

    R. Boostani and M. H. Moradi, “Evaluation of the forearm emg signal features for the control of a prosthetic hand,”Physiological Measurement, vol. 24, no. 2, p. 309, 2003

  7. [7]

    A method for the control of multigrasp myoelectric prosthetic hands,

    S. A. Dalley, H. A. Varol, and M. Goldfarb, “A method for the control of multigrasp myoelectric prosthetic hands,”IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 20, no. 1, pp. 58–67, 2011

  8. [8]

    The optimal controller delay for myoelec- tric prostheses,

    T. R. Farrell and R. F. Weir, “The optimal controller delay for myoelec- tric prostheses,”IEEE Transactions on neural systems and rehabilitation engineering, vol. 15, no. 1, pp. 111–118, 2007

  9. [9]

    Effect of muscle fatigue on surface electromyography-based hand grasp force estimation,

    J. Wang, M. Pang, P. Yu, B. Tang, K. Xiang, and Z. Ju, “Effect of muscle fatigue on surface electromyography-based hand grasp force estimation,” Applied Bionics and Biomechanics, vol. 2021, no. 1, p. 8817480, 2021

  10. [10]

    Simultaneous semg recognition of gestures and force levels for interaction with prosthetic hand,

    B. Fang, C. Wang, F. Sun, Z. Chen, J. Shan, H. Liu, W. Ding, and W. Liang, “Simultaneous semg recognition of gestures and force levels for interaction with prosthetic hand,”IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 30, pp. 2426–2436, 2022

  11. [11]

    Cognitive vision system for control of dexterous prosthetic hands: experimental evaluation,

    S. Do ˇsen, C. Cipriani, M. Kosti ´c, M. Controzzi, M. C. Carrozza, and D. B. Popovi ´c, “Cognitive vision system for control of dexterous prosthetic hands: experimental evaluation,”Journal of Neuroengineering and Rehabilitation, vol. 7, pp. 1–14, 2010

  12. [12]

    Vision-based assistance for myoelectric hand control,

    Y . He, R. Kubozono, O. Fukuda, N. Yamaguchi, and H. Okumura, “Vision-based assistance for myoelectric hand control,”IEEE Access, vol. 8, pp. 201 956–201 965, 2020

  13. [13]

    Toward biosignals-free autonomous prosthetic hand control via imita- tion learning,

    K. Shi, W. Lu, H. Zhao, V . Prado da Fonseca, T. Zou, and X. Jiang, “Toward biosignals-free autonomous prosthetic hand control via imita- tion learning,”IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 33, pp. 3544–3554, 2025

  14. [14]

    Hannesimitation: Grasping with the hannes prosthetic hand via imita- tion learning,

    C. Alessi, F. Vasile, F. Ceola, G. Pasquale, N. Boccardo, and L. Natale, “Hannesimitation: Grasping with the hannes prosthetic hand via imita- tion learning,” in2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2025, pp. 10 085–10 092

  15. [15]

    Robot learning in homes: Improving generalization and reducing dataset bias,

    A. Gupta, A. Murali, D. P. Gandhi, and L. Pinto, “Robot learning in homes: Improving generalization and reducing dataset bias,”Advances in neural information processing systems, vol. 31, 2018

  16. [16]

    Domain randomization for transferring deep neural networks from simulation to the real world,

    J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, “Domain randomization for transferring deep neural networks from simulation to the real world,” in2017 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, 2017, pp. 23–30

  17. [17]

    Vision-based manipulators need to also see from their hands

    K. Hsu, M. J. Kim, R. Rafailov, J. Wu, and C. Finn, “Vision-based manipulators need to also see from their hands.” inICLR, 2022

  18. [18]

    OpenVLA: An Open-Source Vision-Language-Action Model

    M. Kim, K. Pertsch, S. Karamcheti, T. Xiao, A. Balakrishna, S. Nair, R. Rafailov, E. Foster, G. Lam, P. Sanketi, Q. Vuong, T. Kollar, B. Burch- fiel, R. Tedrake, D. Sadigh, S. Levine, P. Liang, and C. Finn, “Open- vla: An open-source vision-language-action model,”arXiv preprint arXiv:2406.09246, 2024

  19. [19]

    Virtual reality for evaluating prosthetic hand control strategies: A preliminary report,

    J. Xie and X. Hu, “Virtual reality for evaluating prosthetic hand control strategies: A preliminary report,” in2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE, 2021, pp. 6263–6266

  20. [20]

    A low-cost real-time research platform for emg pattern recognition-based prosthetic hand,

    P. Geethanjali and K. Ray, “A low-cost real-time research platform for emg pattern recognition-based prosthetic hand,”IEEE/ASME Transac- tions on Mechatronics, vol. 20, no. 4, pp. 1948–1955, 2014

  21. [21]

    A classification method for myoelectric control of hand prostheses inspired by muscle coordination,

    G. K. Patel, C. Castellini, J. M. Hahne, D. Farina, and S. Dosen, “A classification method for myoelectric control of hand prostheses inspired by muscle coordination,”IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 26, no. 9, pp. 1745–1755, 2018

  22. [22]

    Concurrent adaptation of human and machine improves simultaneous and proportional myoelectric control,

    J. M. Hahne, S. D ¨ahne, H.-J. Hwang, K.-R. M ¨uller, and L. C. Parra, “Concurrent adaptation of human and machine improves simultaneous and proportional myoelectric control,”IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 23, no. 4, pp. 618–627, 2015

  23. [23]

    Myoelectric control of artificial limbs—is there a need to change focus?[in the spotlight],

    N. Jiang, S. Dosen, K.-R. Muller, and D. Farina, “Myoelectric control of artificial limbs—is there a need to change focus?[in the spotlight],” IEEE Signal Processing Magazine, vol. 29, no. 5, pp. 152–150, 2012

  24. [24]

    In- tuitive, online, simultaneous, and proportional myoelectric control over two degrees-of-freedom in upper limb amputees,

    N. Jiang, H. Rehbaum, I. Vujaklija, B. Graimann, and D. Farina, “In- tuitive, online, simultaneous, and proportional myoelectric control over two degrees-of-freedom in upper limb amputees,”IEEE transactions on neural systems and rehabilitation engineering, vol. 22, no. 3, pp. 501– 510, 2013

  25. [25]

    A powered prosthetic hand with vision system for enhancing the anthropo- pathic grasp,

    Y . Xu, X. Wang, J. Li, X. Zhang, F. Li, Q. Gao, C. Fu, and Y . Leng, “A powered prosthetic hand with vision system for enhancing the anthropo- pathic grasp,”IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2025

  26. [26]

    Exploiting arm posture synergies in activities of daily living to control the wrist rotation in upper limb prostheses: A feasibility study,

    F. Montagnani, M. Controzzi, and C. Cipriani, “Exploiting arm posture synergies in activities of daily living to control the wrist rotation in upper limb prostheses: A feasibility study,” in2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE, 2015, pp. 2462–2465

  27. [27]

    The synergy complement control approach for seamless limb-driven prostheses,

    J. K ¨uhn, T. Hu, A. T ¨odtheide, E. Pozo Fortuni ´c, E. Jensen, and S. Haddadin, “The synergy complement control approach for seamless limb-driven prostheses,”Nature Machine Intelligence, vol. 6, no. 4, pp. 481–492, 2024

  28. [28]

    Assessment of an automatic prosthetic elbow control strategy using residual limb motion for transhumeral amputated individuals with socket or osseointegrated prostheses,

    M. Merad, E. De Montalivet, M. Legrand, E. Mastinu, M. Ortiz-Catalan, A. Touillet, N. Martinet, J. Paysant, A. Roby-Brami, and N. Jarrasse, “Assessment of an automatic prosthetic elbow control strategy using residual limb motion for transhumeral amputated individuals with socket or osseointegrated prostheses,”IEEE Transactions on Medical Robotics and Bion...

  29. [29]

    Imu-based wrist rotation control of a transradial myoelectric prosthesis,

    D. A. Bennett and M. Goldfarb, “Imu-based wrist rotation control of a transradial myoelectric prosthesis,”IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 26, no. 2, pp. 419–427, 2017

  30. [30]

    Dynamic switching and real-time machine learning for im- proved human control of assistive biomedical robots,

    P. M. Pilarski, M. R. Dawson, T. Degris, J. P. Carey, and R. S. Sutton, “Dynamic switching and real-time machine learning for im- proved human control of assistive biomedical robots,” in2012 4th IEEE RAS & EMBS International Conference on Biomedical Robotics and Biomechatronics. IEEE, 2012, pp. 296–302

  31. [31]

    Explorations of autonomous prosthetic grasping via proximity vision and deep learning,

    E. Mastinu, A. Coletti, J. van den Berg, and C. Cipriani, “Explorations of autonomous prosthetic grasping via proximity vision and deep learning,” IEEE Transactions on Medical Robotics and Bionics, 2024

  32. [32]

    Proximity perception-based grasping intel- ligence: toward the seamless control of a dexterous prosthetic hand,

    S.-H. Heo and H.-S. Park, “Proximity perception-based grasping intel- ligence: toward the seamless control of a dexterous prosthetic hand,” IEEE/ASME Transactions on Mechatronics, vol. 29, no. 3, pp. 2079– 2090, 2023

  33. [33]

    Graspnet-1billion: A large- scale benchmark for general object grasping,

    H.-S. Fang, C. Wang, M. Gou, and C. Lu, “Graspnet-1billion: A large- scale benchmark for general object grasping,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 11 444–11 453

  34. [34]

    Bodex: Scalable and efficient robotic dexterous grasp synthesis using bilevel optimization,

    J. Chen, Y . Ke, and H. Wang, “Bodex: Scalable and efficient robotic dexterous grasp synthesis using bilevel optimization,” in2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 01–08

  35. [35]

    Dexgraspnet: A large-scale robotic dexterous grasp dataset for general objects based on simulation,

    R. Wang, J. Zhang, J. Chen, Y . Xu, P. Li, T. Liu, and H. Wang, “Dexgraspnet: A large-scale robotic dexterous grasp dataset for general objects based on simulation,”arXiv preprint arXiv:2210.02697, 2022

  36. [36]

    Struc- tured local feature-conditioned 6-dof variational grasp detection network in cluttered scenes,

    H. Liu, H. Li, C. Jiang, S. Xue, Y . Zhao, X. Huang, and Z. Jiang, “Struc- tured local feature-conditioned 6-dof variational grasp detection network in cluttered scenes,”IEEE/ASME Transactions on Mechatronics, 2024

  37. [37]

    Trustworthy robotic grasp- ing: A credibility alignment framework via self-regulation encoding,

    H. Yu, X. Zhang, Z. Zhao, and C. He, “Trustworthy robotic grasp- ing: A credibility alignment framework via self-regulation encoding,” IEEE/ASME Transactions on Mechatronics, 2025

  38. [38]

    Anthropomorphic grasp motion planning for humanoid robots via learned riemannian metric and dextrous grasp evaluator,

    W. Xu, Z. Geng, X. Shi, W. Guo, and X. Sheng, “Anthropomorphic grasp motion planning for humanoid robots via learned riemannian metric and dextrous grasp evaluator,”IEEE/ASME Transactions on Mechatronics, 2025

  39. [39]

    Toward collision-aware robotic fragile fruit grasping: A sim-to-real framework for perception, reasoning, and execution,

    Q. Wang, K. Bai, L. Zhang, Q. Li, A. Knoll, J. Zhang, Y . Ying, and M. Zhou, “Toward collision-aware robotic fragile fruit grasping: A sim-to-real framework for perception, reasoning, and execution,” IEEE/ASME Transactions on Mechatronics, 2025

  40. [40]

    Dexgraspnet 2.0: Learning generative dexterous grasping in large-scale synthetic cluttered scenes,

    J. Zhang, H. Liu, D. Li, X. Yu, H. Geng, Y . Ding, J. Chen, and H. Wang, “Dexgraspnet 2.0: Learning generative dexterous grasping in large-scale synthetic cluttered scenes,” in8th Annual Conference on Robot Learning, 2024

  41. [41]

    Domain randomization and generative models for robotic grasping,

    J. Tobin, L. Biewald, R. Duan, M. Andrychowicz, A. Handa, V . Ku- mar, B. McGrew, A. Ray, J. Schneider, P. Welinderet al., “Domain randomization and generative models for robotic grasping,” in2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018, pp. 3482–3489

  42. [42]

    Domain randomization for sim2real transfer of automatically generated grasping datasets,

    J. Huber, F. H ´el´enon, H. Watrelot, F. B. Amar, and S. Doncieux, “Domain randomization for sim2real transfer of automatically generated grasping datasets,” in2024 IEEE international conference on robotics and automation (ICRA). IEEE, 2024, pp. 4112–4118

  43. [43]

    Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware

    T. Z. Zhao, V . Kumar, S. Levine, and C. Finn, “Learning fine- grained bimanual manipulation with low-cost hardware,”arXiv preprint arXiv:2304.13705, 2023

  44. [44]

    Human-in-the-loop task and motion planning for imitation learning,

    A. Mandlekar, C. R. Garrett, D. Xu, and D. Fox, “Human-in-the-loop task and motion planning for imitation learning,” inConference on Robot Learning. PMLR, 2023, pp. 3030–3060

  45. [45]

    Data-driven planning via imitation learning,

    S. Choudhury, M. Bhardwaj, S. Arora, A. Kapoor, G. Ranade, S. Scherer, and D. Dey, “Data-driven planning via imitation learning,”The Interna- 11 tional Journal of Robotics Research, vol. 37, no. 13-14, pp. 1632–1672, 2018

  46. [46]

    A survey of imitation learning: Algorithms, recent developments, and challenges,

    M. Zare, P. M. Kebria, A. Khosravi, and S. Nahavandi, “A survey of imitation learning: Algorithms, recent developments, and challenges,” IEEE Transactions on Cybernetics, vol. 54, no. 12, pp. 7173–7186, 2024

  47. [47]

    Diffusion policy: Visuomotor policy learning via ac- tion diffusion,

    C. Chi, Z. Xu, S. Feng, E. Cousineau, Y . Du, B. Burchfiel, R. Tedrake, and S. Song, “Diffusion policy: Visuomotor policy learning via ac- tion diffusion,”The International Journal of Robotics Research, p. 02783649241273668, 2023

  48. [48]

    Behavioral Cloning from Observation

    F. Torabi, G. Warnell, and P. Stone, “Behavioral cloning from observa- tion,”arXiv preprint arXiv:1805.01954, 2018

  49. [49]

    A reduction of imitation learning and structured prediction to no-regret online learning,

    S. Ross, G. Gordon, and D. Bagnell, “A reduction of imitation learning and structured prediction to no-regret online learning,” inProceedings of the fourteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, 2011, pp. 627–635

  50. [50]

    Hg-dagger: Interactive imitation learning with human experts,

    M. Kelly, C. Sidrane, K. Driggs-Campbell, and M. J. Kochenderfer, “Hg-dagger: Interactive imitation learning with human experts,” in2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 8077–8083

  51. [51]

    End-to-end training of deep visuomotor policies,

    S. Levine, C. Finn, T. Darrell, and P. Abbeel, “End-to-end training of deep visuomotor policies,”Journal of Machine Learning Research, vol. 17, no. 39, pp. 1–40, 2016

  52. [52]

    Sim-to-real via sim-to-sim: Data-efficient robotic grasping via randomized-to-canonical adaptation networks,

    S. James, P. Wohlhart, M. Kalakrishnan, D. Kalashnikov, A. Irpan, J. Ibarz, S. Levine, R. Hadsell, and K. Bousmalis, “Sim-to-real via sim-to-sim: Data-efficient robotic grasping via randomized-to-canonical adaptation networks,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 12 627–12 637

  53. [53]

    A real-to-sim-to-real approach to robotic manipulation with vlm-generated iterative keypoint rewards,

    S. Patel, X. Yin, W. Huang, S. Garg, H. Nayyeri, L. Fei-Fei, S. Lazebnik, and Y . Li, “A real-to-sim-to-real approach to robotic manipulation with vlm-generated iterative keypoint rewards,” in2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 8258–8266

  54. [54]

    Transferring policy of deep reinforcement learning from simulation to reality for robotics,

    H. Ju, R. Juan, R. Gomez, K. Nakamura, and G. Li, “Transferring policy of deep reinforcement learning from simulation to reality for robotics,” Nature Machine Intelligence, vol. 4, no. 12, pp. 1077–1087, 2022

  55. [55]

    Residual off-policy rl for finetuning behavior cloning policies,

    L. Ankile, Z. Jiang, R. Duan, G. Shi, P. Abbeel, and A. Nagabandi, “Residual off-policy rl for finetuning behavior cloning policies,”arXiv preprint arXiv:2509.19301, 2025

  56. [56]

    Google scanned objects: A high- quality dataset of 3d scanned household items,

    L. Downs, A. Francis, N. Koenig, B. Kinman, R. Hickman, K. Reymann, T. B. McHugh, and V . Vanhoucke, “Google scanned objects: A high- quality dataset of 3d scanned household items,” in2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022, pp. 2553–2560

  57. [57]

    Multi- grippergrasp: A dataset for robotic grasping from parallel jaw grippers to dexterous hands,

    L. F. Casas, N. Khargonkar, B. Prabhakaran, and Y . Xiang, “Multi- grippergrasp: A dataset for robotic grasping from parallel jaw grippers to dexterous hands,” in2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2024, pp. 2978–2984

  58. [58]

    The world’s first touch-sensing bionic hand,

    PSYONIC, “The world’s first touch-sensing bionic hand,” [Online]. Available: https://www.psyonic.io/, 2024, accessed: 2026-01-19

  59. [59]

    Infinigen indoors: Photorealistic indoor scenes using procedural generation,

    A. Raistrick, L. Mei, K. Kayan, D. Yan, Y . Zuo, B. Han, H. Wen, M. Parakh, S. Alexandropoulos, L. Lipsonet al., “Infinigen indoors: Photorealistic indoor scenes using procedural generation,” inProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 21 783–21 794

  60. [60]

    Bodex: Scalable and efficient robotic dexterous grasp synthesis using bilevel optimization,

    J. Chen, Y . Ke, and H. Wang, “Bodex: Scalable and efficient robotic dexterous grasp synthesis using bilevel optimization,”arXiv preprint arXiv:2412.16490, 2024

  61. [61]

    Bring your own grasp generator: Leveraging robot grasp generation for prosthetic grasping,

    G. Stracquadanio, F. Vasile, E. Maiettini, N. Boccardo, and L. Natale, “Bring your own grasp generator: Leveraging robot grasp generation for prosthetic grasping,”arXiv preprint arXiv:2503.00466, 2025

  62. [62]

    Contact- graspnet: Efficient 6-dof grasp generation in cluttered scenes,

    M. Sundermeyer, A. Mousavian, R. Triebel, and D. Fox, “Contact- graspnet: Efficient 6-dof grasp generation in cluttered scenes,” in2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 13 438–13 444

  63. [63]

    Learning score-based grasping primitive for human-assisting dexterous grasping,

    T. Wu, M. Wu, J. Zhang, Y . Gan, and H. Dong, “Learning score-based grasping primitive for human-assisting dexterous grasping,”Advances in Neural Information Processing Systems, vol. 36, pp. 22 132–22 150, 2023

  64. [64]

    Dexycb: A benchmark for capturing hand grasping of objects,

    Y .-W. Chao, W. Yang, Y . Xiang, P. Molchanov, A. Handa, J. Tremblay, Y . S. Narang, K. Van Wyk, U. Iqbal, S. Birchfieldet al., “Dexycb: A benchmark for capturing hand grasping of objects,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, pp. 9044–9053