EaDex: A Cross-Embodiment Dexterous Manipulation Framework from Low-Cost Demonstrations

Chengdong Wu; Qian Zhao; Xin Tong; Yang Yang; Yingtian Li

arxiv: 2606.03268 · v1 · pith:UUKC4XQVnew · submitted 2026-06-02 · 💻 cs.RO

Pith reviewed 2026-06-28 09:54 UTC · model grok-4.3

classification 💻 cs.RO

keywords dexterous manipulation · cross-embodiment learning · low-cost demonstrations · demonstration annealing · reinforcement learning · imitation learning · RGB-D capture · MANO hand model

The pith

EaDex learns dexterous manipulation across robot hands from single-camera human demos via contact-based annealing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes EaDex as a way to cut the high costs of data collection and training that have limited dexterous manipulation. Human motions are recorded with one RGB-D camera, then modeled, normalized, and retargeted to produce usable demonstrations for different robot hands. Training begins under demonstration guidance but shifts toward independent optimization once contact rewards accumulate through a dynamic annealing process. The method is shown to work on nine combinations of three hands and three object-opening tasks. A reader would care because cheaper data pipelines and smoother imitation-to-reinforcement transitions could make advanced robot skills more practical to develop.
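
As a rough illustration of what the "normalized" step could involve, the sketch below centers hand keypoints on the wrist and rescales them by a reference bone length, so that hands of different sizes land in a common frame before retargeting. This is an assumption for illustration only: the function name `normalize_keypoints`, the index arguments, and the choice of normalization are hypothetical and are not taken from the paper.

```python
import math


def normalize_keypoints(keypoints, wrist_idx=0, ref_idx=1):
    """Center 3D hand keypoints on the wrist and scale by a reference length.

    Hypothetical sketch: the paper does not specify its normalization.
    keypoints is a list of (x, y, z) tuples; wrist_idx and ref_idx are
    illustrative indices for the wrist and a reference joint.
    """
    wx, wy, wz = keypoints[wrist_idx]
    # Translate so the wrist sits at the origin.
    centered = [(x - wx, y - wy, z - wz) for x, y, z in keypoints]
    # Use the wrist-to-reference distance as the unit length.
    scale = math.dist((0.0, 0.0, 0.0), centered[ref_idx])
    if scale == 0.0:
        raise ValueError("degenerate hand: wrist and reference joint coincide")
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]


if __name__ == "__main__":
    raw = [(1.0, 1.0, 1.0), (1.0, 1.0, 3.0), (2.0, 1.0, 1.0)]
    print(normalize_keypoints(raw))
```

A scale-free representation of this kind is one plausible reason the same demonstration could be reused across robot hands of different sizes, though the paper's actual retargeting may work differently.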

Core claim

EaDex is a multi-embodiment dexterous manipulation learning framework. At the data level, it captures human hand motions using only a single RGB-D camera and constructs structured demonstration data through MANO-based hand modeling, data normalization, and motion retargeting. At the learning level, it introduces a contact-reward-based dynamic demonstration annealing mechanism that guides early-stage exploration under demonstration and gradually transitions to autonomous optimization as contact rewards accumulate. On three dexterous hands and three articulated object-opening tasks, covering nine cross-embodiment settings, it achieves a 55.3% relative improvement over the baseline without demonstration annealing.

What carries the argument

The contact-reward-based dynamic demonstration annealing mechanism, which starts training under demonstration guidance and reduces that guidance as contact rewards accumulate to enable autonomous optimization.
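
To make the mechanism concrete, here is a minimal sketch of one way such a schedule could look: a running average of the contact reward is tracked, and the demonstration weight falls linearly from 1 to a floor as that average moves between two thresholds. Everything here is an assumption, since the abstract gives no equation: the names `ContactProgress` and `demo_weight`, the moving-average form, and the `start`, `end`, `floor`, and `momentum` values are all hypothetical.

```python
class ContactProgress:
    """Exponential moving average of a contact reward clipped to [0, 1].

    Hypothetical tracker; the paper does not state how contact
    progress is accumulated.
    """

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.value = 0.0

    def update(self, contact_reward):
        clipped = min(1.0, max(0.0, contact_reward))
        self.value = self.momentum * self.value + (1.0 - self.momentum) * clipped
        return self.value


def demo_weight(contact_progress, start=0.2, end=0.8, floor=0.0):
    """Map contact progress to a demonstration weight in [floor, 1].

    The weight stays at 1 below `start`, decays linearly between
    `start` and `end`, and stays at `floor` above `end`.
    All thresholds are illustrative assumptions.
    """
    if end <= start:
        raise ValueError("end must exceed start")
    t = (contact_progress - start) / (end - start)
    t = min(1.0, max(0.0, t))
    return 1.0 - (1.0 - floor) * t


if __name__ == "__main__":
    tracker = ContactProgress(momentum=0.5)
    for episode_contact in [0.0, 0.4, 0.8, 1.0, 1.0]:
        progress = tracker.update(episode_contact)
        print(round(progress, 3), round(demo_weight(progress), 3))
```

Whether fixed thresholds like these transfer across hands without retuning is exactly the question the load-bearing premise raises.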

If this is right

Demonstration data can be generated rapidly from low-cost camera captures, shortening overall training time.
The same pipeline supports nine distinct cross-embodiment settings on three different dexterous hands and three object-opening tasks.
The annealing strategy yields a 55.3 percent relative performance gain compared with training that lacks it.
Both the low-cost data pipeline and the annealing approach are validated as effective for dexterous manipulation.
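
Because the headline number is a relative gain, it helps to keep the arithmetic explicit. The sketch below computes relative improvement as (treated - baseline) / baseline and contrasts it with the absolute difference. The success rates used are made-up placeholders, not values from the paper, which does not report the underlying rates in the text reviewed here.

```python
def relative_improvement(baseline, treated):
    """Return (treated - baseline) / baseline as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (treated - baseline) / baseline


if __name__ == "__main__":
    # Placeholder success rates, chosen only to illustrate the arithmetic.
    baseline_rate = 0.40
    annealed_rate = 0.60
    rel = relative_improvement(baseline_rate, annealed_rate)
    absolute = annealed_rate - baseline_rate
    print(f"relative gain: {rel:.1%}, absolute gain: {absolute:.2f}")
```

A relative gain of a given size can correspond to very different absolute gains depending on the baseline, which is why the baseline success rates matter for interpreting the 55.3 percent figure.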

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same low-cost capture and retargeting steps could be applied to non-opening manipulation skills without major redesign.
Contact-driven annealing might transfer to other imitation-plus-reinforcement setups where embodiment changes frequently.
Lower data costs could shorten the time from lab prototype to varied real-world dexterous deployments.
Testing the framework on tasks without clear contact events would reveal how far the reward signal generalizes.

Load-bearing premise

The contact reward definition supplies a reliable, unbiased signal for deciding when to reduce demonstration guidance without requiring per-embodiment tuning.

What would settle it

Training runs across the nine settings that remove the annealing schedule or alter the contact reward definition and produce no measurable gain over the non-annealed baseline.
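
The test described above amounts to a grid of runs over hands, tasks, and training variants, followed by a per-setting comparison. A minimal sketch of that bookkeeping follows. The hand and task names are placeholders because the reviewed text does not name them, and `build_runs` and `settings_without_gain` are hypothetical helpers rather than anything from the paper.

```python
from itertools import product

# Placeholder identifiers: the reviewed text does not name the hands or tasks.
HANDS = ["hand_A", "hand_B", "hand_C"]
TASKS = ["task_1", "task_2", "task_3"]
VARIANTS = ["annealing", "no_annealing"]


def build_runs(hands, tasks, variants, seeds):
    """Enumerate one run config per (hand, task, variant, seed)."""
    return [
        {"hand": h, "task": t, "variant": v, "seed": s}
        for h, t, v, s in product(hands, tasks, variants, seeds)
    ]


def settings_without_gain(mean_success, hands, tasks, min_gain=0.0):
    """Return (hand, task) settings where annealing gains at most min_gain.

    mean_success maps (hand, task, variant) to a mean success rate.
    """
    flagged = []
    for hand, task in product(hands, tasks):
        gain = (
            mean_success[(hand, task, "annealing")]
            - mean_success[(hand, task, "no_annealing")]
        )
        if gain <= min_gain:
            flagged.append((hand, task))
    return flagged


if __name__ == "__main__":
    runs = build_runs(HANDS, TASKS, VARIANTS, seeds=range(3))
    print(len(runs), "runs across", len(HANDS) * len(TASKS), "settings")
```

If most of the nine settings were flagged by a check like this, the annealing claim would be in trouble; if none were, the remaining question would be whether the gain survives changes to the contact reward definition.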

Figures

Figures reproduced from arXiv: 2606.03268 by Chengdong Wu, Qian Zhao, Xin Tong, Yang Yang, Yingtian Li.

**Figure 1.** Overview of EaDex. An end-to-end dexterous manipulation framework that captures human hand demonstrations using a single RGB-D camera and performs corresponding manipulation tasks across dexterous hands with different embodiments.

**Figure 2.** Hand Gesture Demonstration Reconstruction. EaDex enables precise replication of various hand gestures. Left: Keypoints of human hands captured by the RGB-D camera in the real environment. Middle: Hand models reconstructed in Isaac Gym from the captured keypoints. Right: Visualization using the ARCTIC format viewer after saving the data.

**Figure 3.** Dataset construction process. Predefined articulated object trajectories are first generated, after which the operator controls both hands to follow the object motion and produce feasible bimanual manipulation demonstrations under low-cost RGB-D capture. All interaction data are stored in ARCTIC format for downstream motion retargeting and policy learning.

**Figure 4.** PPO + Demo Annealing. The policy is optimized using PPO with task, imitation, behavior cloning, and contact rewards. During training, contact progress is monitored to trigger adaptive annealing of imitation-related rewards, gradually reducing demonstration dependence while preserving task and contact objectives.

**Figure 5.** Experimental Results for Cross-Embodiment Manipulation. Experiments are conducted on our custom dataset. Left: Task success rates across embodiments and tasks with demonstration annealing. Right: Ablation with and without annealing.
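
The reward structure named in the Figure 4 caption can be sketched as a weighted sum in which only the demonstration-dependent terms are scaled by an annealing coefficient. This is a sketch under stated assumptions: the caption lists the four reward terms but gives no weights or formula, so `RewardTerms`, `total_reward`, the unit default weights, and the decision to treat both the imitation and behavior cloning terms as "imitation-related" are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RewardTerms:
    """Per-step reward components named in the Figure 4 caption."""
    task: float
    imitation: float
    behavior_cloning: float
    contact: float


def total_reward(terms, demo_weight, w_task=1.0, w_imit=1.0, w_bc=1.0, w_contact=1.0):
    """Combine reward terms, annealing only the demonstration-dependent ones.

    demo_weight in [0, 1] scales the imitation and behavior cloning terms;
    the task and contact terms are left untouched. All weights are
    illustrative assumptions.
    """
    if not 0.0 <= demo_weight <= 1.0:
        raise ValueError("demo_weight must lie in [0, 1]")
    preserved = w_task * terms.task + w_contact * terms.contact
    annealed = w_imit * terms.imitation + w_bc * terms.behavior_cloning
    return preserved + demo_weight * annealed


if __name__ == "__main__":
    step = RewardTerms(task=1.0, imitation=2.0, behavior_cloning=3.0, contact=4.0)
    for w in (1.0, 0.5, 0.0):
        print(w, total_reward(step, w))
```

With the demonstration weight at zero, the policy is optimized on task and contact rewards alone, which corresponds to the "autonomous optimization" end of the schedule.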

Original abstract

Dexterous manipulation learning has long been hindered by the high costs of data and training, as pure reinforcement learning typically requires large-scale interactive exploration and imitation learning depends on high-quality demonstrations that are expensive to collect. To address this problem, we propose EaDex, a multi-embodiment dexterous manipulation learning framework under low-cost demonstration conditions, which enables rapid generation of demonstration data and consequently reduces training time for efficient dexterous manipulation. At the data level, EaDex captures human hand motions using only a single RGB-D camera and constructs structured demonstration data through MANO-based hand modeling, data normalization, and motion retargeting. At the learning level, we introduce a contact-reward-based dynamic demonstration annealing mechanism, which guides early-stage exploration under demonstration and gradually transitions to autonomous optimization with accumulating contact rewards. Using our custom dataset, we evaluate EaDex on three dexterous hands and three articulated object-opening tasks, covering nine cross-embodiment manipulation settings, achieving a 55.3% relative improvement over the baseline without demonstration annealing. These results validate the effectiveness of the proposed low-cost demonstration pipeline and the dynamic demonstration annealing strategy for dexterous manipulation learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

EaDex gives a workable low-cost capture pipeline with MANO retargeting and contact-based annealing, but the 55.3% claim lacks baselines, reward equations, and checks for embodiment-specific tuning.

EaDex combines single RGB-D capture, MANO hand modeling, normalization, retargeting, and a contact-reward annealing schedule to move from imitation to RL across three different robot hands and three object tasks. The central result is the 55.3% relative gain over a no-annealing baseline in nine settings.

The pipeline itself is the clearest contribution. Capturing usable demonstrations with one camera and retargeting them is a direct response to the data-cost problem in dexterous work, and the evaluation scope across multiple embodiments is reasonable for the claim.

The soft spots sit in the empirical support. The abstract gives the improvement number but no description of the baseline algorithm, task success criteria, number of trials, or error bars. More importantly, the contact-reward annealing is presented as the driver of the gain, yet the abstract supplies no equation, normalization, or threshold details. If contact detection or scaling differs by hand mesh or force model, the reported improvement could depend on per-embodiment choices even while the paper claims the schedule needs no tuning. That matches the stress-test concern and makes the central result hard to assess from what is shown.

The work is aimed at researchers building dexterous manipulation systems who need cheaper demonstration pipelines. A reader already working on imitation-to-RL transitions would find the concrete steps useful even if the numbers require verification.

The paper deserves a serious referee. The problem is real, the approach is concrete, and the nine-setting evaluation is a start, but the methods and reward definition need to be expanded before the gain can be trusted.

Referee Report

3 major / 1 minor

Summary. The paper proposes EaDex, a cross-embodiment dexterous manipulation framework that generates low-cost demonstrations from a single RGB-D camera via MANO-based modeling, normalization, and retargeting, then uses a contact-reward-based dynamic demonstration annealing mechanism to guide early training before shifting to autonomous optimization. It evaluates the approach on three dexterous hands and three articulated object-opening tasks (nine settings total) and reports a 55.3% relative improvement over a baseline without annealing.

Significance. If the annealing mechanism proves general and the empirical gains hold under standard reporting, the low-cost data pipeline and embodiment-agnostic training strategy could meaningfully reduce data and compute barriers in dexterous manipulation. The multi-embodiment scope across nine settings is a strength worth building upon.

major comments (3)

[Abstract] Abstract: the central claim of a 55.3% relative improvement supplies no baseline algorithm name or reference, no task-success definition, no trial count, no error bars, and no statistical test, rendering the headline result unverifiable from the provided information.
[Learning level] Learning level (contact-reward annealing): no equation, pseudocode, normalization details, or threshold values are given for the contact reward or the annealing schedule. This is load-bearing for the claim that the mechanism requires no embodiment-specific tuning across the three hands, because contact detection or scaling could embed hand-specific biases.
[Evaluation] Evaluation section: no ablation isolating the annealing contribution from the data pipeline is described, nor are full baseline implementations or hyperparameter settings reported, preventing assessment of whether the reported gain is attributable to the proposed mechanism.
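
On the request for trial counts and error bars: one standard way to attach uncertainty to a success rate is a Wilson score interval on the binomial proportion. The sketch below shows that calculation. The trial counts in the usage example are placeholders, and nothing here reflects how the authors actually report or plan to report their results.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial success rate.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p + z2 / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials * trials)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Placeholder counts, not data from the paper.
    low, high = wilson_interval(successes=30, trials=50)
    print(f"success rate 0.60, 95% interval [{low:.3f}, {high:.3f}]")
```

Reporting an interval like this per setting would show whether the annealed and non-annealed success rates are distinguishable at the trial counts actually used.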

minor comments (1)

[Abstract] The abstract would be clearer if it briefly named the three hands and three tasks rather than referring only to 'our custom dataset.'

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on improving the verifiability and technical detail of the manuscript. We address each major comment below and indicate the planned revisions.

Referee: [Abstract] Abstract: the central claim of a 55.3% relative improvement supplies no baseline algorithm name or reference, no task-success definition, no trial count, no error bars, and no statistical test, rendering the headline result unverifiable from the provided information.

Authors: We agree that the abstract would benefit from additional context to make the headline result verifiable on its own. In the revised manuscript we will expand the abstract to name the baseline (without demonstration annealing), define task success, report trial counts, include error bars, and reference statistical tests. These details already appear in the evaluation section and will be highlighted in the abstract.

revision: yes

Referee: [Learning level] Learning level (contact-reward annealing): no equation, pseudocode, normalization details, or threshold values are given for the contact reward or the annealing schedule. This is load-bearing for the claim that the mechanism requires no embodiment-specific tuning across the three hands, because contact detection or scaling could embed hand-specific biases.

Authors: We acknowledge that the current description lacks the mathematical and implementation details needed to substantiate the embodiment-agnostic claim. We will add the contact-reward equation, pseudocode for the dynamic annealing schedule, normalization procedures, and threshold values to the methods section. These additions will explicitly show that no hand-specific tuning is required.

revision: yes

Referee: [Evaluation] Evaluation section: no ablation isolating the annealing contribution from the data pipeline is described, nor are full baseline implementations or hyperparameter settings reported, preventing assessment of whether the reported gain is attributable to the proposed mechanism.

Authors: We agree that an explicit ablation isolating the annealing mechanism and fuller reporting of baselines and hyperparameters would strengthen the evaluation. We will add the requested ablation study, baseline implementation details, and hyperparameter tables in the revised manuscript to allow clear attribution of the performance gains.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical evaluation stands independent of inputs

The paper presents EaDex as a framework with a data pipeline (RGB-D capture, MANO modeling, retargeting) and a contact-reward annealing mechanism, evaluated empirically on a custom dataset across nine cross-embodiment settings. The reported 55.3% relative improvement is stated as a measured outcome versus a baseline without annealing. No equations, parameter fits renamed as predictions, self-definitional reductions, or load-bearing self-citations appear in the provided text. The central claim rests on experimental results rather than any derivation that collapses to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only review; ledger is necessarily incomplete. The central claim rests on unstated assumptions about retargeting fidelity and reward design that cannot be audited.

axioms (2)

domain assumption: MANO hand model plus normalization and retargeting preserves task-relevant motion information across embodiments.
Invoked for the data construction step; if false, the demonstrations would be invalid for the target robots.

ad hoc to paper: Contact reward can be defined unambiguously and used to anneal demonstrations without embodiment-specific tuning.
Core of the learning-level contribution; location implied in the annealing mechanism description.

