DexImit: Learning bimanual dexterous manipulation from monocular human videos. arXiv preprint arXiv:2602.10105, 2026.
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.RO (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
citing papers explorer
- Video2Sim2Real: Full-Stack Autonomous Dexterous Skill Acquisition from a Single Human Video
Video2Sim2Real turns a single human video into a deployable robot manipulation skill by reconstructing a digital twin, anchoring motions to object-centric simulator configurations, and bridging sim-to-real gaps with imitation learning and residual RL.
- LUCID: Learning Embodiment-Agnostic Intent Models from Unstructured Human Videos for Scalable Dexterous Robot Skill Acquisition
LUCID learns embodiment-agnostic intent models from unstructured human videos to train dexterous robot policies in simulation, enabling zero-shot transfer on real-world tasks like stirring and wiping.
- SynManDex: Synthesizing Human-like Dexterous Grasps from Synthetic Human Pre-Grasps
SynManDex generates human-like dexterous grasps for robots from synthetic human pre-grasps via retargeting and force-closure optimization, reporting 86.4% stability, 4.67/5 human-likeness, 80.7% sim success, and 83.3% real-robot success.