EgoAERO: Learning Dexterous Manipulation from a Single Egocentric Video without Object Assets

Haoran Lv; Hengyi Zhang; Hui Xu; Jianxing Liu; Shiyu Gao; Xinrui Zhang; Xueyao Wan; Yan Ding; Yang Xie; Yichen Niu

arxiv: 2606.08057 · v1 · pith:MFJ274YPnew · submitted 2026-06-06 · 💻 cs.RO · cs.AI

Yichen Niu, Haoran Lv, Xinrui Zhang, Xueyao Wan, Shiyu Gao, Ying Ai, Hui Xu, Yongqi Hu, Hengyi Zhang, Yang Xie, Zhaxizhuoma, Yue Zhao, Zhenshan Bing, Yan Ding, Jianxing Liu

classification 💻 cs.RO cs.AI

keywords: dexterous, object, egoaero, egocentric, learning, manipulation, assets, rgb-d

Egocentric RGB-D videos offer a natural source of human dexterous manipulation demonstrations, but existing data is difficult to use for robot learning because object pose, geometry, and contact information are often missing or require pre-scanned object assets. We present EgoAERO, the first framework that learns dexterous manipulation from a single egocentric RGB-D human demonstration without object assets. EgoAERO reconstructs contact-consistent hand-object trajectories through asset-free object tracking and reconstruction, ego motion compensation, and adaptive contact optimization, then converts them into robot policies using two-stage residual learning. We further introduce an online quality assessment mechanism and construct EgoDex-R, a large-scale egocentric dataset with 4.3M RGB-D frames for dexterous policy learning. Simulation and real-world experiments show that EgoAERO enables single-demonstration dexterous manipulation and achieves downstream performance close to CAD-based reconstructions on HOI4D.
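
The abstract names ego-motion compensation as one pipeline step but does not say how it is done. As a generic illustration only (not the authors' method), the sketch below assumes a per-frame camera-to-world pose is already available as a 4x4 homogeneous matrix, for example from visual odometry, and uses it to express camera-frame points in a fixed world frame. The function names `to_world` and `compensate_ego_motion` are hypothetical.

```python
import numpy as np

def to_world(points_cam, T_world_cam):
    """Map (N, 3) camera-frame points into the world frame.

    T_world_cam is an assumed 4x4 camera-to-world pose: p_w = R @ p_c + t.
    """
    R = T_world_cam[:3, :3]
    t = T_world_cam[:3, 3]
    # Row-vector form of R @ p + t applied to every point at once.
    return points_cam @ R.T + t

def compensate_ego_motion(points_per_frame, poses_world_cam):
    """Apply each frame's camera pose so all frames share one world frame."""
    return [to_world(p, T) for p, T in zip(points_per_frame, poses_world_cam)]
```

With this convention, a point that is static in the world maps to the same world coordinates in every frame regardless of how the head-mounted camera moves, so any remaining motion can be attributed to the hand or the object.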
