Egozero: Robot learning from smart glasses

· 2025 · arXiv 2505.20290

11 Pith papers cite this work. Polarity classification is still indexing.

11 Pith papers citing it

read on arXiv browse 11 citing papers

citation-role summary

background 4

citation-polarity summary

background 3 unclear 1

representative citing papers

EgoEngine: From Egocentric Human Videos to High-Fidelity Dexterous Robot Demonstrations

cs.RO · 2026-06-10 · unverdicted · novelty 7.0

EgoEngine transforms egocentric human videos into high-fidelity robot data enabling zero-shot visuomotor dexterous policy learning without real-robot demonstrations.

DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos

cs.RO · 2026-02-06 · unverdicted · novelty 7.0

DreamDojo is a foundation world model pretrained on the largest human video dataset to date that uses continuous latent actions to transfer interaction knowledge and achieves controllable physics simulation after robot post-training.

ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining

cs.RO · 2026-06-15 · unverdicted · novelty 6.0

ACE-Ego-0 is a VLA pretraining framework that turns egocentric human videos into robot-format pseudo-actions via a video-to-action pipeline and trains jointly with robot data under a reliability-aware objective.

Hand-centric Human-to-Robot Trajectory Transfer from Video Demonstrations via Open-World Contact Localization

cs.RO · 2026-06-09 · unverdicted · novelty 6.0

HOWTransfer recovers 3D hand motion from video, localizes contact intervals via hand-object cues, generates multi-modal grasp hypotheses, and edits trajectories to produce diverse robot-executable motions achieving 86% success.

WARPED: Wrist-Aligned Rendering for Robot Policy Learning from Egocentric Human Demonstrations

cs.RO · 2026-04-12 · unverdicted · novelty 6.0

WARPED synthesizes realistic wrist-view observations from monocular egocentric human videos via foundation models, hand-object tracking, retargeting, and Gaussian Splatting to train visuomotor policies that match teleoperation success rates on five tabletop tasks with 5-8x less collection effort.

ActiveGlasses: Learning Manipulation with Active Vision from Ego-centric Human Demonstration

cs.RO · 2026-04-09 · unverdicted · novelty 6.0

ActiveGlasses learns robot manipulation from ego-centric human demos captured with active vision via smart glasses, achieving zero-shot transfer using object-centric point-cloud policies.

EgoVerse: An Egocentric Human Dataset for Robot Learning from Around the World

cs.RO · 2026-04-08 · unverdicted · novelty 6.0

EgoVerse releases 1,362 hours of standardized egocentric human data across 1,965 tasks and shows via multi-lab experiments that robot policy performance scales with human data volume when the data aligns with robot objectives.

X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations

cs.RO · 2025-11-06 · unverdicted · novelty 6.0

X-Diffusion adapts Ambient Diffusion to selectively train on noised human actions for cross-embodiment robot policies, yielding 16% higher average success rates than naive co-training or manual filtering across five real-world manipulation tasks.

WARP: Whole-Body Retargeting for Learning from Offline Human Demonstrations

cs.RO · 2026-06-29 · unverdicted · novelty 5.0

WARP is an offline retargeting method using a SEW geometric solver to produce consistent whole-body robot trajectories from human demonstrations for zero-shot mobile manipulation.

HumanEgo: Zero-Shot Robot Learning from Minutes of Human Egocentric Videos

cs.RO · 2026-05-24 · unverdicted · novelty 5.0

HumanEgo reports 92.5% average success on four real robot tasks using only 15-30 minutes of human video per task and zero robot data, with zero-shot transfer to new robots and cameras.

Human2Any: Human-to-Robot Transfer via Constraint-Aware Compositional Planning

cs.RO · 2026-06-27 · unverdicted · novelty 4.0

Human2Any transfers human video demonstrations to robots by representing tasks as object-object interactions and composing learned priors with robot-side planning.

citing papers explorer

Showing 11 of 11 citing papers.

EgoEngine: From Egocentric Human Videos to High-Fidelity Dexterous Robot Demonstrations cs.RO · 2026-06-10 · unverdicted · none · ref 16
EgoEngine transforms egocentric human videos into high-fidelity robot data enabling zero-shot visuomotor dexterous policy learning without real-robot demonstrations.
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos cs.RO · 2026-02-06 · unverdicted · none · ref 65
DreamDojo is a foundation world model pretrained on the largest human video dataset to date that uses continuous latent actions to transfer interaction knowledge and achieves controllable physics simulation after robot post-training.
ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining cs.RO · 2026-06-15 · unverdicted · none · ref 44
ACE-Ego-0 is a VLA pretraining framework that turns egocentric human videos into robot-format pseudo-actions via a video-to-action pipeline and trains jointly with robot data under a reliability-aware objective.
Hand-centric Human-to-Robot Trajectory Transfer from Video Demonstrations via Open-World Contact Localization cs.RO · 2026-06-09 · unverdicted · none · ref 16
HOWTransfer recovers 3D hand motion from video, localizes contact intervals via hand-object cues, generates multi-modal grasp hypotheses, and edits trajectories to produce diverse robot-executable motions achieving 86% success.
WARPED: Wrist-Aligned Rendering for Robot Policy Learning from Egocentric Human Demonstrations cs.RO · 2026-04-12 · unverdicted · none · ref 68
WARPED synthesizes realistic wrist-view observations from monocular egocentric human videos via foundation models, hand-object tracking, retargeting, and Gaussian Splatting to train visuomotor policies that match teleoperation success rates on five tabletop tasks with 5-8x less collection effort.
ActiveGlasses: Learning Manipulation with Active Vision from Ego-centric Human Demonstration cs.RO · 2026-04-09 · unverdicted · none · ref 17
ActiveGlasses learns robot manipulation from ego-centric human demos captured with active vision via smart glasses, achieving zero-shot transfer using object-centric point-cloud policies.
EgoVerse: An Egocentric Human Dataset for Robot Learning from Around the World cs.RO · 2026-04-08 · unverdicted · none · ref 35
EgoVerse releases 1,362 hours of standardized egocentric human data across 1,965 tasks and shows via multi-lab experiments that robot policy performance scales with human data volume when the data aligns with robot objectives.
X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations cs.RO · 2025-11-06 · unverdicted · none · ref 8
X-Diffusion adapts Ambient Diffusion to selectively train on noised human actions for cross-embodiment robot policies, yielding 16% higher average success rates than naive co-training or manual filtering across five real-world manipulation tasks.
WARP: Whole-Body Retargeting for Learning from Offline Human Demonstrations cs.RO · 2026-06-29 · unverdicted · none · ref 6
WARP is an offline retargeting method using a SEW geometric solver to produce consistent whole-body robot trajectories from human demonstrations for zero-shot mobile manipulation.
HumanEgo: Zero-Shot Robot Learning from Minutes of Human Egocentric Videos cs.RO · 2026-05-24 · unverdicted · none · ref 21
HumanEgo reports 92.5% average success on four real robot tasks using only 15-30 minutes of human video per task and zero robot data, with zero-shot transfer to new robots and cameras.
Human2Any: Human-to-Robot Transfer via Constraint-Aware Compositional Planning cs.RO · 2026-06-27 · unverdicted · none · ref 23
Human2Any transfers human video demonstrations to robots by representing tasks as object-object interactions and composing learned priors with robot-side planning.

Egozero: Robot learning from smart glasses

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer