Canonical reference

CoMo: Learning continuous latent motion from internet videos for scalable robot learning

Canonical reference. 83% of citing Pith papers cite this work as background.

9 Pith papers citing it
Background 83% of classified citations

citation-role summary

background: 5 · baseline: 1

citation-polarity summary

fields

cs.RO: 7 · cs.CV: 2

years

2026: 7 · 2025: 2

verdicts

unverdicted: 9

representative citing papers

DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos

cs.RO · 2026-02-06 · unverdicted · novelty 7.0

DreamDojo is a foundation world model pretrained on the largest human video dataset to date that uses continuous latent actions to transfer interaction knowledge and achieves controllable physics simulation after robot post-training.

GazeVLA: Learning Human Intention for Robotic Manipulation

cs.RO · 2026-04-24 · unverdicted · novelty 6.0

GazeVLA pretrains on large human egocentric datasets to capture gaze-based intention, then finetunes on limited robot data with chain-of-thought reasoning to achieve better robotic manipulation performance than baselines.

Motus: A Unified Latent Action World Model

cs.CV · 2025-12-15 · unverdicted · novelty 5.0

Motus unifies understanding, video generation, and action in one latent world model via MoT experts and optical-flow latent actions, reporting gains over prior methods in simulation and real robots.
