IEEE Robotics and Automation Letters 5(2), 3019–3026 (2020)
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: dataset 1
citation-polarity summary
fields: cs.CV 2
years: 2026 2
roles: dataset 1
polarities: use dataset 1
representative citing papers
- Action Images: End-to-End Policy Learning via Multiview Video Generation
Action Images turn robot arm motions into interpretable multiview pixel videos, letting video backbones serve as zero-shot policies for end-to-end robot learning.
- GEM-4D: Geometry-Enhanced Video World Models for Robot Manipulation