The unsurprising effectiveness of pre-trained vision models for control. Preprint arXiv:2203.03580

· 2022 · arXiv 2203.03580

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training

cs.RO · 2022-09-30 · unverdicted · novelty 7.0

VIP learns a visual embedding from human videos whose distance defines dense, smooth rewards for arbitrary goal-image robot tasks without task-specific fine-tuning.

A Generalist Agent

cs.AI · 2022-05-12 · accept · novelty 7.0

Gato is a multi-modal, multi-task, multi-embodiment generalist policy using one transformer network to handle text, vision, games, and robotics tasks.

Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation

cs.RO · 2024-09-24 · unverdicted · novelty 6.0

Gen2Act enables generalizable robot manipulation for unseen objects and novel motions by using zero-shot human video generation from web data to condition a policy trained on an order of magnitude less robot interaction data.

citing papers explorer

Showing 3 of 3 citing papers.

VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
cs.RO · 2022-09-30 · unverdicted · none · ref 21
A Generalist Agent
cs.AI · 2022-05-12 · accept · none · ref 45
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
cs.RO · 2024-09-24 · unverdicted · none · ref 34
