RT-2: Vision-language-action models transfer web knowledge to robot control, 2023

Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Jake Kuang, Sergey Levine, Yao Lu, Linda Luu, Karina Nguyen, Xi Vincent, Pierre · 2023

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

Look, Zoom, Understand: The Robotic Eyeball for Embodied Perception

cs.RO · 2025-11-19 · conditional · novelty 6.0

EyeVLA transfers open-world VLM understanding to a PTZ camera control policy via hierarchical action tokens and GRPO reinforcement learning, reaching 96% task completion on 50 real scenes with only 500 training samples.

citing papers explorer

Showing 1 of 1 citing paper.

Look, Zoom, Understand: The Robotic Eyeball for Embodied Perception

cs.RO · 2025-11-19 · conditional · none · ref 4