LLaRA: Supercharging robot learning data for vision-language policy

Xiang Li, Cristina Mata, Jongwoo Park, Kumara Kahatapitiya, Yoo Sung Jang, Jinghuan Shang, Kanchana Ranasinghe, Ryan Burgert, Mu Cai, Yong Jae Lee, et al · 2024 · arXiv 2406.20095

6 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

UniLACT: Depth-Aware RGB Latent Action Learning for Vision-Language-Action Models

cs.RO · 2026-02-23 · unverdicted · novelty 7.0

UniLACT improves VLA models by adding depth-aware unified latent action pretraining that outperforms RGB-only baselines on seen and unseen manipulation tasks.

PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation

cs.RO · 2026-01-11 · unverdicted · novelty 6.0

PALM improves long-horizon robotic manipulation success by distilling affordance representations for object interaction and predicting within-subtask progress in a VLA model.

GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data

cs.RO · 2025-05-06 · unverdicted · novelty 6.0

GraspVLA shows that pretraining a grasping model on a billion synthetic action frames enables zero-shot open-vocabulary performance and sim-to-real transfer.

$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization

cs.LG · 2025-04-22 · unverdicted · novelty 6.0

$\pi_{0.5}$ is a VLA model that achieves long-horizon dexterous manipulation in entirely new homes through co-training on heterogeneous tasks and multi-source data including web and semantic predictions.

TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies

cs.RO · 2024-12-13 · conditional · novelty 6.0

Visual trace prompting improves spatial-temporal awareness in VLA models, delivering 10% gains on SimplerEnv and 3.5x on real-robot tasks.

ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning

cs.CV · 2025-07-22 · unverdicted · novelty 5.0

ThinkAct introduces reinforced visual latent planning in a dual VLA system to enable better long-horizon reasoning and adaptation for embodied tasks.

citing papers explorer


UniLACT: Depth-Aware RGB Latent Action Learning for Vision-Language-Action Models
cs.RO · 2026-02-23 · unverdicted · none · ref 23

PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation
cs.RO · 2026-01-11 · unverdicted · none · ref 70

GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data
cs.RO · 2025-05-06 · unverdicted · none · ref 20

$\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization
cs.LG · 2025-04-22 · unverdicted · none · ref 46

TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies
cs.RO · 2024-12-13 · conditional · none · ref 84

ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning
cs.CV · 2025-07-22 · unverdicted · none · ref 19
