RynnVLA-001: Using human demonstrations to improve robot manipulation

· 2025 · arXiv 2509.15212

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

read on arXiv browse 8 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

LIBERO-Safety: A Comprehensive Benchmark for Physical and Semantic Safety in Vision-Language-Action Models

cs.RO · 2026-06-22 · unverdicted · novelty 7.0

LIBERO-Safety supplies a scalable benchmark, data-generation pipeline, and 19,664-demonstration dataset that exposes a generalization-safety tension in current VLA models where diverse training improves collision avoidance but task success stays limited by trajectory quality and semantic understandi

One Token Per Frame: Reconsidering Visual Bandwidth in World Models for VLA Policy

cs.CV · 2026-05-08 · conditional · novelty 7.0 · 3 refs

Reducing visual input to one token per frame in VLA world models maintains or improves long-horizon performance on MetaWorld, LIBERO, and real-robot tasks.

APT: Action Expert Pretraining Improves Instruction Generalization of Vision-Language-Action Policies

cs.RO · 2026-06-10 · unverdicted · novelty 6.0

APT pretrains the action expert as a vision-action prior on frozen VLM features then adds language through gated fusion to improve OOD instruction generalization in continuous-action VLA policies.

ELAN4D: Embodiment-Centric 4D Supervision for Vision-Language-Action Models via Plug-and-Play Adaptation

cs.RO · 2026-05-28 · unverdicted · novelty 6.0

ELAN4D introduces plug-and-play 4D keypoint track supervision from forward kinematics to enhance VLA policy generalization in robotic manipulation tasks.

Adaptive Action Chunking at Inference-time for Vision-Language-Action Models

cs.RO · 2026-04-05 · unverdicted · novelty 6.0

Adaptive Action Chunking uses action entropy to dynamically adjust chunk sizes in VLA models, improving performance on simulated and real robotic manipulation tasks.

HiF-VLA: Hindsight, Insight and Foresight through Motion Representation for Vision-Language-Action Models

cs.RO · 2025-12-10 · unverdicted · novelty 6.0

HiF-VLA improves long-horizon robotic manipulation by encoding past motion as hindsight priors and anticipating future motion through foresight reasoning inside a VLA framework.

GEM: Generative Supervision Helps Embodied Intelligence

cs.CV · 2026-05-27 · unverdicted · novelty 5.0

GEM adds generative depth supervision to VLM pre-training and reports improved results on embodied benchmarks plus real-world robot execution.

Controllable Egocentric Video Generation via Occlusion-Aware Sparse 3D Hand Joints

cs.CV · 2026-03-12

citing papers explorer

Showing 6 of 6 citing papers after filters.

LIBERO-Safety: A Comprehensive Benchmark for Physical and Semantic Safety in Vision-Language-Action Models cs.RO · 2026-06-22 · unverdicted · none · ref 21
LIBERO-Safety supplies a scalable benchmark, data-generation pipeline, and 19,664-demonstration dataset that exposes a generalization-safety tension in current VLA models where diverse training improves collision avoidance but task success stays limited by trajectory quality and semantic understandi
APT: Action Expert Pretraining Improves Instruction Generalization of Vision-Language-Action Policies cs.RO · 2026-06-10 · unverdicted · none · ref 40
APT pretrains the action expert as a vision-action prior on frozen VLM features then adds language through gated fusion to improve OOD instruction generalization in continuous-action VLA policies.
ELAN4D: Embodiment-Centric 4D Supervision for Vision-Language-Action Models via Plug-and-Play Adaptation cs.RO · 2026-05-28 · unverdicted · none · ref 31
ELAN4D introduces plug-and-play 4D keypoint track supervision from forward kinematics to enhance VLA policy generalization in robotic manipulation tasks.
Adaptive Action Chunking at Inference-time for Vision-Language-Action Models cs.RO · 2026-04-05 · unverdicted · none · ref 14
Adaptive Action Chunking uses action entropy to dynamically adjust chunk sizes in VLA models, improving performance on simulated and real robotic manipulation tasks.
HiF-VLA: Hindsight, Insight and Foresight through Motion Representation for Vision-Language-Action Models cs.RO · 2025-12-10 · unverdicted · none · ref 16
HiF-VLA improves long-horizon robotic manipulation by encoding past motion as hindsight priors and anticipating future motion through foresight reasoning inside a VLA framework.
GEM: Generative Supervision Helps Embodied Intelligence cs.CV · 2026-05-27 · unverdicted · none · ref 29
GEM adds generative depth supervision to VLM pre-training and reports improved results on embodied benchmarks plus real-world robot execution.

RynnVLA-001: Using human demonstrations to improve robot manipulation

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer