
4d-vla: Spatiotemporal vision-language-action pretraining with cross-scene calibration. ArXiv, abs/2506.22242

9 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 4

citation-polarity summary

fields

cs.RO 7 · cs.CV 2

years

2026 6 · 2025 3

roles

background 4

polarities

background 4

representative citing papers

ST-π: Structured SpatioTemporal VLA for Robotic Manipulation

cs.RO · 2026-04-20 · unverdicted · novelty 6.0

ST-π structures VLA models by having a spatiotemporal VLM produce causally ordered chunk-level prompts that guide a dual-generator action expert to jointly handle spatial and temporal control in robotic manipulation.

VLANeXt: Recipes for Building Strong VLA Models

cs.CV · 2026-02-20 · conditional · novelty 6.0

VLANeXt distills 12 design insights from a unified VLA study into a model that outperforms prior methods on LIBERO benchmarks; the code is released for further exploration.
