
Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos

3 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 3

citation-polarity summary

fields: cs.RO 3

years: 2026 3

verdicts: UNVERDICTED 3

roles: background 3

polarities: background 3

representative citing papers

Unmasking the Illusion of Embodied Reasoning in Vision-Language-Action Models

cs.RO · 2026-04-20 · unverdicted · novelty 6.0

State-of-the-art vision-language-action models fail catastrophically at dynamic embodied reasoning because of lexical-kinematic shortcuts, behavioral inertia, and semantic feature collapse caused by architectural bottlenecks, as shown by the new BeTTER benchmark with real-world validation.
