arXiv preprint arXiv:2506.05883 , year=

HMVLM: Multistage reasoning-enhanced vision-language model for long-tailed driving scenarios , author= · 2025 · arXiv 2506.05883

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

representative citing papers

Action Emergence from Streaming Intent

cs.RO · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

A new VLA model called SI uses a four-step chain-of-thought to derive driving intent and applies it via classifier-free guidance to a flow-matching trajectory generator, showing competitive Waymo scores and intent-controllable plans.

MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving

cs.RO · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

MindVLA-U1 is the first unified streaming VLA architecture that surpasses human drivers on WOD-E2E planning metrics while matching VA latency and preserving language interfaces.

citing papers explorer

Showing 2 of 2 citing papers.

Action Emergence from Streaming Intent cs.RO · 2026-05-12 · unverdicted · none · ref 64 · 2 links
A new VLA model called SI uses a four-step chain-of-thought to derive driving intent and applies it via classifier-free guidance to a flow-matching trajectory generator, showing competitive Waymo scores and intent-controllable plans.
MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving cs.RO · 2026-05-12 · unverdicted · none · ref 62 · 2 links
MindVLA-U1 is the first unified streaming VLA architecture that surpasses human drivers on WOD-E2E planning metrics while matching VA latency and preserving language interfaces.

arXiv preprint arXiv:2506.05883 , year=

fields

years

verdicts

representative citing papers

citing papers explorer