pith. sign in

EvoDriveVLA: Evolving Driving VLA Models via Collaborative Perception-Planning Distillation

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it
abstract

Vision-Language-Action models have shown great promise for autonomous driving, yet they suffer from degraded perception after unfreezing the visual encoder and struggle with accumulated instability in long-term planning. To address these challenges, we propose EvoDriveVLA-a novel collaborative perception-planning distillation framework that integrates self-anchored perceptual constraints and future-informed trajectory optimization. Specifically, self-anchored visual distillation leverages self-anchor teacher to deliver visual anchoring constraints, regularizing student representations via trajectory-guided key-region awareness. In parallel, future-informed trajectory distillation employs a future-aware oracle teacher with coarse-to-fine trajectory refinement and Monte Carlo dropout sampling to synthesize reasoning trajectories that model future evolutions, enabling the student model to internalize the future-aware insights of the teacher. EvoDriveVLA achieves SOTA performance in nuScenes open-loop evaluation and significantly enhances performance in NAVSIM closed-loop evaluation. Our code is available at: https://github.com/hey-cjj/EvoDriveVLA.

fields

cs.CV 1 cs.RO 1

years

2026 2

verdicts

UNVERDICTED 2

clear filters

representative citing papers

citing papers explorer

Showing 1 of 1 citing paper after filters.