Drive like a human: Rethinking autonomous driving with large language models
6 Pith papers cite this work.
representative citing papers
GPT-3.5 is turned into an autonomous-vehicle motion planner by representing driving scenes and trajectories as language tokens and applying a prompting-reasoning-finetuning pipeline, with results shown on nuScenes.
citing papers explorer
- VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning
  VADv2 introduces a probabilistic planning model that discretizes the high-dimensional action space into tokens, interacts them with scene tokens to predict action distributions, and reports SOTA closed-loop results on CARLA Town05 and Bench2Drive.
- UniDrive: A Unified Vision-Language and Grounding Framework for Interpretable Risk Understanding in Autonomous Driving
  UniDrive fuses temporal scene dynamics from video with high-res spatial details via gated cross-attention to jointly generate risk captions and grounded object boxes, outperforming baselines on DRAMA-Reasoning with advantages in small-object detection and zero-shot transfer.
- On-Policy Distillation of Language Models for Autonomous Vehicle Motion Planning
  On-policy GKD trains 5x smaller student LLMs to nearly match large teacher performance in AV motion planning on nuScenes while beating a dense-feedback RL baseline.
- Learning Predictive Control with Deep Koopman Operators for Autonomous Vehicle Motion Planning
  A framework called LPC uses deep Koopman operators and actor-critic learning to create closed-loop policies for AV motion planning with safety constraints, shown in simulations and real-world tests.
- Structured Labeling Enables Faster Vision-Language Models for End-to-End Autonomous Driving
  Introduces structured NuScenes-S dataset and 0.9B FastDrive VLM claiming 20% higher decision accuracy and over 10x inference speedup versus larger unstructured VLMs.