VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning

2024 · cs.CV · arXiv 2402.13243

19 Pith papers cite this work. Polarity classification is still indexing.

abstract

Learning a human-like driving policy from large-scale driving demonstrations is promising, but the uncertainty and non-deterministic nature of planning make it challenging. Existing learning-based planning methods follow a deterministic paradigm to directly regress the action, failing to cope with the uncertainty problem. In this work, we propose a probabilistic planning model for end-to-end autonomous driving, termed VADv2. We resort to a probabilistic field function to model the mapping from the action space to the probabilistic distribution. Since the planning action space is a high-dimensional continuous spatiotemporal space and hard to tackle, we first discretize the planning action space to a large planning vocabulary and then tokenize the planning vocabulary into planning tokens. Planning tokens interact with scene tokens and output the probabilistic distribution of action. Mass driving demonstrations are leveraged to supervise the distribution. VADv2 achieves state-of-the-art closed-loop performance on the CARLA Town05 benchmark, significantly outperforming existing methods, and also leads the recent Bench2Drive benchmark. We further provide comprehensive evaluations on NAVSIM and a large-scale 3DGS-based benchmark, demonstrating its effectiveness in real-world applications. Code is available at https://github.com/hustvl/VAD.
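
The pipeline the abstract describes (discretize the action space into a planning vocabulary, tokenize it, let planning tokens interact with scene tokens, and output a probability distribution supervised by demonstrations) can be illustrated with a toy NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the VADv2 code: the vocabulary size and horizon, the random linear embedding `params["embed"]`, the single-head cross-attention, the scoring head `params["head"]`, and the choice of cross-entropy against the nearest vocabulary entry as the supervision signal.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def build_vocabulary(rng, size=64, horizon=6):
    # Hypothetical planning vocabulary: `size` candidate trajectories,
    # each a sequence of `horizon` (x, y) waypoints built from random steps.
    steps = rng.normal(loc=[1.0, 0.0], scale=[0.3, 0.3], size=(size, horizon, 2))
    return np.cumsum(steps, axis=1)

def plan_distribution(vocab, scene_tokens, params):
    # Tokenize each candidate trajectory with a linear embedding (assumed).
    n = vocab.shape[0]
    plan_tokens = vocab.reshape(n, -1) @ params["embed"]            # (N, D)
    d = plan_tokens.shape[1]
    # Planning tokens attend to scene tokens (single-head cross-attention).
    attn = softmax(plan_tokens @ scene_tokens.T / np.sqrt(d), axis=-1)  # (N, M)
    fused = plan_tokens + attn @ scene_tokens                       # (N, D)
    # One logit per vocabulary entry, normalised into a distribution.
    logits = fused @ params["head"]                                 # (N,)
    return softmax(logits)

def demonstration_loss(probs, vocab, demo):
    # Assumed supervision: cross-entropy against the vocabulary entry
    # closest to the demonstrated trajectory.
    dists = np.linalg.norm(vocab - demo, axis=(1, 2))
    target = int(np.argmin(dists))
    return -np.log(probs[target] + 1e-12), target

rng = np.random.default_rng(0)
vocab = build_vocabulary(rng)
dim = 16
params = {
    "embed": rng.normal(scale=0.1, size=(vocab.shape[1] * 2, dim)),
    "head": rng.normal(scale=0.1, size=dim),
}
scene_tokens = rng.normal(size=(10, dim))   # stand-in for encoded sensor features

probs = plan_distribution(vocab, scene_tokens, params)
demo = vocab[5] + 0.01                      # a demonstration near entry 5
loss, target = demonstration_loss(probs, vocab, demo)
print(probs.shape, target, float(loss))
```

At inference time, a sketch like this would pick an action by sampling from `probs` or taking its argmax. Treating planning as a distribution over a fixed vocabulary rather than a single regressed trajectory is what lets the model represent several plausible actions for the same scene.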

representative citing papers

SCORP: Scene-Consistent Multi-agent Diffusion Planning with Stable Online Reinforcement Post-Training for Cooperative Driving

cs.RO · 2026-04-13 · unverdicted · novelty 7.0 · 2 refs

SCORP delivers 10-28% gains in safety and 2-7% in efficiency metrics on WOMD by using dual-path scene conditioning in diffusion planning plus variance-gated group-relative policy optimization for closed-loop stability.

The DAWN of World-Action Interactive Models

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

DAWN couples a world predictor with a world-conditioned action denoiser in latent space so that each refines the other recursively, yielding strong planning and safety results on autonomous driving benchmarks.

DriveFuture: Future-Aware Latent World Models for Autonomous Driving

cs.CV · 2026-05-10 · unverdicted · novelty 6.0

DriveFuture achieves SOTA results on NAVSIM by conditioning latent world model states on future predictions to directly inform trajectory planning.

ProDrive: Proactive Planning for Autonomous Driving via Ego-Environment Co-Evolution

cs.RO · 2026-04-28 · unverdicted · novelty 6.0

ProDrive couples a query-centric planner with a BEV world model for end-to-end ego-environment co-evolution, enabling future-outcome assessment that improves safety and efficiency over reactive baselines on NAVSIM v1.

Towards Safe Mobility: A Unified Transportation Foundation Model enabled by Open-Ended Vision-Language Dataset

cs.CV · 2026-04-24 · unverdicted · novelty 6.0

The paper creates the LTD dataset for open-ended traffic VQA and trains the UniVLT model, achieving SOTA on unified microscopic AD and macroscopic traffic reasoning tasks.

OneDrive: Unified Multi-Paradigm Driving with Vision-Language-Action Models

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

OneDrive unifies heterogeneous decoding in a single VLM transformer decoder for end-to-end driving, achieving 0.28 L2 error and 0.18 collision rate on nuScenes plus 86.8 PDMS on NAVSIM.

FeaXDrive: Feasibility-aware Trajectory-Centric Diffusion Planning for End-to-End Autonomous Driving

cs.RO · 2026-04-14 · unverdicted · novelty 6.0

FeaXDrive improves end-to-end autonomous driving by shifting diffusion planning to a trajectory-centric formulation with curvature-constrained training, drivable-area guidance, and GRPO post-training, yielding stronger closed-loop performance and feasibility on NAVSIM.

Scaling-Aware Data Selection for End-to-End Autonomous Driving Systems

cs.LG · 2026-04-09 · unverdicted · novelty 6.0

MOSAIC is a scaling-aware data selection framework that outperforms baselines in training end-to-end autonomous driving planners, achieving comparable or better EPDMS scores with up to 80% less data.

Orion-Lite: Distilling LLM Reasoning into Efficient Vision-Only Driving Models

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

Orion-Lite uses latent feature distillation and trajectory supervision to create a vision-only model that surpasses its LLM-based teacher on closed-loop Bench2Drive evaluation, achieving a new SOTA driving score of 80.6.

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale

cs.CV · 2026-04-01 · unverdicted · novelty 6.0

DVGT-2 is a streaming vision-geometry-action model that jointly reconstructs dense 3D geometry and plans trajectories online, achieving better reconstruction than prior batch methods while transferring directly to planning benchmarks without fine-tuning.

Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation

cs.CV · 2024-06-11 · unverdicted · novelty 6.0

Hydra-MDP uses multi-teacher distillation and a multi-head decoder to learn diverse, metric-specific trajectories in an end-to-end autonomous-driving planner, winning the NAVSIM challenge.

Causality-Aware End-to-End Autonomous Driving via Ego-Centric Joint Scene Modeling

cs.RO · 2026-05-13 · unverdicted · novelty 5.0

CaAD adds ego-centric joint-causal modeling and causality-aware policy alignment to end-to-end driving, reporting Driving Score 87.53 and Success Rate 71.81 on Bench2Drive plus PDMS 91.1 on NAVSIM.

Driving Intents Amplify Planning-Oriented Reinforcement Learning

cs.RO · 2026-05-12 · unverdicted · novelty 5.0 · 2 refs

DIAL expands continuous-action driving policies via intent-conditioned flow matching and multi-intent GRPO, lifting best-of-N preference scores above human demonstrations for the first time on WOD-E2E.

REAP: Reinforcement-Learning End-to-End Autonomous Parking with Gaussian Splatting Simulator for Real2Sim2Real Transfer

cs.RO · 2026-05-09 · unverdicted · novelty 5.0

REAP trains an end-to-end SAC policy with behavior cloning and collision penalties inside a 3DGS Real2Sim simulator and transfers it to physical vehicles, succeeding in narrow mechanical parking slots.

RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

cs.CV · 2026-04-16 · unverdicted · novelty 5.0

RAD-2 uses a diffusion generator and RL discriminator to cut collision rates by 56% in closed-loop autonomous driving planning.

CrowdVLA: Embodied Vision-Language-Action Agents for Context-Aware Crowd Simulation

cs.GR · 2026-04-07 · unverdicted · novelty 5.0

CrowdVLA introduces vision-language-action agents for crowd simulation that reason about scene semantics, social norms, and action consequences using fine-tuned models and simulation rollouts.

DeepSight: Long-Horizon World Modeling via Latent States Prediction for End-to-End Autonomous Driving

cs.CV · 2026-05-11 · unverdicted · novelty 4.0

DeepSight uses parallel latent feature prediction in BEV for long-horizon world modeling and adaptive text reasoning to reach state-of-the-art closed-loop performance on the Bench2Drive benchmark.

Do Open-Loop Metrics Predict Closed-Loop Driving? A Cross-Benchmark Correlation Study of NAVSIM and Bench2Drive

cs.RO · 2026-04-30 · conditional · novelty 4.0

Cross-benchmark analysis of 8 methods shows NAVSIM PDM Score correlates with Bench2Drive Driving Score at Spearman ρ=0.90, with Ego Progress as the strongest single predictor and a simpler 3-metric formula matching the full score.

MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving

cs.RO · 2026-05-12

citing papers explorer

Showing 19 of 19 citing papers.

SCORP: Scene-Consistent Multi-agent Diffusion Planning with Stable Online Reinforcement Post-Training for Cooperative Driving · cs.RO · 2026-04-13 · unverdicted · none · ref 19 · 2 links · internal anchor
The DAWN of World-Action Interactive Models · cs.CV · 2026-05-12 · unverdicted · none · ref 6 · internal anchor
DriveFuture: Future-Aware Latent World Models for Autonomous Driving · cs.CV · 2026-05-10 · unverdicted · none · ref 16 · internal anchor
ProDrive: Proactive Planning for Autonomous Driving via Ego-Environment Co-Evolution · cs.RO · 2026-04-28 · unverdicted · none · ref 2 · internal anchor
Towards Safe Mobility: A Unified Transportation Foundation Model enabled by Open-Ended Vision-Language Dataset · cs.CV · 2026-04-24 · unverdicted · none · ref 8 · internal anchor
OneDrive: Unified Multi-Paradigm Driving with Vision-Language-Action Models · cs.CV · 2026-04-20 · unverdicted · none · ref 6 · internal anchor
FeaXDrive: Feasibility-aware Trajectory-Centric Diffusion Planning for End-to-End Autonomous Driving · cs.RO · 2026-04-14 · unverdicted · none · ref 1 · internal anchor
Scaling-Aware Data Selection for End-to-End Autonomous Driving Systems · cs.LG · 2026-04-09 · unverdicted · none · ref 7 · internal anchor
Orion-Lite: Distilling LLM Reasoning into Efficient Vision-Only Driving Models · cs.CV · 2026-04-09 · unverdicted · none · ref 4 · internal anchor
DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale · cs.CV · 2026-04-01 · unverdicted · none · ref 3 · internal anchor
Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation · cs.CV · 2024-06-11 · unverdicted · none · ref 4 · internal anchor
Causality-Aware End-to-End Autonomous Driving via Ego-Centric Joint Scene Modeling · cs.RO · 2026-05-13 · unverdicted · none · ref 5 · internal anchor
Driving Intents Amplify Planning-Oriented Reinforcement Learning · cs.RO · 2026-05-12 · unverdicted · none · ref 6 · 2 links · internal anchor
REAP: Reinforcement-Learning End-to-End Autonomous Parking with Gaussian Splatting Simulator for Real2Sim2Real Transfer · cs.RO · 2026-05-09 · unverdicted · none · ref 3 · internal anchor
RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework · cs.CV · 2026-04-16 · unverdicted · none · ref 2 · internal anchor
CrowdVLA: Embodied Vision-Language-Action Agents for Context-Aware Crowd Simulation · cs.GR · 2026-04-07 · unverdicted · none · ref 4 · internal anchor
DeepSight: Long-Horizon World Modeling via Latent States Prediction for End-to-End Autonomous Driving · cs.CV · 2026-05-11 · unverdicted · none · ref 142 · internal anchor
Do Open-Loop Metrics Predict Closed-Loop Driving? A Cross-Benchmark Correlation Study of NAVSIM and Bench2Drive · cs.RO · 2026-04-30 · conditional · none · ref 19 · internal anchor
MindVLA-U1: VLA Beats VA with Unified Streaming Architecture for Autonomous Driving · cs.RO · 2026-05-12 · unreviewed · ref 3 · internal anchor
