PaIR-Drive runs IL and RL in parallel branches with a tree-structured sampler to reach 91.2 PDMS and 87.9 EPDMS on NAVSIM benchmarks while outperforming sequential RL fine-tuning and correcting some human errors.
hub
turn left
18 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
representative citing papers
LWDrive uses future-frame supervision on VLMs to create world-model features that a multi-layer Foresight Cascade Planner refines into final trajectories, reporting 92.0 on NAVSIM and 89.6 on NAVSIM-v2.
GEM is a new LiDAR world model using deformable Mamba that disentangles dynamic and static features to generate high-fidelity simulations and achieve state-of-the-art results on autonomous driving benchmarks.
Orion-Lite uses latent feature distillation and trajectory supervision to create a vision-only model that surpasses its LLM-based teacher on closed-loop Bench2Drive evaluation, achieving a new SOTA driving score of 80.6.
DVGT-2 is a streaming vision-geometry-action model that jointly reconstructs dense 3D geometry and plans trajectories online, achieving better reconstruction than prior batch methods while transferring directly to planning benchmarks without fine-tuning.
DriveLaW unifies video world modeling and trajectory planning by injecting video-generator latents into a diffusion planner, achieving SOTA video prediction and a new record on the NAVSIM planning benchmark.
SpaceDrive integrates 3D positional encodings derived from depth and ego-states into VLMs, replacing digit tokens to improve spatial reasoning and trajectory regression in autonomous driving.
SimScale synthesizes unseen driving states from real logs via neural rendering and reactive environments, generates pseudo-expert trajectories, and shows that co-training on real plus simulated data improves planning robustness and generalization on real benchmarks, with gains scaling by simulation
DriveMoE applies scene-specialized Vision MoE and skill-specialized Action MoE to a VLA baseline to achieve SOTA closed-loop performance on Bench2Drive.
PriorEye augments end-to-end driving models with a dual-memory architecture that stores and gates geospatial visual priors to improve performance and robustness to sensor corruption on NAVSIM-v2.
SparseWorld is a sparse world model with a Sparse Dreamer module that performs autoregressive rollout of future instances to refine motion prediction and planning, reporting 0.05% collision rate on nuScenes open-loop metrics.
HEAT uses a trajectory-driven learning paradigm and a world model predicting future latent features from ego actions to enable a single unified end-to-end autonomous driving model to perform well across heterogeneous domains on nuScenes, NAVSIM, and Waymo benchmarks.
CaAD adds ego-centric joint-causal modeling and causality-aware policy alignment to end-to-end driving, reporting Driving Score 87.53 and PDMS 91.1 on Bench2Drive and NAVSIM.
RAD-2 uses a diffusion generator and RL discriminator to cut collision rates by 56% in closed-loop autonomous driving planning.
CrowdVLA introduces vision-language-action agents for crowd simulation that reason about scene semantics, social norms, and action consequences using fine-tuned models and simulation rollouts.
DynFlowDrive models action-conditioned scene transitions via rectified flow in latent space and adds stability-aware trajectory selection, showing gains on nuScenes and NavSim without added inference cost.
Cross-benchmark analysis of 8 methods shows NAVSIM PDM Score correlates with Bench2Drive Driving Score at Spearman ρ=0.90, with Ego Progress as the strongest single predictor and a simpler 3-metric formula matching the full score.
citing papers explorer
-
DriveLaW:Unifying Planning and Video Generation in a Latent Driving World
DriveLaW unifies video world modeling and trajectory planning by injecting video-generator latents into a diffusion planner, achieving SOTA video prediction and a new record on the NAVSIM planning benchmark.
-
SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving
SpaceDrive integrates 3D positional encodings derived from depth and ego-states into VLMs, replacing digit tokens to improve spatial reasoning and trajectory regression in autonomous driving.
-
SimScale: Learning to Drive via Real-World Simulation at Scale
SimScale synthesizes unseen driving states from real logs via neural rendering and reactive environments, generates pseudo-expert trajectories, and shows that co-training on real plus simulated data improves planning robustness and generalization on real benchmarks, with gains scaling by simulation
-
DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving
DriveMoE applies scene-specialized Vision MoE and skill-specialized Action MoE to a VLA baseline to achieve SOTA closed-loop performance on Bench2Drive.
- DIVER: Reinforced Diffusion Breaks Imitation Bottlenecks in End-to-End Autonomous Driving