LAP: Fast LAtent Diffusion Planner for Autonomous Driving

Haoming Song; Jie Mei; Jinhao Zhang; Wenlong Xia; Youmin Gong; Zhexuan Zhou

arxiv: 2512.00470 · v4 · pith:FO3UGST5new · submitted 2025-11-29 · 💻 cs.RO

Jinhao Zhang, Wenlong Xia, Zhexuan Zhou, Haoming Song, Youmin Gong, Jie Mei

classification 💻 cs.RO

keywords driving · high-level · latent · planner · autonomous · diffusion · kinematics · low-level

Diffusion models have demonstrated strong capabilities for modeling human-like driving behaviors in autonomous driving, but they have two limitations: their iterative sampling process induces substantial latency, and operating directly on raw trajectory points forces the model to spend capacity on low-level kinematics rather than on high-level multi-modal semantics. To address these limitations, we propose LAtent Planner (LAP), a framework that plans in a VAE-learned latent space that disentangles high-level intents from low-level kinematics, enabling the planner to capture rich, multi-modal driving strategies. To bridge the representational gap between the high-level semantic planning space and the vectorized scene context, we introduce an intermediate feature alignment mechanism that facilitates robust information fusion. Notably, LAP can produce high-quality plans in a single denoising step, substantially reducing computational overhead. In extensive evaluations on the large-scale nuPlan benchmark, LAP achieves state-of-the-art closed-loop performance among learning-based planning methods while delivering an inference speed-up of up to 10x over previous SOTA approaches.
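The control flow the abstract describes (sample a noisy latent, denoise it in one step conditioned on scene context, decode it to a trajectory) can be illustrated with a toy sketch. This is not the authors' implementation: `toy_denoiser`, `toy_decoder`, `plan_single_step`, the latent size, and the horizon are all hypothetical stand-ins, and the learned networks are replaced by simple arithmetic so the example runs with the standard library alone.

```python
import random

LATENT_DIM = 4   # hypothetical latent size, chosen only for illustration
HORIZON = 8      # hypothetical number of future waypoints


def toy_denoiser(z_noisy, context):
    """Stand-in for a learned denoiser that predicts the clean latent directly.

    Assumption: it simply pulls the noisy latent toward the context vector.
    """
    return [0.1 * z + 0.9 * c for z, c in zip(z_noisy, context)]


def toy_decoder(z_clean):
    """Stand-in for a VAE decoder mapping a latent to (x, y) waypoints.

    Assumption: the first two latent entries act as speed and curvature.
    """
    speed, curvature = z_clean[0], z_clean[1]
    return [(speed * t, curvature * t * t) for t in range(1, HORIZON + 1)]


def plan_single_step(context, rng):
    """Sample a noisy latent, denoise it once, and decode it to a trajectory."""
    z_noisy = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    z_clean = toy_denoiser(z_noisy, context)  # one call, no iterative loop
    return toy_decoder(z_clean)


# Hypothetical scene-context embedding with the same size as the latent.
context = [2.0, 0.05, 0.0, 0.0]
trajectory = plan_single_step(context, random.Random(0))
print(len(trajectory), "waypoints")
```

An iterative diffusion sampler would call the denoiser once per step inside a loop; the sketch calls it once, which is the source of the latency reduction the abstract claims.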

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

EvoDriveVLA: Evolving Driving VLA Models via Collaborative Perception-Planning Distillation
cs.CV 2026-03 unverdicted novelty 5.0

EvoDriveVLA uses collaborative perception-planning distillation with self-anchor and future-aware teachers to fix perception degradation and long-term instability in driving VLA models, reaching SOTA on nuScenes and NAVSIM.