DriveSafer: End-to-End Autonomous Driving with Safety Guidance

Raj Rajkumar; Shounak Sural

arxiv: 2605.16737 · v1 · pith:UP6K5VDSnew · submitted 2026-05-16 · 💻 cs.RO · cs.CV


Pith reviewed 2026-05-19 21:39 UTC · model grok-4.3


keywords end-to-end autonomous driving · safety guidance · catastrophic failures · generative planners · NAVSIM benchmark · diffusion-based planning · physical constraints · inference-time guidance


The pith

A safety framework for end-to-end driving planners cuts catastrophic failures by 48 percent on the NAVSIM benchmark.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper targets the frequent catastrophic failures of modern generative end-to-end autonomous driving models. It observes that many of these failures come from violations of physical constraints and safety rules. To fix this, the authors introduce DriveSafer, a framework that adds explicit safety constraints during training and safety guidance during inference to steer planners toward safe trajectories. On the NAVSIM benchmark, this produces a 48 percent drop in cases with a PDM Score (PDMS) of zero and more than 65 percent fewer drivable-area compliance failures compared with the prior DiffusionDrive model. Readers would care because fewer outright unsafe plans could make autonomous vehicles more reliable in real traffic without needing to chase small gains in average performance.

Core claim

DriveSafer is a failure-aware safety framework that steers generative end-to-end planners toward safe behaviors by applying training-time safety constraints together with inference-time safety guidance. When tested against the DiffusionDrive baseline on NAVSIM, the method lowers the count of catastrophic failures (PDMS equal to zero) by 48 percent and reduces drivable-area compliance failures by more than 65 percent.

What carries the argument

The DriveSafer framework, which injects safety constraints at training time and safety guidance at inference time into existing generative planners.

If this is right

Generative planners can be made substantially safer by focusing training and inference on constraint violations rather than solely on average trajectory quality.
Reductions in drivable-area compliance failures exceed 65 percent when both training constraints and inference guidance are applied together.
The approach leaves open the possibility of combining safety guidance with other perception or prediction modules without retraining the entire planner from scratch.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar constraint-plus-guidance patterns could be tested on other generative planning tasks such as robot arm motion or drone navigation.
The method suggests that future benchmarks should report failure-mode breakdowns separately from mean scores so that safety gains are visible.
If the safety guidance scales to onboard hardware, it might allow lighter perception stacks while still meeting regulatory safety thresholds.

Load-bearing premise

Many catastrophic failures in current models stem specifically from violations of physical constraints and safety requirements, and adding targeted constraints plus guidance will reduce those failures without creating new failure modes.

What would settle it

Running the same models on additional benchmarks or real-world logs and checking whether the reduction in PDMS-zero cases holds while average planning quality and non-catastrophic error rates stay the same or improve.
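A check of this kind could be scripted roughly as follows. This is a minimal sketch, assuming per-scenario PDMS values in [0, 1] are available as plain lists; the function names and the example scores are hypothetical and are not taken from the paper or from the NAVSIM toolkit.

```python
from statistics import mean


def failure_summary(pdms_scores):
    """Summarize per-scenario PDMS values (assumed to lie in [0, 1])."""
    zeros = sum(1 for s in pdms_scores if s == 0.0)
    nonzero = [s for s in pdms_scores if s > 0.0]
    return {
        "n": len(pdms_scores),
        "catastrophic_count": zeros,
        "catastrophic_rate": zeros / len(pdms_scores),
        "mean_pdms": mean(pdms_scores),
        # Quality on scenarios that did not fail outright.
        "mean_pdms_nonzero": mean(nonzero) if nonzero else float("nan"),
    }


def relative_reduction(baseline_count, new_count):
    """Fractional drop in failure count relative to the baseline."""
    return (baseline_count - new_count) / baseline_count


# Hypothetical scores, made up for illustration only (not results from the paper).
baseline = [0.0, 0.0, 0.0, 0.0, 0.9, 0.8, 0.95, 0.85]
guided = [0.0, 0.0, 0.7, 0.75, 0.9, 0.8, 0.95, 0.85]

base_summary = failure_summary(baseline)
guided_summary = failure_summary(guided)
reduction = relative_reduction(
    base_summary["catastrophic_count"], guided_summary["catastrophic_count"]
)
print(f"PDMS=0 cases: {base_summary['catastrophic_count']} -> "
      f"{guided_summary['catastrophic_count']} ({reduction:.0%} fewer)")
print(f"mean PDMS on non-zero cases: {base_summary['mean_pdms_nonzero']:.3f} -> "
      f"{guided_summary['mean_pdms_nonzero']:.3f}")
```

Reporting the non-zero mean next to the failure count makes it visible when a drop in PDMS=0 cases is bought with lower quality on the remaining scenarios, as happens in the toy numbers here.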

Figures

Figures reproduced from arXiv: 2605.16737 by Raj Rajkumar, Shounak Sural.

**Figure 1.** Overview of our DriveSafer framework … benchmark as a catastrophic failure, since its occurrences can potentially be hazardous and life-threatening in real-world driving. A PDMS score of 0 is assigned in the following cases: (i) the ego vehicle collides with another moving object such as a vehicle or a pedestrian, (ii) the ego vehicle drives off the permitted drivable region possibly onto a sidewalk or a s…

**Figure 2.** A complex left-turn in NAVSIM [8]

**Figure 3.** PDM Score Distribution for DiffusionDrive showing …

**Figure 4.** Sample instances where DiffusionDrive predictions had a PDMS score of 0 but were fixed with …

**Figure 5.** Analysis of catastrophic failure cases (PDM Score = 0)

Original abstract

End-to-End (E2E) autonomous driving models have shown growing capability in recent years, with performance improving on increasingly challenging benchmarks. However, modern generative E2E planners still suffer from a substantial number of catastrophic failures in safety-critical scenarios. We find that many such failures arise from violations of physical constraints and safety requirements, leading to unsafe behavior. Motivated by this finding, in this paper, we focus on improving safety outcomes in generative end-to-end driving with a targeted reduction of catastrophic planning failures, instead of enhancing average planning quality. Towards this end, we propose DriveSafer, a failure-aware safety framework for end-to-end planners. DriveSafer explicitly steers generative planners towards safe behaviors leveraging both training-time safety constraints and inference-time safety guidance. Compared to the state-of-the-art DiffusionDrive model, on the NAVSIM benchmark, DriveSafer reduces the number of catastrophic failures (PDMS=0) by 48%, with over 65% reduction in drivable-area compliance failures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes DriveSafer, a failure-aware safety framework for generative end-to-end autonomous driving planners. It combines training-time safety constraints with inference-time safety guidance to steer models away from unsafe behaviors. On the NAVSIM benchmark, the method is reported to reduce catastrophic failures (PDMS=0) by 48% and drivable-area compliance failures by over 65% relative to the DiffusionDrive baseline.

Significance. If the reductions in catastrophic failures can be shown to occur without degrading mean performance on successful trajectories or introducing new failure modes such as excessive conservatism, the targeted safety focus would address a recognized weakness in current generative planners. The emphasis on failure reduction rather than average quality metrics is a constructive framing.

major comments (2)

Abstract: The headline claim of a 48% reduction in PDMS=0 cases and 65% reduction in drivable-area failures is presented without accompanying values for mean PDMS, collision rate conditional on PDMS>0, or route-completion rate. This omission leaves open whether the safety gains are achieved by shrinking the generative distribution toward conservative modes, which would undermine the claim that non-catastrophic performance is preserved.
Abstract and experimental description: No ablation studies, implementation details, or analysis of trade-offs are supplied to show the separate contributions of the training-time constraints and inference-time guidance, or to demonstrate that the method does not increase overly cautious failures invisible to the PDMS=0 metric.

minor comments (1)

The acronym PDMS is used without an explicit definition on first appearance in the abstract.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our work. We address the major comments point by point below and outline the revisions we plan to make to improve the manuscript.

Point-by-point responses

Referee: Abstract: The headline claim of a 48% reduction in PDMS=0 cases and 65% reduction in drivable-area failures is presented without accompanying values for mean PDMS, collision rate conditional on PDMS>0, or route-completion rate. This omission leaves open whether the safety gains are achieved by shrinking the generative distribution toward conservative modes, which would undermine the claim that non-catastrophic performance is preserved.

Authors: We appreciate the referee's concern regarding potential conservatism. Our approach is designed to steer away from unsafe behaviors without necessarily restricting the distribution to overly conservative modes. To address this, we will revise the abstract to include the mean PDMS score, the collision rate conditional on PDMS > 0, and the route-completion rate. In the experimental section, we will provide a more detailed comparison showing that performance on non-catastrophic trajectories remains competitive with the baseline.

revision: yes

Referee: Abstract and experimental description: No ablation studies, implementation details, or analysis of trade-offs are supplied to show the separate contributions of the training-time constraints and inference-time guidance, or to demonstrate that the method does not increase overly cautious failures invisible to the PDMS=0 metric.

Authors: We agree that additional studies are needed to fully validate the contributions. We will incorporate ablation experiments that isolate the impact of the training-time safety constraints from the inference-time guidance. We will also add analysis to check for increases in overly cautious behaviors, for example by reporting average planning speed and the rate of unnecessary stops on successful drives. Expanded implementation details will be provided to facilitate reproducibility.

revision: yes
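The two conservatism indicators named in this response could be computed along the following lines. This is only a sketch under stated assumptions: trajectories are lists of (x, y) waypoints sampled at a fixed interval `dt`, and an "unnecessary stop" is defined here, hypothetically, as the plan dropping below a speed threshold at a step where a reference trajectory (for example the logged human drive) keeps moving. The rebuttal does not give a definition.

```python
import math


def step_speeds(traj, dt):
    """Speed between consecutive (x, y) waypoints sampled every dt seconds."""
    return [
        math.hypot(x1 - x0, y1 - y0) / dt
        for (x0, y0), (x1, y1) in zip(traj, traj[1:])
    ]


def mean_speed(traj, dt):
    """Average planned speed over the horizon."""
    speeds = step_speeds(traj, dt)
    return sum(speeds) / len(speeds)


def has_unnecessary_stop(plan, reference, dt, stop_speed=0.5):
    """True if the plan is below stop_speed at a step where the reference is not.

    The threshold and the comparison against a reference are assumptions
    made for this illustration.
    """
    return any(
        p < stop_speed <= r
        for p, r in zip(step_speeds(plan, dt), step_speeds(reference, dt))
    )


# Hypothetical waypoints in metres, 0.5 s apart.
reference = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (6.0, 0.0)]
cautious = [(0.0, 0.0), (2.0, 0.0), (2.1, 0.0), (2.2, 0.0)]

print(mean_speed(reference, 0.5), mean_speed(cautious, 0.5))
print(has_unnecessary_stop(cautious, reference, 0.5))
```

Averaging `has_unnecessary_stop` over all scenarios with PDMS > 0 would give a stop rate that can be compared between the baseline and the guided planner.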

Circularity Check

0 steps flagged

No circularity: empirical gains on external benchmark rest on independent method and data


The paper introduces DriveSafer as a new framework that adds explicit training-time safety constraints and inference-time guidance to generative planners. Its headline result is a measured reduction in PDMS=0 failures versus the external DiffusionDrive baseline on the public NAVSIM benchmark. No equations, fitted parameters, or self-citations are shown that would make the reported improvement equivalent to the inputs by construction. The derivation of the safety mechanism is presented as an engineering addition whose effect is evaluated externally rather than redefined into the metric.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the framework implicitly relies on standard assumptions of end-to-end learning and benchmark validity.

pith-pipeline@v0.9.0 · 5702 in / 1072 out tokens · 34195 ms · 2026-05-19T21:39:31.016009+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: $L_{\text{total}} = L_{\text{base}} + \lambda_{\text{DAC}} L_{\text{DAC}} + \lambda_{\text{Col}} L_{\text{Col}} + \lambda_{\text{Comf}} L_{\text{Comf}}$ ... $L_{\text{DAC}}$ penalizes predicted trajectories from leaving the drivable area, $L_{\text{Col}}$ penalizes trajectories that come too close to surrounding agents

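The weighted sum in the quoted passage can be illustrated with a toy example. The sketch below is not the paper's implementation: the drivable area is assumed to be a straight corridor |y| <= half_width, agents are points, the comfort term is a squared second difference of waypoints, and all weights, thresholds, and function names are hypothetical. A real training loss would be written on differentiable tensors rather than Python floats.

```python
import math


def dac_penalty(traj, half_width):
    """Mean lateral distance by which waypoints leave the corridor |y| <= half_width."""
    return sum(max(0.0, abs(y) - half_width) for _, y in traj) / len(traj)


def collision_penalty(traj, agents, safe_dist):
    """Mean hinge penalty for waypoints closer than safe_dist to any agent."""
    total = 0.0
    for x, y in traj:
        for ax, ay in agents:
            total += max(0.0, safe_dist - math.hypot(x - ax, y - ay))
    return total / len(traj)


def comfort_penalty(traj):
    """Mean squared second difference of waypoints, a proxy for acceleration."""
    if len(traj) < 3:
        return 0.0
    terms = [
        (x2 - 2 * x1 + x0) ** 2 + (y2 - 2 * y1 + y0) ** 2
        for (x0, y0), (x1, y1), (x2, y2) in zip(traj, traj[1:], traj[2:])
    ]
    return sum(terms) / len(terms)


def total_loss(l_base, traj, agents, lam_dac=1.0, lam_col=1.0, lam_comf=0.1,
               half_width=2.0, safe_dist=1.5):
    """L_total = L_base + lam_DAC * L_DAC + lam_Col * L_Col + lam_Comf * L_Comf."""
    return (
        l_base
        + lam_dac * dac_penalty(traj, half_width)
        + lam_col * collision_penalty(traj, agents, safe_dist)
        + lam_comf * comfort_penalty(traj)
    )


# Hypothetical waypoints (metres) and one distant agent.
in_lane = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
drifting = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.5), (3.0, 4.0)]
agents = [(10.0, 0.0)]

print(total_loss(0.2, in_lane, agents))
print(total_loss(0.2, drifting, agents))
```

With the same base loss, the trajectory that drifts out of the corridor receives a strictly larger total, which is the behavior the added terms are meant to produce.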
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: We generate alternate trajectories that have a variation of 5% in heading, speed and sub-meter lateral trajectory shifts ... The safest trajectory is then chosen
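The perturb-and-select step described in this passage might look like the sketch below. Several details are assumptions rather than the paper's method: waypoints are in an ego-centered frame, the 5% heading and speed variations are applied as scale factors on each waypoint's bearing and range, the lateral shift is a fixed offset of 0.5 m, and safety is scored by a hypothetical corridor-and-proximity cost. Ties are broken in favor of the smallest change to the original plan.

```python
import math
from itertools import product


def perturb(traj, heading_scale, speed_scale, lateral_shift):
    """Scale each waypoint's bearing and range, then shift it laterally (in y)."""
    out = []
    for x, y in traj:
        r = math.hypot(x, y) * speed_scale
        theta = math.atan2(y, x) * heading_scale
        out.append((r * math.cos(theta), r * math.sin(theta) + lateral_shift))
    return out


def safety_cost(traj, half_width=2.0, agents=(), safe_dist=1.5):
    """Hypothetical cost: corridor violation plus proximity to point agents."""
    cost = 0.0
    for x, y in traj:
        cost += max(0.0, abs(y) - half_width)
        for ax, ay in agents:
            cost += max(0.0, safe_dist - math.hypot(x - ax, y - ay))
    return cost / len(traj)


def select_safest(traj, **cost_kwargs):
    """Return (params, trajectory) of the lowest-cost candidate."""
    scales = (0.95, 1.0, 1.05)   # +/- 5% variation
    shifts = (-0.5, 0.0, 0.5)    # sub-meter lateral shifts
    best = None
    for h, s, d in product(scales, scales, shifts):
        candidate = perturb(traj, h, s, d)
        deviation = abs(h - 1.0) + abs(s - 1.0) + abs(d)
        key = (safety_cost(candidate, **cost_kwargs), deviation)
        if best is None or key < best[0]:
            best = (key, (h, s, d), candidate)
    return best[1], best[2]


# Hypothetical plan that drifts past the corridor edge at y = 2.0.
base = [(1.0, 0.6), (2.0, 1.2), (3.0, 1.8), (4.0, 2.4)]
best_params, best_traj = select_safest(base)
print(best_params, safety_cost(base), safety_cost(best_traj))
```

In this toy case the selected candidate keeps the original heading and speed and only shifts the plan half a meter back into the corridor.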

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 8 internal anchors

[1] Jean Pierre Allamaa, Panagiotis Patrinos, Toshiyuki Ohtsuka, and Tong Duy Son. Real-time mpc with control barrier functions for autonomous driving using safety enhanced collocation. IFAC-PapersOnLine, 58(18):392–399, 2024.
[2] Holger Caesar, Juraj Kabzan, Kok Seang Tan, Whye Kit Fong, Eric Wolff, Alex Lang, Luke Fletcher, Oscar Beijbom, and Sammy Omari. nuplan: A closed-loop ml-based planning benchmark for autonomous vehicles. arXiv preprint arXiv:2106.11810, 2021.
[3] Wei Cao, Marcel Hallgarten, Tianyu Li, Daniel Dauner, Xunjiang Gu, Caojun Wang, Yakov Miron, Marco Aiello, Hongyang Li, Igor Gilitschenski, et al. Pseudo-simulation for autonomous driving. arXiv preprint arXiv:2506.04218,
[4] Dian Chen, Brady Zhou, Vladlen Koltun, and Philipp Krähenbühl. Learning by cheating. In Conference on robot learning, pages 66–75. PMLR, 2020.
[5] Shaoyu Chen, Bo Jiang, Hao Gao, Bencheng Liao, Qing Xu, Qian Zhang, Chang Huang, Wenyu Liu, and Xinggang Wang. Vadv2: End-to-end vectorized autonomous driving via probabilistic planning. arXiv preprint arXiv:2402.13243,
[6] Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, and Andreas Geiger. Transfuser: Imitation with transformer-based sensor fusion for autonomous driving. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11):12878–12895, 2023.
[7] Felipe Codevilla, Matthias Müller, Antonio López, Vladlen Koltun, and Alexey Dosovitskiy. End-to-end driving via conditional imitation learning. In 2018 IEEE international conference on robotics and automation (ICRA), pages 4693–
[8] Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, et al. Navsim: Data-driven non-reactive autonomous vehicle simulation and benchmarking. Advances in Neural Information Processing Systems, 37:28706–28719, 2024.
[9] Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. Carla: An open urban driving simulator. In Conference on robot learning, pages 1–16. PMLR, 2017.
[10] Haoyu Fu, Diankun Zhang, Zongchuang Zhao, Jianfeng Cui, Dingkang Liang, Chong Zhang, Dingyuan Zhang, Hongwei Xie, Bing Wang, and Xiang Bai. Orion: A holistic end-to-end autonomous driving framework by vision-language instructed action generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 24823–24834, 2025.
[11] Yihan Hu, Jiazhi Yang, Li Chen, Keyu Li, Chonghao Sima, Xizhou Zhu, Siqi Chai, Senyao Du, Tianwei Lin, Wenhai Wang, et al. Planning-oriented autonomous driving. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 17853–17862, 2023.
[12] Bernhard Jaeger, Daniel Dauner, Jens Beißwenger, Simon Gerstenecker, Kashyap Chitta, and Andreas Geiger. Carl: Learning scalable planning policies with simple rewards. arXiv preprint arXiv:2504.17838, 2025.
[13] Xiaosong Jia, Yulu Gao, Li Chen, Junchi Yan, Patrick Langechuan Liu, and Hongyang Li. Driveadapter: Breaking the coupling barrier of perception and planning in end-to-end autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7953–7963, 2023.
[14] Xiaosong Jia, Zhenjie Yang, Qifeng Li, Zhiyuan Zhang, and Junchi Yan. Bench2drive: Towards multi-ability benchmarking of closed-loop end-to-end autonomous driving. Advances in Neural Information Processing Systems, 37:819–844, 2024.
[15] Bo Jiang, Shaoyu Chen, Qing Xu, Bencheng Liao, Jiajie Chen, Helong Zhou, Qian Zhang, Wenyu Liu, Chang Huang, and Xinggang Wang. Vad: Vectorized scene representation for efficient autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8340–8350, 2023.
[16] Rohit Konda, Eric Squires, Pietro Pierpaoli, Magnus Egerstedt, and Samuel Coogan. Provably-safe autonomous navigation of traffic circles. In 2019 IEEE Conference on control technology and applications (CCTA), pages 876–881. IEEE,
[17] Derun Li, Jianwei Ren, Yue Wang, Xin Wen, Pengxiang Li, Leimeng Xu, Kun Zhan, Zhongpu Xia, Peng Jia, Xianpeng Lang, et al. Finetuning generative trajectory model with reinforcement learning from human feedback. arXiv e-prints, pages arXiv–2503, 2025.
[18] Qifeng Li, Xiaosong Jia, Shaobo Wang, and Junchi Yan. Think2drive: Efficient reinforcement learning by thinking with latent world model for autonomous driving (in carla-v2). In European conference on computer vision, pages 142–
[19] Yingyan Li, Lue Fan, Jiawei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang, and Tieniu Tan. Enhancing end-to-end autonomous driving with latent world model. arXiv preprint arXiv:2406.08481, 2024.
[20] Yingyan Li, Yuqi Wang, Yang Liu, Jiawei He, Lue Fan, and Zhaoxiang Zhang. End-to-end driving with online trajectory evaluation via bev world model. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 27137–27146, 2025.
[21] Zhenxin Li, Kailin Li, Shihao Wang, Shiyi Lan, Zhiding Yu, Yishen Ji, Zhiqi Li, Ziyue Zhu, Jan Kautz, Zuxuan Wu, et al. Hydra-mdp: End-to-end multimodal planning with multi-target hydra-distillation. arXiv preprint arXiv:2406.06978,
[22] Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, and Jose M. Alvarez. Is ego status all you need for open-loop end-to-end autonomous driving? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14864–14873, 2024.
[23] Bencheng Liao, Shaoyu Chen, Haoran Yin, Bo Jiang, Cheng Wang, Sixu Yan, Xinbang Zhang, Xiangyu Li, Ying Zhang, Qian Zhang, et al. Diffusiondrive: Truncated diffusion model for end-to-end autonomous driving. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 12037–12047, 2025.
[24] Long Nguyen, Micha Fauth, Bernhard Jaeger, Daniel Dauner, Maximilian Igl, Andreas Geiger, and Kashyap Chitta. Lead: Minimizing learner-expert asymmetry in end-to-end driving. In Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
[25] Katrin Renz, Long Chen, Elahe Arani, and Oleg Sinavski. Simlingo: Vision-only closed-loop autonomous driving with language-action alignment. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 11993–12003, 2025.
[26] Shuyao Shang, Yuntao Chen, Yuqi Wang, Yingyan Li, and Zhaoxiang Zhang. Drivedpo: Policy learning via safety dpo for end-to-end autonomous driving. arXiv preprint arXiv:2509.17940, 2025.
[27] Chen Shi, Shaoshuai Shi, Kehua Sheng, Bo Zhang, and Li Jiang. Drivex: Omni scene modeling for learning generalizable world knowledge in autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 28599–28609, 2025.
[28] Shounak Sural and Ragunathan Rajkumar. Bevmapmatch: Multimodal bev neural map matching for robust re-localization of autonomous vehicles. arXiv preprint arXiv:2603.25963, 2026.
[29] Yingqi Tang, Zhuoran Xu, Zhaotie Meng, and Erkang Cheng. Hip-ad: Hierarchical and multi-granularity planning with deformable attention for autonomous driving in a single decoder. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 25605–25615, 2025.
[30] Royden Wagner, Omer Sahin Tas, Jaime Villa, Felix Hauser, Yinzhe Shen, Marlon Steiner, Dominik Strutz, Carlos Fernandez, Christian Kinzig, Guillermo S Guitierrez-Cabello, et al. Longtail driving scenarios with reasoning traces: The kitscenes longtail dataset. arXiv preprint arXiv:2603.23607,
[31] Xiaofeng Wang, Zheng Zhu, Guan Huang, Xinze Chen, Jiagang Zhu, and Jiwen Lu. Drivedreamer: Towards real-world-drive world models for autonomous driving. In European conference on computer vision, pages 55–72. Springer, 2024.
[32] Yuqi Wang, Jiawei He, Lue Fan, Hongxin Li, Yuntao Chen, and Zhaoxiang Zhang. Driving into the future: Multiview visual forecasting and planning with world model for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14749–14759, 2024.
[33] Yan Wang, Wenjie Luo, Junjie Bai, Yulong Cao, Tong Che, Ke Chen, Yuxiao Chen, Jenna Diamond, Yifan Ding, Wenhao Ding, et al. Alpamayo-r1: Bridging reasoning and action prediction for generalizable autonomous driving in the long tail. arXiv preprint arXiv:2511.00088, 2025.
[34] Xinshuo Weng, Boris Ivanovic, Yan Wang, Yue Wang, and Marco Pavone. Para-drive: Parallelized architecture for real-time autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15449–15458, 2024.
[35] Hang Wu, Zhenghao Zhang, Siyuan Lin, Xiangru Mu, Qiang Zhao, Ming Yang, and Tong Qin. Maplocnet: Coarse-to-fine feature registration for visual re-localization in navigation maps. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 13198–13205. IEEE, 2024.
[36] Jianye Xu, Chang Che, and Bassam Alrifaee. A real-time control barrier function-based safety filter for motion planning with arbitrary road boundary constraints. arXiv preprint arXiv:2505.02395, 2025.
[37] Runsheng Xu, Hubert Lin, Wonseok Jeon, Hao Feng, Yuliang Zou, Liting Sun, John Gorman, Ekaterina Tolstaya, Sarah Tang, Brandyn White, et al. Wod-e2e: Waymo open dataset for end-to-end driving in challenging long-tail scenarios. arXiv preprint arXiv:2510.26125, 2025.
[38] Chengran Yuan, Zhanqi Zhang, Jiawei Sun, Shuo Sun, Zefan Huang, Christina Dao Wen Lee, Dongen Li, Yuhang Han, Anthony Wong, Keng Peng Tee, et al. Drama: An efficient end-to-end motion planner for autonomous driving with mamba. arXiv preprint arXiv:2408.03601, 2024.
[39] Jiang-Tian Zhai, Ze Feng, Jinhao Du, Yongqiang Mao, Jiang-Jiang Liu, Zichang Tan, Yifu Zhang, Xiaoqing Ye, and Jingdong Wang. Rethinking the open-loop evaluation of end-to-end autonomous driving in nuscenes. arXiv preprint arXiv:2305.10430, 2023.
[40] Dapeng Zhang, Fei Shen, Rui Zhao, Yinda Chen, Peng Zhi, Chenyang Li, Rui Zhou, and Qingguo Zhou. Coc-vla: Delving into adversarial domain transfer for explainable autonomous driving via chain-of-causality visual-language-action model. arXiv preprint arXiv:2511.19914, 2025.
[41] Linrui Zhang, Zhenghao Peng, Quanyi Li, and Bolei Zhou. Cat: Closed-loop adversarial training for safe end-to-end driving. In Conference on Robot Learning, pages 2357–
[42] Wenzhao Zheng, Weiliang Chen, Yuanhui Huang, Borui Zhang, Yueqi Duan, and Jiwen Lu. Occworld: Learning a 3d occupancy world model for autonomous driving. In European conference on computer vision, pages 55–72. Springer,
[43] Yinan Zheng, Ruiming Liang, Kexin Zheng, Jinliang Zheng, Liyuan Mao, Jianxiong Li, Weihao Gu, Rui Ai, Shengbo Eben Li, Xianyuan Zhan, et al. Diffusion-based planning for autonomous driving with flexible guidance. arXiv preprint arXiv:2501.15564, 2025.
[44] Zewei Zhou, Tianhui Cai, Seth Z Zhao, Yun Zhang, Zhiyu Huang, Bolei Zhou, and Jiaqi Ma. Autovla: A vision-language-action model for end-to-end autonomous driving with adaptive reasoning and reinforcement fine-tuning. arXiv preprint arXiv:2506.13757, 2025.

[1] [1]

Real-time mpc with control barrier functions for autonomous driving using safety enhanced col- location.IFAC-PapersOnLine, 58(18):392–399, 2024

Jean Pierre Allamaa, Panagiotis Patrinos, Toshiyuki Oht- suka, and Tong Duy Son. Real-time mpc with control barrier functions for autonomous driving using safety enhanced col- location.IFAC-PapersOnLine, 58(18):392–399, 2024. 2

work page 2024

[2] [2]

NuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles

Holger Caesar, Juraj Kabzan, Kok Seang Tan, Whye Kit Fong, Eric Wolff, Alex Lang, Luke Fletcher, Oscar Beijbom, and Sammy Omari. nuplan: A closed-loop ml-based plan- ning benchmark for autonomous vehicles.arXiv preprint arXiv:2106.11810, 2021. 1

work page internal anchor Pith review Pith/arXiv arXiv 2021

[3] [3]

Pseudo-simulation for autonomous driving.arXiv preprint arXiv:2506.04218,

Wei Cao, Marcel Hallgarten, Tianyu Li, Daniel Dauner, Xunjiang Gu, Caojun Wang, Yakov Miron, Marco Aiello, Hongyang Li, Igor Gilitschenski, et al. Pseudo-simulation for autonomous driving.arXiv preprint arXiv:2506.04218,

work page arXiv

[4] [4]

Learning by cheating

Dian Chen, Brady Zhou, Vladlen Koltun, and Philipp Kr¨ahenb¨uhl. Learning by cheating. InConference on robot learning, pages 66–75. PMLR, 2020. 2

work page 2020

[5] [5]

VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning

Shaoyu Chen, Bo Jiang, Hao Gao, Bencheng Liao, Qing Xu, Qian Zhang, Chang Huang, Wenyu Liu, and Xinggang Wang. Vadv2: End-to-end vectorized autonomous driving via probabilistic planning.arXiv preprint arXiv:2402.13243,

work page internal anchor Pith review Pith/arXiv arXiv

[6] [6]

Transfuser: Imitation with transformer-based sensor fusion for autonomous driv- ing.IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11):12878–12895, 2023

Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, and Andreas Geiger. Transfuser: Imitation with transformer-based sensor fusion for autonomous driv- ing.IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11):12878–12895, 2023. 1, 2, 3

work page 2023

[7] [7]

End-to-end driving via conditional imitation learning

Felipe Codevilla, Matthias M ¨uller, Antonio L ´opez, Vladlen Koltun, and Alexey Dosovitskiy. End-to-end driving via conditional imitation learning. In2018 IEEE international conference on robotics and automation (ICRA), pages 4693–

work page

[8] Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, et al. Navsim: Data-driven non-reactive autonomous vehicle simulation and benchmarking. Advances in Neural Information Processing Systems, 37:28706–28719, 2024.

[9] Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. Carla: An open urban driving simulator. In Conference on robot learning, pages 1–16. PMLR, 2017.

[10] Haoyu Fu, Diankun Zhang, Zongchuang Zhao, Jianfeng Cui, Dingkang Liang, Chong Zhang, Dingyuan Zhang, Hongwei Xie, Bing Wang, and Xiang Bai. Orion: A holistic end-to-end autonomous driving framework by vision-language instructed action generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 24823–24834, 2025.

[11] Yihan Hu, Jiazhi Yang, Li Chen, Keyu Li, Chonghao Sima, Xizhou Zhu, Siqi Chai, Senyao Du, Tianwei Lin, Wenhai Wang, et al. Planning-oriented autonomous driving. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 17853–17862, 2023.

[12] Bernhard Jaeger, Daniel Dauner, Jens Beißwenger, Simon Gerstenecker, Kashyap Chitta, and Andreas Geiger. Carl: Learning scalable planning policies with simple rewards. arXiv preprint arXiv:2504.17838, 2025.

[13] Xiaosong Jia, Yulu Gao, Li Chen, Junchi Yan, Patrick Langechuan Liu, and Hongyang Li. Driveadapter: Breaking the coupling barrier of perception and planning in end-to-end autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7953–7963, 2023.

[14] Xiaosong Jia, Zhenjie Yang, Qifeng Li, Zhiyuan Zhang, and Junchi Yan. Bench2drive: Towards multi-ability benchmarking of closed-loop end-to-end autonomous driving. Advances in Neural Information Processing Systems, 37:819–844, 2024.

[15] Bo Jiang, Shaoyu Chen, Qing Xu, Bencheng Liao, Jiajie Chen, Helong Zhou, Qian Zhang, Wenyu Liu, Chang Huang, and Xinggang Wang. Vad: Vectorized scene representation for efficient autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8340–8350, 2023.

[16] Rohit Konda, Eric Squires, Pietro Pierpaoli, Magnus Egerstedt, and Samuel Coogan. Provably-safe autonomous navigation of traffic circles. In 2019 IEEE Conference on control technology and applications (CCTA), pages 876–881. IEEE,

[17] Derun Li, Jianwei Ren, Yue Wang, Xin Wen, Pengxiang Li, Leimeng Xu, Kun Zhan, Zhongpu Xia, Peng Jia, Xianpeng Lang, et al. Finetuning generative trajectory model with reinforcement learning from human feedback. arXiv e-prints, pages arXiv–2503, 2025.

[18] Qifeng Li, Xiaosong Jia, Shaobo Wang, and Junchi Yan. Think2drive: Efficient reinforcement learning by thinking with latent world model for autonomous driving (in carla-v2). In European conference on computer vision, pages 142–

[19] Yingyan Li, Lue Fan, Jiawei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang, and Tieniu Tan. Enhancing end-to-end autonomous driving with latent world model. arXiv preprint arXiv:2406.08481, 2024.

[20] Yingyan Li, Yuqi Wang, Yang Liu, Jiawei He, Lue Fan, and Zhaoxiang Zhang. End-to-end driving with online trajectory evaluation via bev world model. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 27137–27146, 2025.

[21] Zhenxin Li, Kailin Li, Shihao Wang, Shiyi Lan, Zhiding Yu, Yishen Ji, Zhiqi Li, Ziyue Zhu, Jan Kautz, Zuxuan Wu, et al. Hydra-mdp: End-to-end multimodal planning with multi-target hydra-distillation. arXiv preprint arXiv:2406.06978,

[22] Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, and Jose M. Alvarez. Is ego status all you need for open-loop end-to-end autonomous driving? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 14864–14873, 2024.

[23] Bencheng Liao, Shaoyu Chen, Haoran Yin, Bo Jiang, Cheng Wang, Sixu Yan, Xinbang Zhang, Xiangyu Li, Ying Zhang, Qian Zhang, et al. Diffusiondrive: Truncated diffusion model for end-to-end autonomous driving. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 12037–12047, 2025.

[24] Long Nguyen, Micha Fauth, Bernhard Jaeger, Daniel Dauner, Maximilian Igl, Andreas Geiger, and Kashyap Chitta. Lead: Minimizing learner-expert asymmetry in end-to-end driving. In Conference on Computer Vision and Pattern Recognition (CVPR), 2026.

[25] Katrin Renz, Long Chen, Elahe Arani, and Oleg Sinavski. Simlingo: Vision-only closed-loop autonomous driving with language-action alignment. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 11993–12003, 2025.

[26] Shuyao Shang, Yuntao Chen, Yuqi Wang, Yingyan Li, and Zhaoxiang Zhang. Drivedpo: Policy learning via safety dpo for end-to-end autonomous driving. arXiv preprint arXiv:2509.17940, 2025.

[27] Chen Shi, Shaoshuai Shi, Kehua Sheng, Bo Zhang, and Li Jiang. Drivex: Omni scene modeling for learning generalizable world knowledge in autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 28599–28609, 2025.

[28] Shounak Sural and Ragunathan Rajkumar. Bevmapmatch: Multimodal bev neural map matching for robust re-localization of autonomous vehicles. arXiv preprint arXiv:2603.25963, 2026.

[29] Yingqi Tang, Zhuoran Xu, Zhaotie Meng, and Erkang Cheng. Hip-ad: Hierarchical and multi-granularity planning with deformable attention for autonomous driving in a single decoder. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 25605–25615, 2025.

[30] Royden Wagner, Omer Sahin Tas, Jaime Villa, Felix Hauser, Yinzhe Shen, Marlon Steiner, Dominik Strutz, Carlos Fernandez, Christian Kinzig, Guillermo S Guitierrez-Cabello, et al. Longtail driving scenarios with reasoning traces: The kitscenes longtail dataset. arXiv preprint arXiv:2603.23607,

[31] Xiaofeng Wang, Zheng Zhu, Guan Huang, Xinze Chen, Jiagang Zhu, and Jiwen Lu. Drivedreamer: Towards real-world-drive world models for autonomous driving. In European conference on computer vision, pages 55–72. Springer, 2024.

[32] Yuqi Wang, Jiawei He, Lue Fan, Hongxin Li, Yuntao Chen, and Zhaoxiang Zhang. Driving into the future: Multiview visual forecasting and planning with world model for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14749–14759, 2024.

[33] Yan Wang, Wenjie Luo, Junjie Bai, Yulong Cao, Tong Che, Ke Chen, Yuxiao Chen, Jenna Diamond, Yifan Ding, Wenhao Ding, et al. Alpamayo-r1: Bridging reasoning and action prediction for generalizable autonomous driving in the long tail. arXiv preprint arXiv:2511.00088, 2025.

[34] Xinshuo Weng, Boris Ivanovic, Yan Wang, Yue Wang, and Marco Pavone. Para-drive: Parallelized architecture for real-time autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15449–15458, 2024.

[35] Hang Wu, Zhenghao Zhang, Siyuan Lin, Xiangru Mu, Qiang Zhao, Ming Yang, and Tong Qin. Maplocnet: Coarse-to-fine feature registration for visual re-localization in navigation maps. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 13198–13205. IEEE, 2024.

[36] Jianye Xu, Chang Che, and Bassam Alrifaee. A real-time control barrier function-based safety filter for motion planning with arbitrary road boundary constraints. arXiv preprint arXiv:2505.02395, 2025.

[37] Runsheng Xu, Hubert Lin, Wonseok Jeon, Hao Feng, Yuliang Zou, Liting Sun, John Gorman, Ekaterina Tolstaya, Sarah Tang, Brandyn White, et al. Wod-e2e: Waymo open dataset for end-to-end driving in challenging long-tail scenarios. arXiv preprint arXiv:2510.26125, 2025.

[38] Chengran Yuan, Zhanqi Zhang, Jiawei Sun, Shuo Sun, Zefan Huang, Christina Dao Wen Lee, Dongen Li, Yuhang Han, Anthony Wong, Keng Peng Tee, et al. Drama: An efficient end-to-end motion planner for autonomous driving with mamba. arXiv preprint arXiv:2408.03601, 2024.

[39] Jiang-Tian Zhai, Ze Feng, Jinhao Du, Yongqiang Mao, Jiang-Jiang Liu, Zichang Tan, Yifu Zhang, Xiaoqing Ye, and Jingdong Wang. Rethinking the open-loop evaluation of end-to-end autonomous driving in nuscenes. arXiv preprint arXiv:2305.10430, 2023.

[40] Dapeng Zhang, Fei Shen, Rui Zhao, Yinda Chen, Peng Zhi, Chenyang Li, Rui Zhou, and Qingguo Zhou. Coc-vla: Delving into adversarial domain transfer for explainable autonomous driving via chain-of-causality visual-language-action model. arXiv preprint arXiv:2511.19914, 2025.

[41] Linrui Zhang, Zhenghao Peng, Quanyi Li, and Bolei Zhou. Cat: Closed-loop adversarial training for safe end-to-end driving. In Conference on Robot Learning, pages 2357–

[42] Wenzhao Zheng, Weiliang Chen, Yuanhui Huang, Borui Zhang, Yueqi Duan, and Jiwen Lu. Occworld: Learning a 3d occupancy world model for autonomous driving. In European conference on computer vision, pages 55–72. Springer,

[43] Yinan Zheng, Ruiming Liang, Kexin Zheng, Jinliang Zheng, Liyuan Mao, Jianxiong Li, Weihao Gu, Rui Ai, Shengbo Eben Li, Xianyuan Zhan, et al. Diffusion-based planning for autonomous driving with flexible guidance. arXiv preprint arXiv:2501.15564, 2025.

[44] Zewei Zhou, Tianhui Cai, Seth Z Zhao, Yun Zhang, Zhiyu Huang, Bolei Zhou, and Jiaqi Ma. Autovla: A vision-language-action model for end-to-end autonomous driving with adaptive reasoning and reinforcement fine-tuning. arXiv preprint arXiv:2506.13757, 2025.