Using Ensemble Diffusion to Estimate Uncertainty for End-to-End Autonomous Driving

Florian Wintel; Frank Lindseth; Gabriel Kiss; Sigmund H. Høeg

arxiv: 2506.00560 · v2 · pith:PI6AVQD3new · submitted 2025-05-31 · 💻 cs.RO · cs.CV

Pith reviewed 2026-05-25 08:16 UTC · model grok-4.3

keywords: end-to-end autonomous driving · diffusion models · ensemble methods · uncertainty estimation · trajectory planning · CARLA simulator · LAV benchmark · multimodal prediction

The pith

Ensemble diffusion generates distributions of trajectories to model uncertainty in end-to-end autonomous driving.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces EnDfuser, an end-to-end driving system that replaces point-estimate trajectory planners with a diffusion model. It fuses camera and LiDAR features through attention pooling inside a diffusion transformer and, from one perception frame, samples 128 candidate trajectories via ensemble diffusion. This produces an explicit distribution that reveals multimodal and uncertain future paths. From the set of trajectories the authors derive a simple safety rule that raises the driving score by 1.7 percent on the LAV benchmark. The central argument is that ensemble diffusion can serve as a drop-in module for uncertainty-aware decision making in closed-loop driving policies.

Core claim

EnDfuser uses a diffusion transformer to combine perception fusion and trajectory planning. Instead of committing to one plan, the model draws 128 candidate trajectories from a single perception frame through ensemble diffusion. The resulting set of paths supplies interpretability for uncertain, multimodal trajectory spaces and supports a safety rule that improves the driving score by 1.7 percent on the LAV benchmark.

What carries the argument

Ensemble diffusion inside a diffusion transformer module that outputs a distribution of 128 trajectories from fused perception features.
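
The sampling idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `toy_denoise_step`, the `shrink` factor, the step count, and the 8-waypoint conditioning vector are all hypothetical stand-ins for the learned diffusion transformer and its fused perception features. Only the sample count of 128 comes from the paper.

```python
import random
import statistics


def toy_denoise_step(x, cond, shrink=0.7):
    """Hypothetical stand-in for one learned denoising step.

    Moves every waypoint a fixed fraction of the way toward the
    conditioning target; a real model would predict this update.
    """
    return [c + shrink * (xi - c) for xi, c in zip(x, cond)]


def sample_ensemble(cond, n_samples=128, n_steps=10, seed=0):
    """Denoise n_samples independent noise draws under the same conditioning."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_samples):
        # Each ensemble member starts from its own draw of the noise prior.
        x = [rng.gauss(0.0, 1.0) for _ in cond]
        for _ in range(n_steps):
            x = toy_denoise_step(x, cond)
        ensemble.append(x)
    return ensemble


def per_waypoint_variance(ensemble):
    """Population variance across ensemble members, one value per waypoint."""
    return [statistics.pvariance(column) for column in zip(*ensemble)]


# Assumed conditioning: one scalar target per waypoint (illustrative only).
cond = [0.5 * t for t in range(8)]
ensemble = sample_ensemble(cond)
variances = per_waypoint_variance(ensemble)
```

The point of the sketch is structural: one conditioning input, many noise draws, and a spread statistic computed over the resulting set. In the toy version the spread only reflects residual noise, whereas a trained model would be expected to produce wider or multimodal spread in ambiguous scenes.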

If this is right

The full set of candidate trajectories supplies interpretability for multimodal future spaces.
A safety rule can be designed directly from observed trajectory spread.
Ensemble diffusion can replace traditional point-estimate planners in end-to-end policies.
Uncertainty of the posterior trajectory distribution becomes available for downstream decision making.
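
A spread-based safety rule of the kind described above might look like the following sketch. The source does not specify the rule itself, so everything here is an assumption: the use of speed samples, the variance `threshold`, the `slow_factor`, and the choice of the median as the nominal action are hypothetical.

```python
import statistics


def apply_safety_rule(speed_samples, threshold=1.0, slow_factor=0.5):
    """Return a target speed, reduced when the ensemble disagrees.

    speed_samples: desired speeds, one per sampled trajectory.
    threshold and slow_factor are hypothetical tuning values.
    """
    nominal = statistics.median(speed_samples)
    spread = statistics.pvariance(speed_samples)
    if spread > threshold:
        # High disagreement between ensemble members: act conservatively.
        return nominal * slow_factor
    return nominal
```

With tightly clustered samples the nominal speed passes through unchanged; with a bimodal set (for example, half the ensemble braking and half proceeding) the rule lowers the commanded speed.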

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same sampling approach could be applied to other sensor suites or non-driving control tasks that require multimodal predictions.
The 128-sample distribution might support probabilistic collision checks or risk-aware planning beyond the simple rule tested.
If the sampled trajectories prove well-calibrated, the method could reduce reliance on hand-crafted uncertainty modules in other autonomous systems.
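
As an illustration of the probabilistic collision check mentioned above, one simple option is to count the fraction of sampled trajectories that pass near an obstacle. This is an editorial sketch, not something the paper evaluates; the point-obstacle model, the `radius` parameter, and the function name are assumptions.

```python
import math


def collision_probability(trajectories, obstacle, radius):
    """Fraction of sampled trajectories with any waypoint within radius of obstacle.

    trajectories: list of trajectories, each a list of (x, y) waypoints.
    obstacle: (x, y) position of a hypothetical point obstacle.
    """
    if not trajectories:
        raise ValueError("need at least one trajectory")
    hits = 0
    for traj in trajectories:
        if any(math.hypot(x - obstacle[0], y - obstacle[1]) <= radius
               for x, y in traj):
            hits += 1
    return hits / len(trajectories)
```

The estimate is only as good as the calibration of the sampled distribution, which is the open question raised in the load-bearing premise below.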

Load-bearing premise

The trajectories sampled by the ensemble diffusion model accurately reflect real-world uncertainty and the safety rule derived from them improves safety without creating new failure modes.

What would settle it

A closed-loop test in which the safety rule based on trajectory spread produces lower driving scores or additional collisions compared with the baseline point-estimate planner.

Figures

Figures reproduced from arXiv: 2506.00560 by Florian Wintel, Frank Lindseth, Gabriel Kiss, Sigmund H. Høeg.

**Figure 1.** EnDfuser architecture. (a) The TransFuser++ perception backbone consumes two modalities: RGB images from the ego perspective and a LiDAR bird's-eye-view (BEV) image. Transformer-based sensor fusion is performed between the two convolutional branches, after which four auxiliary perception tasks are learned (BEV segmentation, BEV object detection, ego perspective depth estimation and ego perspective segmenta…

**Figure 2.** We apply attention pooling on the BEV features. (a) TF++ WP relies on learned queries and GRUs. (b) We accomplish similar attention pooling by creating individual waypoint queries that each sample from the noise prior. As we can sample from the noise prior N times, we can denoise an arbitrary number of plans for any given perception frame. … three TF++ WP instances. Since we completely replace the planning m…

**Figure 3.** Uncertainty map in Town02. Areas with a regular occurrence of variance spikes are clearly visible around intersections and bends. Each town displays the variances of 18 cumulative episodes driven by EnDfuser, downsampled to 2 Hz and color coded from low variance ◦ to high variance • in the speed predictions. 4. Experiments and comparison: We evaluate our agent on the Longest6 benchmark in CARLA …

**Figure 4.** Categories of uncertain situations. The majority of uncertainty spikes coincide directly with traffic interactions. We investigate the agent's context in the 100 least certain situations by recording the sensory input of a full Longest6 evaluation (36 episodes) and extracting the 100 sequences with the highest variance values $\hat{\sigma}^2(K^{\text{spd}}_t)$. 5. Discussion: In the following section, we discuss the observed…

**Figure 5.** High variance situations. X and Y components represent the posterior trajectory sample $T_t$; desired speed and yaw angle represent $K_t$. The selected action is marked in magenta. Most instances of high variance are interactions with dynamic objects, either other agents (a) or traffic signals (b). We attribute such instances to aleatoric uncertainty due to the unpredictable nature of other agents. As $T_t$ represen…

**Figure 6.** Pre-crash condition. We observe an uncertainty spike before a collision occurs. The ego vehicle is in the process of overshooting into the leftmost lane, while another car is approaching fast from behind, leading to a collision. (a) TP • (b) $T_t$

**Figure 7.** Label noise. (a) The prediction horizon extends beyond the target point, forcing the agent to predict positions for which it has no driving instruction. (b) This results in lateral uncertainty in the posterior sample $T_t$. Choosing a different transformation operation, such a multimodal prediction could cause erratic driving behavior. The observation may further offer an explanation why giving the agent two …

**Figure 8.** Failure to predict. (a) EnDfuser changes lanes while taking a right turn. It ignores the vehicle to its right and causes a collision. (b) No spike in uncertainty is detectable. … addition, EnDfuser is not equipped to distinguish between aleatoric and epistemic uncertainty, since it only produces first-order candidate distributions. Finally, we do not compare ourselves to agents that only target newer, more …

Original abstract

End-to-end planning systems for autonomous driving are rapidly improving, especially in closed-loop simulation environments like CARLA. Many such driving systems either do not consider uncertainty as part of the plan itself or obtain it by using specialized representations that do not generalize. In this paper, we propose EnDfuser, an end-to-end driving system that uses a diffusion model as the trajectory planner. EnDfuser effectively leverages complex perception information like fused camera and LiDAR features, through combining attention pooling and trajectory planning into a single diffusion transformer module. Instead of committing to a single plan, EnDfuser produces a distribution of candidate trajectories (128 for our case) from a single perception frame through ensemble diffusion. By observing the full set of candidate trajectories, EnDfuser provides interpretability for uncertain, multimodal future trajectory spaces. Using this information we design a simplistic safety-rule that improves the system's driving score by 1.7% on the LAV benchmark. Our findings suggest that ensemble diffusion, used as a drop-in replacement for traditional point-estimate trajectory planning modules, can contribute to an uncertainty-aware decision making process in End-to-End driving policies by modeling the uncertainty of the posterior trajectory distribution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

EnDfuser puts ensemble diffusion inside a fused perception-planning transformer and gets a 1.7% LAV bump from a simple safety rule on the 128 trajectories, but supplies no check that those trajectories track real posterior uncertainty.

The paper's concrete move is to replace a point-estimate planner with an ensemble diffusion module inside a single attention-pooling transformer that ingests fused camera and LiDAR features. It outputs 128 trajectories per frame and feeds their spread into a hand-crafted safety rule. That combination is not described in the cited prior work, and the 1.7% driving-score lift on LAV is the only quantitative result given in the abstract.

Referee Report

2 major / 2 minor

Summary. The paper proposes EnDfuser, an end-to-end autonomous driving policy that replaces a point-estimate trajectory planner with an ensemble diffusion transformer operating on fused camera-LiDAR features. From a single perception frame the model samples 128 trajectories; a hand-crafted safety rule derived from this set is reported to raise the driving score by 1.7% on the LAV benchmark. The central claim is that the sampled distribution models posterior trajectory uncertainty and thereby enables more interpretable, uncertainty-aware decision making.

Significance. If the empirical link between the diffusion ensemble and posterior uncertainty were demonstrated, the work would supply a practical drop-in module for uncertainty estimation inside existing end-to-end stacks. The 1.7% gain is modest, however, and the absence of calibration diagnostics or comparisons to other uncertainty estimators limits the immediate significance for safety-critical deployment.

major comments (2)

[Abstract and §4 (Experiments)] The reported 1.7% LAV improvement is stated without any baseline description, statistical significance test, data-split protocol, or ablation that isolates the contribution of the ensemble diffusion component versus the safety rule itself.
[§3.2 and §4] The assertion that the 128 sampled trajectories “model the uncertainty of the posterior trajectory distribution” is unsupported by any calibration result (e.g., predicted variance versus realized error on held-out data) or comparison against alternative uncertainty estimators; without such evidence the safety-rule benefit cannot be attributed to posterior modeling rather than to the learned data distribution.
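
The calibration diagnostic named in the second comment (predicted variance versus realized error) can be sketched as a binned comparison. This is one possible form, under assumptions: equal-count bins, squared error as the realized-error measure, and the function name `calibration_table` are illustrative choices, not something the paper or the referee specifies.

```python
def calibration_table(pred_vars, sq_errors, n_bins=4):
    """Bin samples by predicted variance and compare with mean squared error.

    pred_vars: predicted (ensemble) variance per held-out sample.
    sq_errors: realized squared error for the same samples.
    Returns a list of (mean predicted variance, mean squared error) per bin.
    """
    if len(pred_vars) != len(sq_errors):
        raise ValueError("inputs must have the same length")
    if len(pred_vars) < n_bins:
        raise ValueError("need at least one sample per bin")
    pairs = sorted(zip(pred_vars, sq_errors))
    size = len(pairs) // n_bins
    rows = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any remainder.
        end = (b + 1) * size if b < n_bins - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_var = sum(v for v, _ in chunk) / len(chunk)
        mean_err = sum(e for _, e in chunk) / len(chunk)
        rows.append((mean_var, mean_err))
    return rows
```

For a well-calibrated estimator the two columns should rise together and stay close to each other; a flat error column against a rising variance column would suggest the spread does not track realized error.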

minor comments (2)

[§3] Notation for the diffusion transformer and attention pooling is introduced without an explicit equation relating the ensemble members to the final safety rule.
[Figure 3] Figure captions and axis labels in the trajectory visualization panels are too small to read the multimodal spread clearly.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We agree that the manuscript requires additional experimental details and supporting analyses for the uncertainty claim. We address each major comment below and commit to revisions that strengthen the paper without overstating current results.

Referee: [Abstract and §4 (Experiments)] The reported 1.7% LAV improvement is stated without any baseline description, statistical significance test, data-split protocol, or ablation that isolates the contribution of the ensemble diffusion component versus the safety rule itself.

Authors: We agree this information is missing from the abstract and §4. In the revision we will: describe the point-estimate baseline, report means and standard deviations over multiple random seeds with statistical significance tests, specify the LAV data-split protocol, and add an ablation isolating the ensemble diffusion component from the hand-crafted safety rule. These changes will clarify the source of the reported gain.
Revision: yes

Referee: [§3.2 and §4] The assertion that the 128 sampled trajectories “model the uncertainty of the posterior trajectory distribution” is unsupported by any calibration result (e.g., predicted variance versus realized error on held-out data) or comparison against alternative uncertainty estimators; without such evidence the safety-rule benefit cannot be attributed to posterior modeling rather than to the learned data distribution.

Authors: We acknowledge the claim currently lacks direct empirical support. The manuscript relies on the diffusion model's design to approximate a distribution over trajectories. In revision we will add calibration diagnostics (predicted variance vs. realized trajectory error on held-out data) and, space permitting, a brief comparison to other estimators. If the new results do not strongly corroborate posterior modeling, we will qualify the language to emphasize practical diversity sampling rather than strict posterior inference.
Revision: yes
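
The first response commits to reporting means and standard deviations over seeds together with significance tests. A minimal sketch of that bookkeeping follows; the paired t-statistic is one possible choice of test, picked here as an assumption, and the function names are hypothetical. Converting the statistic to a p-value requires a t-distribution table or a statistics library and is left out.

```python
import math
import statistics


def summarize(scores):
    """Mean and sample standard deviation of per-seed driving scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_t_statistic(baseline, variant):
    """Paired t-statistic for per-seed score differences (variant - baseline).

    Assumes both lists are ordered by the same seeds.
    """
    if len(baseline) != len(variant) or len(baseline) < 2:
        raise ValueError("need two equally long lists with at least 2 seeds")
    diffs = [v - b for b, v in zip(baseline, variant)]
    spread = statistics.stdev(diffs)
    if spread == 0:
        raise ValueError("all differences are identical; statistic undefined")
    return statistics.mean(diffs) / (spread / math.sqrt(len(diffs)))
```

Pairing by seed removes seed-to-seed variation from the comparison, which matters when a gain as small as the reported 1.7% has to be separated from run-to-run noise.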

Circularity Check

0 steps flagged

No circularity; purely empirical proposal with benchmark validation

full rationale

The paper presents EnDfuser as an end-to-end system replacing point-estimate planners with ensemble diffusion to output 128 trajectories, then applies a hand-crafted safety rule yielding +1.7% on LAV. No derivation chain, equations, or first-principles results are claimed; the uncertainty modeling is asserted as an empirical outcome of the trained model rather than reduced to any fitted parameter or self-citation by construction. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the provided text. The central claim remains an empirical observation independent of its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the diffusion model itself is treated as a standard component.


Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 4 internal anchors

[1] Carla garage repository and dataset, leaderboard 1.0 branch,

[2] Joost Van Amersfoort, Lewis Smith, Yee Whye Teh, and Yarin Gal. Uncertainty estimation using a single deep deterministic neural network. In Proceedings of the 37th International Conference on Machine Learning, pages 9690–9700. PMLR, 2020.

[3] Holger Caesar, Juraj Kabzan, Kok Seang Tan, Whye Kit Fong, Eric Wolff, Alex Lang, Luke Fletcher, Oscar Beijbom, and Sammy Omari. NuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles. arXiv:2106.11810 [cs], 2022.

[4] Peide Cai, Sukai Wang, Yuxiang Sun, and Ming Liu. Probabilistic end-to-end vehicle navigation in complex dynamic environments with multimodal sensor fusion. IEEE Robotics and Automation Letters, 5(3):4218–4224, 2020.

[5] Peide Cai, Yuxiang Sun, Hengli Wang, and Ming Liu. Vtgnet: A vision-based trajectory generation network for autonomous vehicles in urban environments. IEEE Transactions on Intelligent Vehicles, 6(3):419–429, 2021.

[6] Matthew A. Chan, Maria J. Molina, and Christopher A. Metzler. Estimating epistemic and aleatoric uncertainty with a single model. arXiv e-prints, 2024. ADS Bibcode: 2024arXiv240203478C.

[7] Shaoyu Chen, Bo Jiang, Hao Gao, Bencheng Liao, Qing Xu, Qian Zhang, Chang Huang, Wenyu Liu, and Xinggang Wang. VADv2: End-to-end vectorized autonomous driving via probabilistic planning. arXiv:2402.13243, 2024.

[8] Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, and Shuran Song. Diffusion policy: Visuomotor policy learning via action diffusion. The International Journal of Robotics Research, 0(0):02783649241273668, 0.

[9] Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, and Andreas Geiger. Transfuser: Imitation with transformer-based sensor fusion for autonomous driving. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11):12878–12895, 2023.

[10] De-Tian Chu, Lin-Yuan Bai, Jia-Nuo Huang, Zhen-Long Fang, Peng Zhang, Wei Kang, and Hai-Feng Ling. Enhanced safety in autonomous driving: Integrating a latent state diffusion model for end-to-end navigation. Sensors, 24(1717):5514, 2024.

[11] Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, and Kashyap Chitta. Navsim: Data-driven non-reactive autonomous vehicle simulation and benchmarking. Advances in Neural Information Processing Systems, 37:28706–28719, 2024.

[12] Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. Carla: An open urban driving simulator. In Proceedings of the 1st Annual Conference on Robot Learning, pages 1–16. PMLR, 2017.

[13] Marc Anton Finzi, Anudhyan Boral, Andrew Gordon Wilson, Fei Sha, and Leonardo Zepeda-Nunez. User-defined event sampling and uncertainty quantification in diffusion models for physical dynamical systems. In Proceedings of the 40th International Conference on Machine Learning, pages 10136–10152. PMLR, 2023.

[14] Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, and Sergey Levine. D4RL: Datasets for deep data-driven reinforcement learning. arXiv:2004.07219 [cs], 2021.

[15] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of The 33rd International Conference on Machine Learning, pages 1050–1059. PMLR, 2016.

[16] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems, pages 6840–6851. Curran Associates, Inc.

[17] Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, and David J Fleet. Video diffusion models. In Advances in Neural Information Processing Systems, pages 8633–8646. Curran Associates, Inc., 2022.

[18] Bernhard Jaeger, Kashyap Chitta, and Andreas Geiger. Hidden biases of end-to-end driving models. In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2023.

[19] Michael Janner, Yilun Du, Joshua Tenenbaum, and Sergey Levine. Planning with Diffusion for Flexible Behavior Synthesis. In Proceedings of the 39th International Conference on Machine Learning, pages 9902–9915. PMLR.

[20] Xiaosong Jia, Yulu Gao, Li Chen, Junchi Yan, Patrick Langechuan Liu, and Hongyang Li. Driveadapter: Breaking the coupling barrier of perception and planning in end-to-end autonomous driving. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pages 7919–7929, Paris, France, 2023. IEEE.

[21] Xiaosong Jia, Penghao Wu, Li Chen, Jiangwei Xie, Conghui He, Junchi Yan, and Hongyang Li. Think twice before driving: Towards scalable decoders for end-to-end autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 21983–21994, 2023.

[22] Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. In Advances in Neural Information Processing Systems, pages 26565–26577. Curran Associates, Inc., 2022.

[23] Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, and Bryan Catanzaro. Diffwave: A versatile diffusion model for audio synthesis. In International Conference on Learning Representations, 2021.

[24] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems. Curran Associates, Inc., 2017.

[25] Bencheng Liao, Shaoyu Chen, Haoran Yin, Bo Jiang, Cheng Wang, Sixu Yan, Xinbang Zhang, Xiangyu Li, Ying Zhang, Qian Zhang, and Xinggang Wang. Diffusiondrive: Truncated diffusion model for end-to-end autonomous driving. arXiv:2411.15139 [cs], 2024.

[26] Haicheng Liao, Xuelin Li, Yongkang Li, Hanlin Kong, Chengyue Wang, Bonan Wang, Yanchen Guan, KaHou Tam, and Zhenning Li. Cdstraj: Characterized diffusion and spatial-temporal interaction network for trajectory prediction in autonomous driving. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24, page...

[27] Antonio Loquercio, Mattia Segu, and Davide Scaramuzza. A general framework for uncertainty estimation in deep learning. IEEE Robotics and Automation Letters, 5(2):3153–3160, 2020.

[28] David J. C. MacKay. A practical bayesian framework for backpropagation networks. Neural Computation, 4(3):448–472, 1992.

[29] Marion Neumeier, Sebastian Dorn, Michael Botsch, and Wolfgang Utschick. Reliable trajectory prediction and uncertainty quantification with conditioned diffusion models. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 3461–3470,

[30] Mingxing Peng, Kehua Chen, Xusen Guo, Qiming Zhang, Hongliang Lu, Hui Zhong, Di Chen, Meixin Zhu, and Hai Yang. Diffusion models for intelligent transportation systems: A survey. arXiv:2409.15816, 2024.

[31] Dule Shu and Amir Barati Farimani. Zero-shot uncertainty quantification using diffusion probabilistic models. arXiv:2408.04718 [cs], 2024.

[32] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising Diffusion Implicit Models. arXiv:2010.02502 [cs], 2022.

[33] Lei Tai, Peng Yun, Yuying Chen, Congcong Liu, Haoyang Ye, and Ming Liu. Visual-based autonomous driving deployment from a stochastic and uncertainty-aware perspective. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 2622–2628, 2019.

[34] Siddarth Venkatraman, Shivesh Khaitan, Ravi Tej Akella, John Dolan, Jeff Schneider, and Glen Berseth. Reasoning with latent diffusion in offline reinforcement learning. In The Twelfth International Conference on Learning Representations, 2024.

[35] Junming Wang, Xingyu Zhang, Zebin Xing, Songen Gu, Xiaoyang Guo, Yang Hu, Ziying Song, Qian Zhang, Xiaoxiao Long, and Wei Yin. He-drive: Human-like end-to-end driving with vision language models. arXiv:2410.05051, 2024.

[36] Ke Wang, Chongqiang Shen, Xingcan Li, and Jianbo Lu. Uncertainty quantification for safe and reliable autonomous vehicles: A review of methods and applications. IEEE Transactions on Intelligent Transportation Systems, pages 1–17,

[37] Weizhuo Wang, C. Karen Liu, and Monroe Kennedy III. Egonav: Egocentric scene-aware human trajectory prediction. arXiv:2403.19026 [cs], 2024.

[38] Zichen Wang, Hao Miao, Senzhang Wang, Renzhi Wang, Jianxin Wang, and Jian Zhang. C2f-tp: A coarse-to-fine denoising framework for uncertainty-aware trajectory prediction. arXiv:2412.13231 [cs], 2024.

[39] Brian Yang, Huangyuan Su, Nikolaos Gkanatsios, Tsung-Wei Ke, Ayush Jain, Jeff Schneider, and Katerina Fragkiadaki. Diffusion-es: Gradient-free planning with diffusion for autonomous and instruction-guided driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15342–15353, 2024.

[40] Kai Yang, Xiaolin Tang, Jun Li, Hong Wang, Guichuan Zhong, Jiaxin Chen, and Dongpu Cao. Uncertainties in on-board algorithms for autonomous vehicles: Challenges, mitigation, and perspectives. IEEE Transactions on Intelligent Transportation Systems, 24(9):8963–8987, 2023.

[41] Ruoyu Yao, Yubin Wang, Haichao Liu, Rui Yang, Zengqi Peng, Lei Zhu, and Jun Ma. Calmm-drive: Confidence-aware autonomous driving with large multimodal model. arXiv:2412.04209 [cs], 2024.

[42] Julian Zimmerlin, Jens Beißwenger, Bernhard Jaeger, Andreas Geiger, and Kashyap Chitta. Hidden biases of end-to-end driving datasets. arXiv:2412.09602 [cs], 2024.

[1] [1]

Carla garage repository and dataset, leaderboard 1.0 branch,

work page

[2] [2]

Uncertainty estimation using a single deep de- terministic neural network

Joost Van Amersfoort, Lewis Smith, Yee Whye Teh, and Yarin Gal. Uncertainty estimation using a single deep de- terministic neural network. In Proceedings of the 37th Inter- national Conference on Machine Learning, page 9690–9700. PMLR, 2020. 1

work page 2020

[3] [3]

NuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles

Holger Caesar, Juraj Kabzan, Kok Seang Tan, Whye Kit Fong, Eric Wolff, Alex Lang, Luke Fletcher, Oscar Bei- jbom, and Sammy Omari. Nuplan: A closed-loop ml- based planning benchmark for autonomous vehicles. (arXiv:2106.11810), 2022. arXiv:2106.11810 [cs]. 2

work page internal anchor Pith review Pith/arXiv arXiv 2022

[4] [4]

Prob- abilistic end-to-end vehicle navigation in complex dynamic environments with multimodal sensor fusion.IEEE Robotics and Automation Letters, 5(3):4218–4224, 2020

Peide Cai, Sukai Wang, Yuxiang Sun, and Ming Liu. Prob- abilistic end-to-end vehicle navigation in complex dynamic environments with multimodal sensor fusion.IEEE Robotics and Automation Letters, 5(3):4218–4224, 2020. 2, 5

work page 2020

[5] [5]

Vt- gnet: A vision-based trajectory generation network for au- tonomous vehicles in urban environments

Peide Cai, Yuxiang Sun, Hengli Wang, and Ming Liu. Vt- gnet: A vision-based trajectory generation network for au- tonomous vehicles in urban environments. IEEE Transac- tions on Intelligent Vehicles, 6(3):419–429, 2021. 2

work page 2021

[6] [6]

Chan, Maria J

Matthew A. Chan, Maria J. Molina, and Christopher A. Met- zler. Estimating epistemic and aleatoric uncertainty with a single model. arXiv e-prints , 2024. ADS Bibcode: 2024arXiv240203478C. 2

work page 2024

[7] [7]

VADv2: End-to-End Vectorized Autonomous Driving via Probabilistic Planning

Shaoyu Chen, Bo Jiang, Hao Gao, Bencheng Liao, Qing Xu, Qian Zhang, Chang Huang, Wenyu Liu, and Xinggang Wang. Vadv2: End-to-end vectorized autonomous driving via probabilistic planning. (arXiv:2402.13243), 2024. 2

work page internal anchor Pith review Pith/arXiv arXiv 2024

[8] [8]

Diffusion policy: Visuomotor policy learning via action dif- fusion

Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, and Shuran Song. Diffusion policy: Visuomotor policy learning via action dif- fusion. The International Journal of Robotics Research , 0 (0):02783649241273668, 0. 1, 4

work page

[9] [9]

Transfuser: Imitation with transformer-based sensor fusion for autonomous driv- ing

Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, and Andreas Geiger. Transfuser: Imitation with transformer-based sensor fusion for autonomous driv- ing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(11):12878–12895, 2023. 2, 5

work page 2023

[10] De-Tian Chu, Lin-Yuan Bai, Jia-Nuo Huang, Zhen-Long Fang, Peng Zhang, Wei Kang, and Hai-Feng Ling. Enhanced safety in autonomous driving: Integrating a latent state diffusion model for end-to-end navigation. Sensors, 24(17):5514, 2024.

[11] Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, and Kashyap Chitta. Navsim: Data-driven non-reactive autonomous vehicle simulation and benchmarking. Advances in Neural Information Processing Systems, 37:28706–28719, 2024.

[12] Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. Carla: An open urban driving simulator. In Proceedings of the 1st Annual Conference on Robot Learning, pages 1–16. PMLR, 2017.

[13] Marc Anton Finzi, Anudhyan Boral, Andrew Gordon Wilson, Fei Sha, and Leonardo Zepeda-Nunez. User-defined event sampling and uncertainty quantification in diffusion models for physical dynamical systems. In Proceedings of the 40th International Conference on Machine Learning, pages 10136–10152. PMLR, 2023.

[14] Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, and Sergey Levine. D4RL: Datasets for deep data-driven reinforcement learning. (arXiv:2004.07219), 2021. arXiv:2004.07219 [cs].

[15] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of The 33rd International Conference on Machine Learning, pages 1050–1059. PMLR, 2016.

[16] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems, pages 6840–6851. Curran Associates, Inc.

[17] Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, and David J Fleet. Video diffusion models. In Advances in Neural Information Processing Systems, pages 8633–8646. Curran Associates, Inc., 2022.

[18] Bernhard Jaeger, Kashyap Chitta, and Andreas Geiger. Hidden biases of end-to-end driving models. In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2023.

[19] Michael Janner, Yilun Du, Joshua Tenenbaum, and Sergey Levine. Planning with Diffusion for Flexible Behavior Synthesis. In Proceedings of the 39th International Conference on Machine Learning, pages 9902–9915. PMLR.

[20] Xiaosong Jia, Yulu Gao, Li Chen, Junchi Yan, Patrick Langechuan Liu, and Hongyang Li. Driveadapter: Breaking the coupling barrier of perception and planning in end-to-end autonomous driving. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pages 7919–7929, Paris, France, 2023. IEEE.

[21] Xiaosong Jia, Penghao Wu, Li Chen, Jiangwei Xie, Conghui He, Junchi Yan, and Hongyang Li. Think twice before driving: Towards scalable decoders for end-to-end autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 21983–21994, 2023.

[22] Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. In Advances in Neural Information Processing Systems, pages 26565–26577. Curran Associates, Inc., 2022.

[23] Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, and Bryan Catanzaro. Diffwave: A versatile diffusion model for audio synthesis. In International Conference on Learning Representations, 2021.

[24] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems. Curran Associates, Inc., 2017.

[25] Bencheng Liao, Shaoyu Chen, Haoran Yin, Bo Jiang, Cheng Wang, Sixu Yan, Xinbang Zhang, Xiangyu Li, Ying Zhang, Qian Zhang, and Xinggang Wang. Diffusiondrive: Truncated diffusion model for end-to-end autonomous driving. (arXiv:2411.15139), 2024. arXiv:2411.15139 [cs].

[26] Haicheng Liao, Xuelin Li, Yongkang Li, Hanlin Kong, Chengyue Wang, Bonan Wang, Yanchen Guan, KaHou Tam, and Zhenning Li. Cdstraj: Characterized diffusion and spatial-temporal interaction network for trajectory prediction in autonomous driving. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24, page...

[27] Antonio Loquercio, Mattia Segu, and Davide Scaramuzza. A general framework for uncertainty estimation in deep learning. IEEE Robotics and Automation Letters, 5(2):3153–3160, 2020.

[28] David J. C. MacKay. A practical bayesian framework for backpropagation networks. Neural Computation, 4(3):448–472, 1992.

[29] Marion Neumeier, Sebastian Dorn, Michael Botsch, and Wolfgang Utschick. Reliable trajectory prediction and uncertainty quantification with conditioned diffusion models. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 3461–3470,

[30] Mingxing Peng, Kehua Chen, Xusen Guo, Qiming Zhang, Hongliang Lu, Hui Zhong, Di Chen, Meixin Zhu, and Hai Yang. Diffusion models for intelligent transportation systems: A survey. (arXiv:2409.15816), 2024. arXiv:2409.15816.

[31] Dule Shu and Amir Barati Farimani. Zero-shot uncertainty quantification using diffusion probabilistic models. (arXiv:2408.04718), 2024. arXiv:2408.04718 [cs].

[32] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. (arXiv:2010.02502), 2022. arXiv:2010.02502 [cs].

[33] Lei Tai, Peng Yun, Yuying Chen, Congcong Liu, Haoyang Ye, and Ming Liu. Visual-based autonomous driving deployment from a stochastic and uncertainty-aware perspective. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 2622–2628, 2019.

[34] Siddarth Venkatraman, Shivesh Khaitan, Ravi Tej Akella, John Dolan, Jeff Schneider, and Glen Berseth. Reasoning with latent diffusion in offline reinforcement learning. In The Twelfth International Conference on Learning Representations, 2024.

[35] Junming Wang, Xingyu Zhang, Zebin Xing, Songen Gu, Xiaoyang Guo, Yang Hu, Ziying Song, Qian Zhang, Xiaoxiao Long, and Wei Yin. He-drive: Human-like end-to-end driving with vision language models. (arXiv:2410.05051), 2024. arXiv:2410.05051.

[36] Ke Wang, Chongqiang Shen, Xingcan Li, and Jianbo Lu. Uncertainty quantification for safe and reliable autonomous vehicles: A review of methods and applications. IEEE Transactions on Intelligent Transportation Systems, pages 1–17,

[37] Weizhuo Wang, C. Karen Liu, and Monroe Kennedy III. Egonav: Egocentric scene-aware human trajectory prediction. (arXiv:2403.19026), 2024. arXiv:2403.19026 [cs].

[38] Zichen Wang, Hao Miao, Senzhang Wang, Renzhi Wang, Jianxin Wang, and Jian Zhang. C2f-tp: A coarse-to-fine denoising framework for uncertainty-aware trajectory prediction. (arXiv:2412.13231), 2024. arXiv:2412.13231 [cs].

[39] Brian Yang, Huangyuan Su, Nikolaos Gkanatsios, Tsung-Wei Ke, Ayush Jain, Jeff Schneider, and Katerina Fragkiadaki. Diffusion-es: Gradient-free planning with diffusion for autonomous and instruction-guided driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15342–15353, 2024.

[40] Kai Yang, Xiaolin Tang, Jun Li, Hong Wang, Guichuan Zhong, Jiaxin Chen, and Dongpu Cao. Uncertainties in onboard algorithms for autonomous vehicles: Challenges, mitigation, and perspectives. IEEE Transactions on Intelligent Transportation Systems, 24(9):8963–8987, 2023.

[41] Ruoyu Yao, Yubin Wang, Haichao Liu, Rui Yang, Zengqi Peng, Lei Zhu, and Jun Ma. Calmm-drive: Confidence-aware autonomous driving with large multimodal model. (arXiv:2412.04209), 2024. arXiv:2412.04209 [cs].

[42] Julian Zimmerlin, Jens Beißwenger, Bernhard Jaeger, Andreas Geiger, and Kashyap Chitta. Hidden biases of end-to-end driving datasets. (arXiv:2412.09602), 2024. arXiv:2412.09602 [cs].