Towards Trustworthy and Explainable AI for Perception Models: From Concept to Prototype Vehicle Deployment

Ayushman Choudhuri; Lutz Eckstein; Manas Mehrotra; Shayan Sharifi; Till Beemelmanns

arxiv: 2605.16087 · v2 · pith:BD73KWXLnew · submitted 2026-05-15 · 💻 cs.RO · cs.AI

Towards Trustworthy and Explainable AI for Perception Models: From Concept to Prototype Vehicle Deployment

Till Beemelmanns , Shayan Sharifi , Manas Mehrotra , Ayushman Choudhuri , Lutz Eckstein This is my paper

Pith reviewed 2026-05-25 06:26 UTC · model grok-4.3

classification 💻 cs.RO cs.AI

keywords trustworthy AIexplainable AIautonomous drivingperceptiontransformer detectoruncertainty calibrationrobustnesssaliency maps

0 comments

The pith

A transformer detector for autonomous driving yields faithful explanations from its attention weights at inference time, plus calibrated uncertainty and robustness improvements, all deployed in a prototype vehicle with a real-time interface

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that a complete trustworthy-AI pipeline can be built for 3D perception in driving by taking a transformer detector, extracting explanations directly from its attention maps, adding an uncertainty-calibration module, and applying robustness training. It then validates the explanations with perturbation consistency tests, measures gains in robustness and calibration, and moves the entire system onto a real vehicle where an interface displays the saliency maps, uncertainty state, and documentation artifacts live. A sympathetic reader would care because current deep networks for scene understanding remain black boxes that conflict with safety standards and make debugging or oversight difficult; if the pipeline works, it supplies one concrete route from abstract trustworthy-AI guidelines to an operational perception stack.

Core claim

Building on a transformer-based detector, explanations are derived from the attention mechanism at inference time and validated for faithfulness using perturbation-based consistency tests. An uncertainty estimation and calibration module is integrated, robustness-enhancing training methods are applied, and the resulting system is shown to produce faithful saliency behavior, improved robustness, and well-calibrated uncertainty estimates. The full set of trustworthy-AI elements is finally deployed in a prototype vehicle together with an XAI interface that visualizes documentation artifacts, model uncertainty state, and saliency maps in real time.

What carries the argument

Attention weights extracted from the transformer detector at inference time, used as the source of saliency explanations and validated by perturbation consistency tests, together with a separate uncertainty-calibration module and robustness training.

If this is right

Explanations become available at inference time with no extra forward passes required.
The perception module can be monitored in real time for uncertainty spikes that may indicate out-of-distribution inputs.
Robustness training reduces performance drop under common perturbations such as noise or occlusion.
The deployed XAI interface supplies a single screen that combines saliency, uncertainty, and model documentation for human oversight.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same attention-extraction pattern could be tested on other transformer architectures used for 3D detection to see whether faithfulness holds across detector families.
If the real-time interface is kept, it might serve as a template for logging artifacts required by future automotive safety standards.
Extending the uncertainty calibration to multi-modal sensor fusion would be a direct next step that the current single-detector pipeline leaves open.

Load-bearing premise

Attention weights from the transformer at inference time give faithful accounts of the model's actual decisions, with faithfulness checked only through the described perturbation tests.

What would settle it

A controlled test in which the attention-derived saliency maps are compared against a ground-truth importance measure obtained by systematically ablating input regions and measuring change in the detector's output scores; systematic mismatch would falsify the faithfulness claim.

Figures

Figures reproduced from arXiv: 2605.16087 by Ayushman Choudhuri, Lutz Eckstein, Manas Mehrotra, Shayan Sharifi, Till Beemelmanns.

**Figure 1.** Figure 1: Proposed Trustworthy AI Approach. We propose a multi-modal perception module that integrates Trustworthy AI components: robust training, calibrated uncertainty quantification, and explainability. An XAI Interface enables transparent monitoring through documentation, visualized uncertainty state, sensor usage, and saliency maps. interpreting and monitoring AI modules, concrete implementations remain scarc… view at source ↗

**Figure 2.** Figure 2: Overview of the Trustworthy AI Approach. From LiDAR point cloud and multi-view camera images, we extract camera ( ) and LiDAR tokens ( ) that interact with object queries ( ) via Cross-Attention and produce calibrated uncertaintyaware 3D bounding boxes. Attention weights ( ), derived sensor usage, and current model uncertainty state are used for visualization in the XAI Interface, along with supporting Mo… view at source ↗

**Figure 3.** Figure 3: XAI Faithfulness Test. NuScenes Detection Score (NDS) under increasing perturbation level ρ. The proposed Mean-Fusion yields the best overall trade-off for both tests. XAI Faithfulness. We evaluate explanation faithfulness via perturbation tests on LiDAR and camera inputs, and provide qualitative examples of the obtained saliency maps and modality contributions in [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: XAI Attention Maps Visualization. REFERENCES [1] J. Yan, Y. Liu, J. Sun, F. Jia, S. Li, T. Wang, and X. Zhang, “Cross modal transformer via coordinates encoding for 3d object dectection,” International Conference on Computer Vision (ICCV), 2023. [2] Y. Xie, C. Xu, M.-J. Rakotosaona, P. Rim, F. Tombari, K. Keutzer, M. Tomizuka, and W. Zhan, “Sparsefusion: Fusing multi-modal sparse representations for multi-… view at source ↗

read the original abstract

Deep Neural Networks have become the dominant solution for Autonomous Driving perception, but their opacity conflicts with emerging Trustworthy AI guidelines and complicates safety assurance, debugging, and human oversight. While theoretical frameworks for safe and Explainable AI (XAI) exist, concrete implementations of Trustworthy AI for 3D scene understanding remain scarce. We address this gap by proposing a Trustworthy AI perception module that is remarkably robust, integrates faithful explainability, and calibrated uncertainty estimates. Building on a transformer-based detector, we derive explanation from the attention mechanism at inference time and validate their faithfulness using perturbation-based consistency tests. We further integrate an uncertainty estimation and calibration module, and apply robustness-enhancing training methods. Experiments show faithful saliency behavior, improved robustness, and well-calibrated uncertainty estimates. Finally, we deploy these Trustworthy AI elements in a prototype vehicle and provide an XAI Interface that visualizes documentation artifacts, model uncertainty state, and saliency maps, demonstrating the feasibility of trustworthy perception monitoring in real time. Supplementary materials are available at https://tillbeemelmanns.github.io/trustworthy_ai/ .

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This is a straightforward engineering integration paper that gets a real-vehicle prototype running with XAI and uncertainty tools, but adds no new methods and supplies almost no quantitative evidence.

read the letter

The paper's core contribution is showing that a transformer-based 3D detector can be wrapped with attention-derived saliency, perturbation checks, calibration, and robustness training, then run live in a prototype car with a dashboard interface. That deployment step is the part worth noting; most XAI work stops at benchmarks, so getting it into a vehicle is concrete engineering work that matters for safety monitoring discussions.

Referee Report

2 major / 1 minor

Summary. The paper proposes a Trustworthy AI perception module for autonomous driving based on a transformer detector. Explanations are derived from attention weights at inference time and validated via perturbation-based consistency tests; an uncertainty estimation and calibration module is integrated along with robustness-enhancing training. Experiments are claimed to demonstrate faithful saliency behavior, improved robustness, and well-calibrated uncertainty. The full pipeline is deployed in a prototype vehicle with a real-time XAI interface visualizing saliency maps, uncertainty state, and documentation artifacts.

Significance. If the quantitative results and validation hold, the work is significant for providing one of the few end-to-end implementations of trustworthy AI elements (faithful explainability, calibrated uncertainty, robustness) in a 3D perception system for autonomous driving, including real-vehicle deployment and an operational XAI interface. This bridges theoretical frameworks with practical systems integration and could serve as a reference for safety-critical applications.

major comments (2)

[Abstract] Abstract: The abstract reports positive experimental outcomes on faithfulness, robustness, and calibration but supplies no quantitative numbers, baseline comparisons, or details on post-hoc choices (e.g., perturbation types, calibration method). This absence makes it impossible to assess the magnitude or reliability of the claimed improvements.
[Explanation validation] Explanation validation (referenced in Abstract): The claim that attention-derived saliency maps constitute faithful explanations rests solely on perturbation-based consistency tests. The manuscript provides no comparison to alternative methods (e.g., integrated gradients, occlusion), no ablation on perturbation strategy, and no analysis of whether the tests detect known attention failure modes such as spurious focus on background or non-causal tokens. In a safety-critical 3D detector setting, this leaves the faithfulness component of the trustworthy pipeline insufficiently supported.

minor comments (1)

[Abstract] The supplementary materials link is provided, but the main text would benefit from explicit cross-references to specific quantitative results or figures supporting the 'faithful saliency behavior' claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight opportunities to strengthen the presentation of our results. We address each major comment below and commit to revisions that improve clarity and support for the claims without altering the core contributions.

read point-by-point responses

Referee: [Abstract] Abstract: The abstract reports positive experimental outcomes on faithfulness, robustness, and calibration but supplies no quantitative numbers, baseline comparisons, or details on post-hoc choices (e.g., perturbation types, calibration method). This absence makes it impossible to assess the magnitude or reliability of the claimed improvements.

Authors: We agree that the abstract would benefit from quantitative details to enable readers to evaluate the scale of improvements. In the revised version we will incorporate representative metrics (e.g., faithfulness consistency scores, robustness gains under perturbation, and expected calibration error) together with concise references to the perturbation strategy and calibration procedure employed. revision: yes
Referee: [Explanation validation] Explanation validation (referenced in Abstract): The claim that attention-derived saliency maps constitute faithful explanations rests solely on perturbation-based consistency tests. The manuscript provides no comparison to alternative methods (e.g., integrated gradients, occlusion), no ablation on perturbation strategy, and no analysis of whether the tests detect known attention failure modes such as spurious focus on background or non-causal tokens. In a safety-critical 3D detector setting, this leaves the faithfulness component of the trustworthy pipeline insufficiently supported.

Authors: We recognize that the current validation relies exclusively on perturbation consistency and lacks explicit comparisons or failure-mode analysis. We will add (i) a comparison of attention-derived maps against integrated gradients and occlusion, (ii) an ablation on perturbation parameters, and (iii) a targeted examination of attention behavior on background or non-causal regions, including discussion of implications for 3D detection safety. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical systems integration only

full rationale

The paper describes an empirical pipeline integrating a transformer detector, attention-derived saliency maps validated via perturbation consistency tests, uncertainty calibration, and robustness training, with vehicle deployment. No mathematical derivation chain, equations, or first-principles results are claimed. No steps reduce any reported outcome to a fitted parameter, self-citation, or self-definition inside the paper. Claims rest on experimental measurements against external benchmarks rather than internal construction, satisfying the self-contained criterion.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the empirical performance of standard components (transformer attention, perturbation testing, calibration) rather than new axioms or invented entities. No free parameters are explicitly introduced in the abstract; the work does not postulate new particles, forces, or dimensions.

axioms (1)

domain assumption Attention weights from the transformer detector can be treated as explanations whose faithfulness can be assessed via perturbation consistency tests.
This premise is invoked when the paper states that explanations are derived from the attention mechanism and validated by perturbation-based tests.

pith-pipeline@v0.9.0 · 5740 in / 1572 out tokens · 24774 ms · 2026-05-25T06:26:27.483586+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Reference graph

Works this paper leans on

57 extracted references · 57 canonical work pages

[1]

Cross modal transformer via coordinates encoding for 3d object dectection,

J. Yan, Y . Liu, J. Sun, F. Jia, S. Li, T. Wang, and X. Zhang, “Cross modal transformer via coordinates encoding for 3d object dectection,” International Conference on Computer Vision (ICCV), 2023

work page 2023
[2]

Sparsefusion: Fusing multi-modal sparse representations for multi-sensor 3d object detection,

Y . Xie, C. Xu, M.-J. Rakotosaona, P. Rim, F. Tombari, K. Keutzer, M. Tomizuka, and W. Zhan, “Sparsefusion: Fusing multi-modal sparse representations for multi-sensor 3d object detection,” inConference on Computer Vision and Pattern Recognition (CVPR), 2023

work page 2023
[3]

Ethics guidelines for trustworthy AI,

European Commission, “Ethics guidelines for trustworthy AI,” 2019, accessed: 2025-12-28. [Online]. Available: https://digital-strategy.ec. europa.eu/en/library/ethics-guidelines-trustworthy-ai

work page 2019
[4]

Artificial intelligence risk management framework,

National Institute of Standards and Technology, “Artificial intelligence risk management framework,” 2023, accessed: 2025-12-28. [Online]. Available: https://airc.nist.gov/airmf-resources/playbook/

work page 2023
[5]

ISO 26262- 1:2018(en): Road vehicles — functional safety,

International Organization for Standardization (ISO), “ISO 26262- 1:2018(en): Road vehicles — functional safety,” 2018. [Online]. Available: https://www.iso.org/standard/68383.html

work page 2018
[6]

ISO 21448:2022: Road vehicles — safety of the intended functionality,

——, “ISO 21448:2022: Road vehicles — safety of the intended functionality,” 2022. [Online]. Available: https://www.iso.org/standard/ 77490.html

work page 2022
[7]

Explainable ai for safe and trustworthy autonomous driving: A systematic review,

A. Kuznietsov, B. Gyevnar, C. Wang, S. Peters, and S. V . Albrecht, “Explainable ai for safe and trustworthy autonomous driving: A systematic review,”IEEE International Conference on Intelligent Transportation Systems (ITSC), 2024

work page 2024
[8]

On calibration of modern neural networks,

C. Guo, G. Pleiss, Y . Sun, and K. Q. Weinberger, “On calibration of modern neural networks,” inProceedings of the 34th Interna- tional Conference on Machine Learning - Volume 70, ser. ICML’17. JMLR.org, 2017, p. 1321–1330. (a)karl.Research vehicle used to deploy the proposed approach. (b)XAI Interface.Embedded into the vehicle’s dashboard. (c)XAI Interfa...

work page 2017
[9]

Can we trust you? on calibration of a probabilistic object detector for autonomous driving,

Di Feng, L. Rosenbaum, C. Glaeser, F. Timm, and K. Dietmayer, “Can we trust you? on calibration of a probabilistic object detector for autonomous driving,” inInternational Conference on Intelligent Robots and Systems (IROS), 2019

work page 2019
[10]

Multi- variate confidence calibration for object detection,

F. K ¨uppers, J. Kronenberger, A. Shantia, and A. Haselhoff, “Multi- variate confidence calibration for object detection,” inConference on Computer Vision and Pattern Recognition Workshop (CVPR’W), 2020

work page 2020
[11]

“why should i trust you?

M. T. Ribeiro, S. Singh, and C. Guestrin, ““why should i trust you?”: Explaining the predictions of any classifier,” inKDD, 2016, pp. 1135– 1144

work page 2016
[12]

Grad-cam: Visual explanations from deep networks via gradient-based localization,

R. R. Selvaraju, M. Cogswell, A. Das, and et al., “Grad-cam: Visual explanations from deep networks via gradient-based localization,” in ICCV, 2017, pp. 618–626

work page 2017
[13]

A unified approach to interpreting model predictions,

S. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,”Advances in Neural Information Processing Systems, vol. 30, 2017

work page 2017
[14]

Interpretable explanations of black boxes by meaningful perturbation,

R. Fong and A. Vedaldi, “Interpretable explanations of black boxes by meaningful perturbation,” inICCV, 2017, pp. 3429–3437

work page 2017
[15]

A methodology to enhance transparency for trustworthy artificial intelligence for cooperative, connected, and automated mobility,

P. N. Ca ˜nas, M. Nieto, O. Otaegui, and I. Rodriguez, “A methodology to enhance transparency for trustworthy artificial intelligence for cooperative, connected, and automated mobility,”SAE International Journal of Connected and Automated Vehicles, vol. 8, 2024

work page 2024
[16]

Molnar,Interpretable Machine Learning, 3rd ed., 2025

C. Molnar,Interpretable Machine Learning, 3rd ed., 2025. [Online]. Available: https://christophm.github.io/interpretable-ml-book

work page 2025
[17]

Guidelines for human-ai interaction,

S. Amershi, D. Weld, M. V orvoreanu, A. Fourney, B. Nushi, P. Collis- son, J. Suh, S. Iqbal, P. N. Bennett, K. Inkpen, J. Teevan, R. Kikin-Gil, and E. Horvitz, “Guidelines for human-ai interaction,” inProceedings of the 2019 CHI Conference on Human Factors in Computing Systems, ser. CHI ’19. New York, NY , USA: Association for Computing Machinery, 2019, p. 1–13

work page 2019
[18]

Deep inside convolutional networks: Visualising image classification models and saliency maps,

K. Simonyan and A. Zisserman, “Deep inside convolutional networks: Visualising image classification models and saliency maps,” inICLR Workshop, 2014

work page 2014
[19]

Rise: Randomized input sampling for explanation of black-box models,

V . Petsiuk, A. Das, and K. Saenko, “Rise: Randomized input sampling for explanation of black-box models,” inBMVC, 2018

work page 2018
[20]

Black-box explanation of object detectors via saliency maps,

V . Petsiuk, R. Jain, V . Manjunatha, V . I. Morariu, A. Mehra, V . Or- donez, and K. Saenko, “Black-box explanation of object detectors via saliency maps,” inConference on Computer Vision and Pattern Recognition (CVPR), 2021

work page 2021
[21]

Occam’s laser: Occlusion-based attribution maps for 3d object de- tectors on lidar data,

D. Schinagl, G. Krispel, H. Possegger, P. M. Roth, and H. Bischof, “Occam’s laser: Occlusion-based attribution maps for 3d object de- tectors on lidar data,” inConference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 1131–1140

work page 2022
[22]

Attention is All you Need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is All you Need,” in Neural Information Processing Systems (NIPS), 2017

work page 2017
[23]

Explainable multi-camera 3d object detection with transformer-based saliency maps,

T. Beemelmanns, W. Zahr, and L. Eckstein, “Explainable multi-camera 3d object detection with transformer-based saliency maps,” inNeurIPS 2023 Workshop on Machine Learning for Autonomous Driving, 2023

work page 2023
[24]

Attention is not explanation,

S. Jain and B. Wallace, “Attention is not explanation,” inNAACL, 2019, pp. 3543–3556

work page 2019
[25]

Multicorrupt: A multi-modal robustness dataset and benchmark of lidar-camera fusion for 3d object detection,

T. Beemelmanns, Q. Zhang, C. Geller, and L. Eckstein, “Multicorrupt: A multi-modal robustness dataset and benchmark of lidar-camera fusion for 3d object detection,” inIntelligent Vehicles Symposium (IV), 2024

work page 2024
[26]

Robo3d: Towards robust and reliable 3d perception against corruptions,

L. Kong, Y . Liu, X. Li, R. Chen, W. Zhang, J. Ren, L. Pan, K. Chen, and Z. Liu, “Robo3d: Towards robust and reliable 3d perception against corruptions,” inInternational Conference on Computer Vision (ICCV), 2023, pp. 19 994–20 006

work page 2023
[27]

Benchmarking and improving bird’s eye view perception robustness in autonomous driving,

S. Xie, L. Kong, W. Zhang, J. Ren, L. Pan, K. Chen, and Z. Liu, “Benchmarking and improving bird’s eye view perception robustness in autonomous driving,”IEEE transactions on pattern analysis and machine intelligence (TPAMI), vol. 47, no. 5, pp. 3878–3894, 2025

work page 2025
[28]

Seeing through fog without seeing fog: Deep multi- modal sensor fusion in unseen adverse weather,

M. Bijelic, T. Gruber, F. Mannan, F. Kraus, W. Ritter, K. Dietmayer, and F. Heide, “Seeing through fog without seeing fog: Deep multi- modal sensor fusion in unseen adverse weather,” inConference on Computer Vision and Pattern Recognition (CVPR), 2020

work page 2020
[29]

Canadian adverse driving conditions dataset,

M. Pitropov, D. E. Garcia, J. Rebello, M. Smart, C. Wang, K. Czar- necki, and S. Waslander, “Canadian adverse driving conditions dataset,”The International Journal of Robotics Research, vol. 40, no. 4-5, pp. 681–690, 12 2020

work page 2020
[30]

nuScenes: A Multimodal Dataset for Autonomous Driving,

H. Caesar, V . Bankiti, A. H. Lang, S. V ora, V . E. Liong, Q. Xu, A. Krishnan, Y . Pan, G. Baldan, and O. Beijbom, “nuScenes: A Multimodal Dataset for Autonomous Driving,” inConference on Computer Vision and Pattern Recognition (CVPR), 2020

work page 2020
[31]

Scalability in Perception for Autonomous Driving: Waymo Open Dataset,

P. Sun, H. Kretzschmar, X. Dotiwalla, A. Chouard, V . Patnaik, P. Tsui, J. Guo, Y . Zhou, Y . Chai, B. Caine,et al., “Scalability in Perception for Autonomous Driving: Waymo Open Dataset,” inConference on Computer Vision and Pattern Recognition (CVPR), 2020

work page 2020
[32]

Benchmarking the robustness of lidar-camera fusion for 3d object detection,

K. Yu, T. Tang, H. Xie, Z. Lin, Z. Wu, Z. Xia, T. Liang, H. Sun, J. Deng, D. Hao, Y . Wang, X. Liang, and B. Wang, “Benchmarking the robustness of lidar-camera fusion for 3d object detection,”Conference on Computer Vision and Pattern Recognition Workshop (CVPR’W), pp. 3188–3198, 2022

work page 2022
[33]

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning,

Y . Gal and Z. Ghahramani, “Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning,” inInternational Conference on Machine Learning (ICML), 2017

work page 2017
[34]

Sampling-free epistemic uncertainty estimation using approximated variance propagation,

J. Postels, F. Ferroni, H. Coskun, N. Navab, and F. Tombari, “Sampling-free epistemic uncertainty estimation using approximated variance propagation,” inConference on Computer Vision and Pattern Recognition (CVPR), October 2019

work page 2019
[35]

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles,

B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles,” inNeural Information Processing Systems (NIPS), 2017

work page 2017
[36]

Estimating the mean and variance of the target probability distribution,

D. Nix and A. Weigend, “Estimating the mean and variance of the target probability distribution,” inProceedings of 1994 IEEE Interna- tional Conference on Neural Networks (ICNN’94), vol. 1, 1994

work page 1994
[37]

Practical confidence and prediction intervals,

T. Heskes, “Practical confidence and prediction intervals,” inNeural Information Processing Systems (NIPS), M.C. Mozer, M. Jordan, and T. Petsche, Eds., vol. 9. MIT Press, 1996

work page 1996
[38]

Bayesod: A bayesian approach for uncertainty estimation in deep object detectors,

A. Harakeh, M. Smart, and S. L. Waslander, “Bayesod: A bayesian approach for uncertainty estimation in deep object detectors,” in International Conference on Robotics and Automation (ICRA), i. b. Institute of Electrical and Electronics Engineers, Ed. IEEE, 2020

work page 2020
[39]

Uncertainty estimation for deep neural object detectors in safety-critical applications,

M. T. Le, F. Diehl, T. Brunner, and A. Knoll, “Uncertainty estimation for deep neural object detectors in safety-critical applications,” in IEEE International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2018, pp. 3873–3878

work page 2018
[40]

Gaussian yolov3: An accurate and fast object detector using localization uncertainty for autonomous driving,

J. Choi, D. Chun, H. Kim, and H.-J. Lee, “Gaussian yolov3: An accurate and fast object detector using localization uncertainty for autonomous driving,” inInternational Conference on Computer Vision (ICCV), 2019, pp. 502–511

work page 2019
[41]

Bounding box regression with uncertainty for accurate object detection,

Y . He, C. Zhu, J. Wang, M. Savvides, and X. Zhang, “Bounding box regression with uncertainty for accurate object detection,” in Conference on Computer Vision and Pattern Recognition (CVPR), 2018

work page 2018
[42]

Towards safe autonomous driving: Capture uncertainty in the deep neural network for lidar 3d vehicle detection,

D. Feng, L. Rosenbaum, and K. Dietmayer, “Towards safe autonomous driving: Capture uncertainty in the deep neural network for lidar 3d vehicle detection,” inIEEE International Conference on Intelligent Transportation Systems (ITSC), 2018, pp. 3266–3273

work page 2018
[43]

Training independent subnet- works for robust prediction,

M. Havasi, R. Jenatton, S. Fort, J. Z. Liu, J. Snoek, B. Lakshmi- narayanan, A. M. Dai, and D. Tran, “Training independent subnet- works for robust prediction,” inInternational Conference on Learning Representations (ICLR), 2021

work page 2021
[44]

Lidar-mimo: Efficient uncertainty estimation for lidar-based 3d object detection,

M. Pitropov, C. Huang, V . Abdelzad, K. Czarnecki, and S. Waslander, “Lidar-mimo: Efficient uncertainty estimation for lidar-based 3d object detection,” inIntelligent Vehicles Symposium (IV), 2022, pp. 813–820

work page 2022
[45]

OCCUQ: Exploring Efficient Uncertainty Quantification for 3D Occupancy Prediction,

S. Heidrich, T. Beemelmanns, A. Nekrasov, B. Leibe, and L. Eck- stein, “OCCUQ: Exploring Efficient Uncertainty Quantification for 3D Occupancy Prediction,” inInternational Conference on Robotics and Automation (ICRA), 2025

work page 2025
[46]

Query2uncertainty: Robust uncertainty quantification and calibration for 3d object detection under distribution shift,

T. Beemelmanns, A. Nekrasov, S. Vilceanu, J. Steinhaus, T. Woopen, B. Leibe, and L. Eckstein, “Query2uncertainty: Robust uncertainty quantification and calibration for 3d object detection under distribution shift,” inConference on Computer Vision and Pattern Recognition (CVPR), 2026

work page 2026
[47]

Lasernet: An efficient probabilistic 3d object detector for autonomous driving,

G. P. Meyer, A. Laddha, E. Kee, C. Vallespi-Gonzalez, and C. K. Wellington, “Lasernet: An efficient probabilistic 3d object detector for autonomous driving,” inConference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019, pp. 12 669–12 678

work page 2019
[48]

Leveraging heteroscedastic aleatoric uncertainties for robust real-time lidar 3d object detection,

D. Feng, L. Rosenbaum, F. Timm, and K. Dietmayer, “Leveraging heteroscedastic aleatoric uncertainties for robust real-time lidar 3d object detection,” inIntelligent Vehicles Symposium (IV). IEEE, 2019, pp. 1280–1287

work page 2019
[49]

Uncertainty-aware voxel based 3d object detection and tracking with von-mises loss,

Y . Zhong, M. Zhu, and H. Peng, “Uncertainty-aware voxel based 3d object detection and tracking with von-mises loss,”arXiv preprint arXiv:2011.02553, 2020

work page arXiv 2011
[50]

Robust collaborative 3d object detection in presence of pose errors,

Y . Lu, Q. Li, B. Liu, M. Dianati, C. Feng, S. Chen, and Y . Wang, “Robust collaborative 3d object detection in presence of pose errors,” inInternational Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 4812–4818

work page 2023
[51]

Bevfusion: Multi-task multi-sensor fusion with unified bird’s-eye view representation,

Z. Liu, H. Tang, A. Amini, X. Yang, H. Mao, D. Rus, and S. Han, “Bevfusion: Multi-task multi-sensor fusion with unified bird’s-eye view representation,” inInternational Conference on Robotics and Automation (ICRA), 2023

work page 2023
[52]

Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods,

J. Platt, “Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods,”Adv. Large Margin Classif., vol. 10, June 1999

work page 1999
[53]

Accurate uncertainties for deep learning using calibrated regression,

V . Kuleshov, N. Fenner, and S. Ermon, “Accurate uncertainties for deep learning using calibrated regression,” inInternational conference on machine learning. Proceedings of Machine Learning Research (PMLR), 2018, pp. 2796–2804

work page 2018
[54]

Deep- interaction: 3d object detection via modality interaction,

Z. Yang, J. Chen, Z. Miao, W. Li, X. Zhu, and L. Zhang, “Deep- interaction: 3d object detection via modality interaction,” inNeural Information Processing Systems (NIPS), 2022

work page 2022
[55]

TransFusion: Robust Lidar-Camera Fusion for 3d Object Detection with Transformers,

X. Bai, Z. Hu, X. Zhu, Q. Huang, Y . Chen, H. Fu, and C.-L. Tai, “TransFusion: Robust Lidar-Camera Fusion for 3d Object Detection with Transformers,”Conference on Computer Vision and Pattern Recognition (CVPR), 2022

work page 2022
[56]

Is- fusion: Instance-scene collaborative fusion for multimodal 3d object detection,

J. Yin, J. Shen, R. Chen, W. Li, R. Yang, P. Frossard, and W. Wang, “Is- fusion: Instance-scene collaborative fusion for multimodal 3d object detection,” inConference on Computer Vision and Pattern Recognition (CVPR), 2024

work page 2024
[57]

karl. - A Research Vehicle for Automated and Connected Driving,

J.-P. Busch, L. Ostendorf, G. Linden, L. Reiher, T. Beemelmanns, B. Lampe, T. Woopen, and L. Eckstein, “karl. - A Research Vehicle for Automated and Connected Driving,” inIntelligent Vehicles Symposium (IV), 2026

work page 2026

[1] [1]

Cross modal transformer via coordinates encoding for 3d object dectection,

J. Yan, Y . Liu, J. Sun, F. Jia, S. Li, T. Wang, and X. Zhang, “Cross modal transformer via coordinates encoding for 3d object dectection,” International Conference on Computer Vision (ICCV), 2023

work page 2023

[2] [2]

Sparsefusion: Fusing multi-modal sparse representations for multi-sensor 3d object detection,

Y . Xie, C. Xu, M.-J. Rakotosaona, P. Rim, F. Tombari, K. Keutzer, M. Tomizuka, and W. Zhan, “Sparsefusion: Fusing multi-modal sparse representations for multi-sensor 3d object detection,” inConference on Computer Vision and Pattern Recognition (CVPR), 2023

work page 2023

[3] [3]

Ethics guidelines for trustworthy AI,

European Commission, “Ethics guidelines for trustworthy AI,” 2019, accessed: 2025-12-28. [Online]. Available: https://digital-strategy.ec. europa.eu/en/library/ethics-guidelines-trustworthy-ai

work page 2019

[4] [4]

Artificial intelligence risk management framework,

National Institute of Standards and Technology, “Artificial intelligence risk management framework,” 2023, accessed: 2025-12-28. [Online]. Available: https://airc.nist.gov/airmf-resources/playbook/

work page 2023

[5] [5]

ISO 26262- 1:2018(en): Road vehicles — functional safety,

International Organization for Standardization (ISO), “ISO 26262- 1:2018(en): Road vehicles — functional safety,” 2018. [Online]. Available: https://www.iso.org/standard/68383.html

work page 2018

[6] [6]

ISO 21448:2022: Road vehicles — safety of the intended functionality,

——, “ISO 21448:2022: Road vehicles — safety of the intended functionality,” 2022. [Online]. Available: https://www.iso.org/standard/ 77490.html

work page 2022

[7] [7]

Explainable ai for safe and trustworthy autonomous driving: A systematic review,

A. Kuznietsov, B. Gyevnar, C. Wang, S. Peters, and S. V . Albrecht, “Explainable ai for safe and trustworthy autonomous driving: A systematic review,”IEEE International Conference on Intelligent Transportation Systems (ITSC), 2024

work page 2024

[8] [8]

On calibration of modern neural networks,

C. Guo, G. Pleiss, Y . Sun, and K. Q. Weinberger, “On calibration of modern neural networks,” inProceedings of the 34th Interna- tional Conference on Machine Learning - Volume 70, ser. ICML’17. JMLR.org, 2017, p. 1321–1330. (a)karl.Research vehicle used to deploy the proposed approach. (b)XAI Interface.Embedded into the vehicle’s dashboard. (c)XAI Interfa...

work page 2017

[9] [9]

Can we trust you? on calibration of a probabilistic object detector for autonomous driving,

Di Feng, L. Rosenbaum, C. Glaeser, F. Timm, and K. Dietmayer, “Can we trust you? on calibration of a probabilistic object detector for autonomous driving,” inInternational Conference on Intelligent Robots and Systems (IROS), 2019

work page 2019

[10] [10]

Multi- variate confidence calibration for object detection,

F. K ¨uppers, J. Kronenberger, A. Shantia, and A. Haselhoff, “Multi- variate confidence calibration for object detection,” inConference on Computer Vision and Pattern Recognition Workshop (CVPR’W), 2020

work page 2020

[11] [11]

“why should i trust you?

M. T. Ribeiro, S. Singh, and C. Guestrin, ““why should i trust you?”: Explaining the predictions of any classifier,” inKDD, 2016, pp. 1135– 1144

work page 2016

[12] [12]

Grad-cam: Visual explanations from deep networks via gradient-based localization,

R. R. Selvaraju, M. Cogswell, A. Das, and et al., “Grad-cam: Visual explanations from deep networks via gradient-based localization,” in ICCV, 2017, pp. 618–626

work page 2017

[13] [13]

A unified approach to interpreting model predictions,

S. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,”Advances in Neural Information Processing Systems, vol. 30, 2017

work page 2017

[14] [14]

Interpretable explanations of black boxes by meaningful perturbation,

R. Fong and A. Vedaldi, “Interpretable explanations of black boxes by meaningful perturbation,” inICCV, 2017, pp. 3429–3437

work page 2017

[15] [15]

A methodology to enhance transparency for trustworthy artificial intelligence for cooperative, connected, and automated mobility,

P. N. Ca ˜nas, M. Nieto, O. Otaegui, and I. Rodriguez, “A methodology to enhance transparency for trustworthy artificial intelligence for cooperative, connected, and automated mobility,”SAE International Journal of Connected and Automated Vehicles, vol. 8, 2024

work page 2024

[16] [16]

Molnar,Interpretable Machine Learning, 3rd ed., 2025

C. Molnar,Interpretable Machine Learning, 3rd ed., 2025. [Online]. Available: https://christophm.github.io/interpretable-ml-book

work page 2025

[17] [17]

Guidelines for human-ai interaction,

S. Amershi, D. Weld, M. V orvoreanu, A. Fourney, B. Nushi, P. Collis- son, J. Suh, S. Iqbal, P. N. Bennett, K. Inkpen, J. Teevan, R. Kikin-Gil, and E. Horvitz, “Guidelines for human-ai interaction,” inProceedings of the 2019 CHI Conference on Human Factors in Computing Systems, ser. CHI ’19. New York, NY , USA: Association for Computing Machinery, 2019, p. 1–13

work page 2019

[18] [18]

Deep inside convolutional networks: Visualising image classification models and saliency maps,

K. Simonyan and A. Zisserman, “Deep inside convolutional networks: Visualising image classification models and saliency maps,” inICLR Workshop, 2014

work page 2014

[19] [19]

Rise: Randomized input sampling for explanation of black-box models,

V . Petsiuk, A. Das, and K. Saenko, “Rise: Randomized input sampling for explanation of black-box models,” inBMVC, 2018

work page 2018

[20] [20]

Black-box explanation of object detectors via saliency maps,

V . Petsiuk, R. Jain, V . Manjunatha, V . I. Morariu, A. Mehra, V . Or- donez, and K. Saenko, “Black-box explanation of object detectors via saliency maps,” inConference on Computer Vision and Pattern Recognition (CVPR), 2021

work page 2021

[21] [21]

Occam’s laser: Occlusion-based attribution maps for 3d object de- tectors on lidar data,

D. Schinagl, G. Krispel, H. Possegger, P. M. Roth, and H. Bischof, “Occam’s laser: Occlusion-based attribution maps for 3d object de- tectors on lidar data,” inConference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 1131–1140

work page 2022

[22] [22]

Attention is All you Need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is All you Need,” in Neural Information Processing Systems (NIPS), 2017

work page 2017

[23] [23]

Explainable multi-camera 3d object detection with transformer-based saliency maps,

T. Beemelmanns, W. Zahr, and L. Eckstein, “Explainable multi-camera 3d object detection with transformer-based saliency maps,” inNeurIPS 2023 Workshop on Machine Learning for Autonomous Driving, 2023

work page 2023

[24] [24]

Attention is not explanation,

S. Jain and B. Wallace, “Attention is not explanation,” inNAACL, 2019, pp. 3543–3556

work page 2019

[25] [25]

Multicorrupt: A multi-modal robustness dataset and benchmark of lidar-camera fusion for 3d object detection,

T. Beemelmanns, Q. Zhang, C. Geller, and L. Eckstein, “Multicorrupt: A multi-modal robustness dataset and benchmark of lidar-camera fusion for 3d object detection,” inIntelligent Vehicles Symposium (IV), 2024

work page 2024

[26] [26]

Robo3d: Towards robust and reliable 3d perception against corruptions,

L. Kong, Y . Liu, X. Li, R. Chen, W. Zhang, J. Ren, L. Pan, K. Chen, and Z. Liu, “Robo3d: Towards robust and reliable 3d perception against corruptions,” inInternational Conference on Computer Vision (ICCV), 2023, pp. 19 994–20 006

work page 2023

[27] [27]

Benchmarking and improving bird’s eye view perception robustness in autonomous driving,

S. Xie, L. Kong, W. Zhang, J. Ren, L. Pan, K. Chen, and Z. Liu, “Benchmarking and improving bird’s eye view perception robustness in autonomous driving,”IEEE transactions on pattern analysis and machine intelligence (TPAMI), vol. 47, no. 5, pp. 3878–3894, 2025

work page 2025

[28] [28]

Seeing through fog without seeing fog: Deep multi- modal sensor fusion in unseen adverse weather,

M. Bijelic, T. Gruber, F. Mannan, F. Kraus, W. Ritter, K. Dietmayer, and F. Heide, “Seeing through fog without seeing fog: Deep multi- modal sensor fusion in unseen adverse weather,” inConference on Computer Vision and Pattern Recognition (CVPR), 2020

work page 2020

[29] [29]

Canadian adverse driving conditions dataset,

M. Pitropov, D. E. Garcia, J. Rebello, M. Smart, C. Wang, K. Czar- necki, and S. Waslander, “Canadian adverse driving conditions dataset,”The International Journal of Robotics Research, vol. 40, no. 4-5, pp. 681–690, 12 2020

work page 2020

[30] [30]

nuScenes: A Multimodal Dataset for Autonomous Driving,

H. Caesar, V . Bankiti, A. H. Lang, S. V ora, V . E. Liong, Q. Xu, A. Krishnan, Y . Pan, G. Baldan, and O. Beijbom, “nuScenes: A Multimodal Dataset for Autonomous Driving,” inConference on Computer Vision and Pattern Recognition (CVPR), 2020

work page 2020

[31] [31]

Scalability in Perception for Autonomous Driving: Waymo Open Dataset,

P. Sun, H. Kretzschmar, X. Dotiwalla, A. Chouard, V . Patnaik, P. Tsui, J. Guo, Y . Zhou, Y . Chai, B. Caine,et al., “Scalability in Perception for Autonomous Driving: Waymo Open Dataset,” inConference on Computer Vision and Pattern Recognition (CVPR), 2020

work page 2020

[32] [32]

Benchmarking the robustness of lidar-camera fusion for 3d object detection,

K. Yu, T. Tang, H. Xie, Z. Lin, Z. Wu, Z. Xia, T. Liang, H. Sun, J. Deng, D. Hao, Y . Wang, X. Liang, and B. Wang, “Benchmarking the robustness of lidar-camera fusion for 3d object detection,”Conference on Computer Vision and Pattern Recognition Workshop (CVPR’W), pp. 3188–3198, 2022

work page 2022

[33] [33]

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning,

Y . Gal and Z. Ghahramani, “Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning,” inInternational Conference on Machine Learning (ICML), 2017

work page 2017

[34] [34]

Sampling-free epistemic uncertainty estimation using approximated variance propagation,

J. Postels, F. Ferroni, H. Coskun, N. Navab, and F. Tombari, “Sampling-free epistemic uncertainty estimation using approximated variance propagation,” inConference on Computer Vision and Pattern Recognition (CVPR), October 2019

work page 2019

[35] [35]

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles,

B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles,” inNeural Information Processing Systems (NIPS), 2017

work page 2017

[36] [36]

Estimating the mean and variance of the target probability distribution,

D. Nix and A. Weigend, “Estimating the mean and variance of the target probability distribution,” inProceedings of 1994 IEEE Interna- tional Conference on Neural Networks (ICNN’94), vol. 1, 1994

work page 1994

[37] [37]

Practical confidence and prediction intervals,

T. Heskes, “Practical confidence and prediction intervals,” inNeural Information Processing Systems (NIPS), M.C. Mozer, M. Jordan, and T. Petsche, Eds., vol. 9. MIT Press, 1996

work page 1996

[38] [38]

Bayesod: A bayesian approach for uncertainty estimation in deep object detectors,

A. Harakeh, M. Smart, and S. L. Waslander, “Bayesod: A bayesian approach for uncertainty estimation in deep object detectors,” in International Conference on Robotics and Automation (ICRA), i. b. Institute of Electrical and Electronics Engineers, Ed. IEEE, 2020

work page 2020

[39] [39]

Uncertainty estimation for deep neural object detectors in safety-critical applications,

M. T. Le, F. Diehl, T. Brunner, and A. Knoll, “Uncertainty estimation for deep neural object detectors in safety-critical applications,” in IEEE International Conference on Intelligent Transportation Systems (ITSC). IEEE, 2018, pp. 3873–3878

work page 2018

[40] [40]

Gaussian yolov3: An accurate and fast object detector using localization uncertainty for autonomous driving,

J. Choi, D. Chun, H. Kim, and H.-J. Lee, “Gaussian yolov3: An accurate and fast object detector using localization uncertainty for autonomous driving,” inInternational Conference on Computer Vision (ICCV), 2019, pp. 502–511

work page 2019

[41] [41]

Bounding box regression with uncertainty for accurate object detection,

Y . He, C. Zhu, J. Wang, M. Savvides, and X. Zhang, “Bounding box regression with uncertainty for accurate object detection,” in Conference on Computer Vision and Pattern Recognition (CVPR), 2018

work page 2018

[42] [42]

Towards safe autonomous driving: Capture uncertainty in the deep neural network for lidar 3d vehicle detection,

D. Feng, L. Rosenbaum, and K. Dietmayer, “Towards safe autonomous driving: Capture uncertainty in the deep neural network for lidar 3d vehicle detection,” inIEEE International Conference on Intelligent Transportation Systems (ITSC), 2018, pp. 3266–3273

work page 2018

[43] [43]

Training independent subnet- works for robust prediction,

M. Havasi, R. Jenatton, S. Fort, J. Z. Liu, J. Snoek, B. Lakshmi- narayanan, A. M. Dai, and D. Tran, “Training independent subnet- works for robust prediction,” inInternational Conference on Learning Representations (ICLR), 2021

work page 2021

[44] [44]

Lidar-mimo: Efficient uncertainty estimation for lidar-based 3d object detection,

M. Pitropov, C. Huang, V . Abdelzad, K. Czarnecki, and S. Waslander, “Lidar-mimo: Efficient uncertainty estimation for lidar-based 3d object detection,” inIntelligent Vehicles Symposium (IV), 2022, pp. 813–820

work page 2022

[45] [45]

OCCUQ: Exploring Efficient Uncertainty Quantification for 3D Occupancy Prediction,

S. Heidrich, T. Beemelmanns, A. Nekrasov, B. Leibe, and L. Eck- stein, “OCCUQ: Exploring Efficient Uncertainty Quantification for 3D Occupancy Prediction,” inInternational Conference on Robotics and Automation (ICRA), 2025

work page 2025

[46] [46]

Query2uncertainty: Robust uncertainty quantification and calibration for 3d object detection under distribution shift,

T. Beemelmanns, A. Nekrasov, S. Vilceanu, J. Steinhaus, T. Woopen, B. Leibe, and L. Eckstein, “Query2uncertainty: Robust uncertainty quantification and calibration for 3d object detection under distribution shift,” inConference on Computer Vision and Pattern Recognition (CVPR), 2026

work page 2026

[47] [47]

Lasernet: An efficient probabilistic 3d object detector for autonomous driving,

G. P. Meyer, A. Laddha, E. Kee, C. Vallespi-Gonzalez, and C. K. Wellington, “Lasernet: An efficient probabilistic 3d object detector for autonomous driving,” inConference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019, pp. 12 669–12 678

work page 2019

[48] [48]

Leveraging heteroscedastic aleatoric uncertainties for robust real-time lidar 3d object detection,

D. Feng, L. Rosenbaum, F. Timm, and K. Dietmayer, “Leveraging heteroscedastic aleatoric uncertainties for robust real-time lidar 3d object detection,” inIntelligent Vehicles Symposium (IV). IEEE, 2019, pp. 1280–1287

work page 2019

[49] [49]

Uncertainty-aware voxel based 3d object detection and tracking with von-mises loss,

Y . Zhong, M. Zhu, and H. Peng, “Uncertainty-aware voxel based 3d object detection and tracking with von-mises loss,”arXiv preprint arXiv:2011.02553, 2020

work page arXiv 2011

[50] [50]

Robust collaborative 3d object detection in presence of pose errors,

Y . Lu, Q. Li, B. Liu, M. Dianati, C. Feng, S. Chen, and Y . Wang, “Robust collaborative 3d object detection in presence of pose errors,” inInternational Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 4812–4818

work page 2023

[51] [51]

Bevfusion: Multi-task multi-sensor fusion with unified bird’s-eye view representation,

Z. Liu, H. Tang, A. Amini, X. Yang, H. Mao, D. Rus, and S. Han, “Bevfusion: Multi-task multi-sensor fusion with unified bird’s-eye view representation,” inInternational Conference on Robotics and Automation (ICRA), 2023

work page 2023

[52] [52]

Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods,

J. Platt, “Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods,”Adv. Large Margin Classif., vol. 10, June 1999

work page 1999

[53] [53]

Accurate uncertainties for deep learning using calibrated regression,

V . Kuleshov, N. Fenner, and S. Ermon, “Accurate uncertainties for deep learning using calibrated regression,” inInternational conference on machine learning. Proceedings of Machine Learning Research (PMLR), 2018, pp. 2796–2804

work page 2018

[54] [54]

Deep- interaction: 3d object detection via modality interaction,

Z. Yang, J. Chen, Z. Miao, W. Li, X. Zhu, and L. Zhang, “Deep- interaction: 3d object detection via modality interaction,” inNeural Information Processing Systems (NIPS), 2022

work page 2022

[55] [55]

TransFusion: Robust Lidar-Camera Fusion for 3d Object Detection with Transformers,

X. Bai, Z. Hu, X. Zhu, Q. Huang, Y . Chen, H. Fu, and C.-L. Tai, “TransFusion: Robust Lidar-Camera Fusion for 3d Object Detection with Transformers,”Conference on Computer Vision and Pattern Recognition (CVPR), 2022

work page 2022

[56] [56]

Is- fusion: Instance-scene collaborative fusion for multimodal 3d object detection,

J. Yin, J. Shen, R. Chen, W. Li, R. Yang, P. Frossard, and W. Wang, “Is- fusion: Instance-scene collaborative fusion for multimodal 3d object detection,” inConference on Computer Vision and Pattern Recognition (CVPR), 2024

work page 2024

[57] [57]

karl. - A Research Vehicle for Automated and Connected Driving,

J.-P. Busch, L. Ostendorf, G. Linden, L. Reiher, T. Beemelmanns, B. Lampe, T. Woopen, and L. Eckstein, “karl. - A Research Vehicle for Automated and Connected Driving,” inIntelligent Vehicles Symposium (IV), 2026

work page 2026