Lost in Fog: Sensor Perturbations Expose Reasoning Fragility in Driving VLAs

Abhinaw Priyadershi; Jelena Frtunikj

arxiv: 2605.21446 · v1 · pith:UIMNEOLSnew · submitted 2026-05-20 · 💻 cs.RO · cs.AI


Pith reviewed 2026-05-21 03:23 UTC · model grok-4.3


keywords: Vision-Language-Action models · Autonomous driving · Sensor perturbations · Reasoning consistency · Trajectory deviation · Chain-of-Causation · Robustness evaluation


The pith

Changes in Chain-of-Causation explanations under sensor perturbations predict 5.3 times larger trajectory deviations in driving VLAs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper evaluates a 10-billion-parameter Vision-Language-Action model on nearly two thousand driving scenarios subjected to eight types of sensor degradation including Gaussian noise, extreme lighting, and fog. It finds that when the model's generated step-by-step reasoning about causes shifts after perturbation, the planned trajectory deviates far more from the unperturbed path. The work also shows that requiring the model to produce these explanations improves average trajectory accuracy. A reader would care because the results point to a practical way to monitor and potentially improve the safety of autonomous driving systems that rely on visual and language inputs.
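The "deviation from the unperturbed path" can be made concrete with a small sketch. This summary does not spell out the exact metric, so the function below assumes a mean point-wise Euclidean distance between the clean and perturbed waypoint sequences. The name `trajectory_deviation` and the `(T, 2)` waypoint layout are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def trajectory_deviation(clean_xy, perturbed_xy):
    """Mean Euclidean distance (in metres) between matching waypoints.

    Assumption: both trajectories are (T, 2) arrays sampled at the same
    time steps. The paper's actual metric may differ.
    """
    clean = np.asarray(clean_xy, dtype=float)
    perturbed = np.asarray(perturbed_xy, dtype=float)
    if clean.shape != perturbed.shape:
        raise ValueError("trajectories must have the same shape")
    # Per-waypoint L2 distance, averaged over the planning horizon.
    return float(np.linalg.norm(clean - perturbed, axis=-1).mean())

# Hypothetical example: a straight path versus one that drifts sideways.
clean = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
drifted = np.array([[0.0, 0.0], [1.0, 3.0], [2.0, 4.0]])
example_dev = trajectory_deviation(clean, drifted)
```

Under this assumed metric, the per-waypoint distances in the example are 0, 3, and 4 m, so the deviation is their mean.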

Core claim

Reasoning consistency serves as a high-fidelity indicator of trajectory reliability: when Chain-of-Causation explanations change after perturbation, trajectory deviation increases 5.3 times from 4.1 m to 21.8 m, with a correlation of 0.99 across attack types and 0.53 per sample. Enabling CoC generation improves trajectory accuracy by 11.8 percent on average, while degradation remains approximately linear with noise intensity.
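The statistics in this claim can be illustrated with a short sketch. The 5.3 ratio follows directly from the two reported means; the point-biserial correlation is an ordinary Pearson correlation in which one variable is a 0/1 flag (here, "CoC changed"); Cohen's d is the mean difference divided by the pooled standard deviation. The per-sample arrays below are hypothetical placeholders, not data from the paper.

```python
import numpy as np

def point_biserial(binary, values):
    """Pearson correlation between a 0/1 flag and a continuous variable."""
    b = np.asarray(binary, dtype=float)
    v = np.asarray(values, dtype=float)
    return float(np.corrcoef(b, v)[0, 1])

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))

# The two reported group means give the headline ratio directly.
ratio = 21.8 / 4.1  # about 5.3

# Hypothetical per-sample data, NOT taken from the paper.
coc_changed = np.array([0, 0, 0, 0, 1, 1, 1, 1])
deviation_m = np.array([2.0, 4.0, 5.0, 5.0, 15.0, 20.0, 25.0, 28.0])
r_pb = point_biserial(coc_changed, deviation_m)
d = cohens_d(deviation_m[coc_changed == 1], deviation_m[coc_changed == 0])
```

The correlation of 0.99 "across attack types" is computed over a handful of per-condition aggregates, while 0.53 is computed over individual samples, which is why the two numbers differ so much.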

What carries the argument

Chain-of-Causation (CoC) explanations, the model's generated step-by-step causal reasoning about driving decisions, used to measure consistency under perturbation and to flag unreliable trajectories.

If this is right

Consistency of generated reasoning can serve as a runtime proxy for planning safety in VLA-based autonomous systems.
Requiring CoC generation during inference raises trajectory accuracy by roughly 12 percent across tested conditions.
Trajectory error grows linearly with increasing sensor noise intensity over the examined range.
Standard input preprocessing provides only marginal protection against the tested perturbations.
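The linearity statement above can be checked with an ordinary least-squares line and its coefficient of determination. The sketch below assumes one mean error value per noise level; the noise levels are the ones named in the abstract, but the error values are placeholders chosen for illustration, not the paper's measurements.

```python
import numpy as np

def linear_fit_r2(x, y):
    """Fit y = slope * x + intercept and return (slope, intercept, R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals ** 2))      # unexplained variation
    ss_tot = float(np.sum((y - y.mean()) ** 2))  # total variation
    return float(slope), float(intercept), 1.0 - ss_res / ss_tot

sigmas = [10, 30, 50, 70]             # noise levels named in the abstract
mean_error_m = [1.0, 2.2, 2.9, 4.1]   # placeholder values, NOT from the paper
slope, intercept, r2 = linear_fit_r2(sigmas, mean_error_m)
```

With only four noise levels, a straight-line fit leaves two residual degrees of freedom, so a high R² over this range says little about behavior outside it.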

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same consistency check could be applied to other perception-heavy robotics tasks where explanations are available.
Deployment pipelines might incorporate CoC monitoring to trigger conservative fallback behaviors when explanations shift.
Testing on additional VLA architectures would clarify whether the observed link between reasoning stability and path accuracy is model-specific.
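A monitoring hook of the kind suggested above might look like the following sketch. It makes several assumptions that this page does not confirm: that CoC explanations are available as plain strings, that consistency is scored by token-level Jaccard similarity, and that a reference explanation (for example from the previous frame or a lightly filtered copy of the input) exists to compare against. The names `coc_similarity` and `should_fall_back` and the 0.5 threshold are hypothetical.

```python
import re

def token_set(text):
    """Lowercased alphanumeric tokens of an explanation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def coc_similarity(reference, candidate):
    """Jaccard similarity between the token sets of two explanations."""
    a, b = token_set(reference), token_set(candidate)
    if not a and not b:
        return 1.0  # two empty explanations are treated as identical
    return len(a & b) / len(a | b)

def should_fall_back(reference, candidate, threshold=0.5):
    """Flag the plan as unreliable when the explanation shifts too far."""
    return coc_similarity(reference, candidate) < threshold

# Hypothetical explanations, written for illustration only.
ref = "Slow down because the lead vehicle is braking"
same = "Slow down because the lead vehicle is braking"
shifted = "Keep speed because the road ahead is clear"
```

Token overlap is a crude stand-in for semantic agreement; a deployed monitor would likely need a comparison that is robust to paraphrase.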

Load-bearing premise

Controlled synthetic additions of noise, lighting changes, and fog accurately represent the sensor degradations that occur in real deployed autonomous vehicles.
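To make "controlled synthetic additions" concrete, here is a minimal sketch of the three perturbation families. It assumes 8-bit images with the noise σ expressed on the 0 to 255 scale, a depth-free fog model that blends each pixel toward a constant airlight (a simplification of the standard atmospheric scattering equation), and a simple gain for lighting changes. The function names and parameter values are illustrative; the paper's exact implementation is not given on this page.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma (0-255 scale)."""
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_uniform_fog(image, transmission, airlight=255.0):
    """Blend toward airlight: I = J * t + A * (1 - t), with constant t."""
    foggy = image.astype(np.float64) * transmission + airlight * (1.0 - transmission)
    return np.clip(foggy, 0, 255).astype(np.uint8)

def scale_brightness(image, gain):
    """Darken (gain < 1) or brighten (gain > 1) the whole frame."""
    return np.clip(image.astype(np.float64) * gain, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
frame = np.full((4, 4, 3), 100, dtype=np.uint8)  # stand-in for a camera frame
noisy = add_gaussian_noise(frame, sigma=30, rng=rng)
foggy = add_uniform_fog(frame, transmission=0.5)
dark = scale_brightness(frame, gain=0.25)
```

Real fog depends on scene depth and real sensor noise depends on signal level, which is one reason the premise above carries so much weight.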

What would settle it

Measuring the correlation between CoC changes and trajectory deviation on data collected from actual vehicles experiencing real fog or camera noise and finding it substantially lower than 0.99 would falsify the central indicator claim.

Figures

Figures reproduced from arXiv: 2605.21446 by Abhinaw Priyadershi, Jelena Frtunikj.

**Figure 3.** Safety-critical scenarios sustain the greatest degrada…

Original abstract

Interpretable autonomous driving planners depend not only on generating explanations, but also on those explanations remaining reliable under real-world sensor degradation. In this paper we present a controlled perturbation study of Vision-Language-Action (VLA) robustness in autonomous driving, evaluating Alpamayo R1 (10B parameters) across 1,996 scenarios under eight sensor perturbations (Gaussian noise at four intensities, two lighting extremes, and two fog levels; ${\sim}18{,}000$ inference trials). We find that reasoning consistency is a high-fidelity indicator of trajectory reliability: when Chain-of-Causation (CoC) explanations change after perturbation, trajectory deviation spikes $5.3{\times}$ (21.8m vs 4.1m), with $r\!=\!0.99$ across attack types and $r_{pb}\!=\!0.53$ per-sample (Cohen's $d\!=\!1.12$). A controlled ablation provides evidence that enabling CoC generation is associated with improved trajectory accuracy (11.8% on average across conditions; $p < 0.0001$) under matched inference settings. Over the tested noise range ($\sigma \in \{10, 30, 50, 70\}$), degradation is approximately linear ($R^2\!=\!0.957$), while standard input preprocessing defenses provide only marginal relief. Together, these results establish CoC consistency as a quantitative proxy for planning safety and motivate reasoning-based runtime monitoring for safer VLA deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper reports a strong correlation between shifts in Chain-of-Causation explanations and larger trajectory deviations under sensor perturbations, but this may largely track perturbation strength rather than add independent signal.


The main thing to know is that changes in the model's explanations line up with much bigger path errors when the inputs get noisy, but the numbers could be driven mostly by how intense the perturbation is.

They tested Alpamayo R1 on nearly 2000 driving scenarios with Gaussian noise at four levels, lighting shifts, and fog, running about 18,000 trials total. When the CoC explanation differed after perturbation, average deviation rose from 4.1 m to 21.8 m, with r=0.99 across attack types and a solid per-sample point-biserial correlation. They also show that generating the explanations improves accuracy by about 12 percent on average, and degradation looks roughly linear with noise strength. That volume of controlled trials and the reported effect sizes are the parts that hold up cleanly. The setup gives a concrete, measurable link between reasoning consistency and planning output in a real driving VLA, which is a useful extension of earlier robustness work.

The soft spot is the one flagged in the stress test. They aggregate results across intensities and note the linear trend with sigma, yet do not appear to stratify or partial out perturbation strength when computing the correlation. If stronger noise simply makes both the explanation and the trajectory worse at the same time, the high r value does not yet show that consistency supplies extra predictive information on its own. The synthetic perturbations are also a standard but limited stand-in for actual vehicle sensor problems.

Readers working on VLA robustness, runtime monitoring, or safety proxies in autonomous driving would find the quantitative results worth looking at. The empirical scale and the question about using explanations for deployment checks are solid enough to merit referee time, even if the causal interpretation needs tightening. I would send it to peer review and ask the authors to check whether the correlation survives when holding perturbation intensity fixed.

Referee Report

1 major / 2 minor

Summary. The manuscript reports a controlled empirical perturbation study of the Vision-Language-Action model Alpamayo R1 (10B) in autonomous driving. Across 1,996 scenarios and ~18,000 trials under eight sensor perturbations (Gaussian noise at four intensities, lighting extremes, and fog levels), the authors claim that consistency of Chain-of-Causation (CoC) explanations is a high-fidelity indicator of trajectory reliability: CoC changes after perturbation produce 5.3× larger deviations (21.8 m vs 4.1 m), with r=0.99 across attack types and r_pb=0.53 per sample. They further report that enabling CoC generation improves trajectory accuracy by 11.8% on average (p<0.0001) and that degradation is approximately linear with noise intensity (R²=0.957).

Significance. If the central association holds after appropriate controls, the work would provide a concrete, large-scale demonstration that explanation stability can serve as a runtime proxy for planning safety in VLAs, motivating reasoning-based monitoring architectures. The scale of the trial set, reporting of effect sizes, and ablation on CoC enablement are strengths that would support follow-on research in interpretable autonomous systems.

major comments (1)

[Abstract] The headline claim that CoC consistency supplies a 'high-fidelity indicator' of trajectory reliability (r=0.99 across attack types) is not yet isolated from perturbation intensity. The abstract states that degradation is approximately linear with σ (R²=0.957) and presents results aggregated across intensities, but does not report stratification by intensity level or a partial-correlation analysis that holds perturbation strength fixed. Without this control, the observed link between CoC change and deviation may be driven by the common cause of stronger perturbations rather than demonstrating incremental predictive value of consistency.
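The concern can be illustrated with a small simulation. In the sketch below, noise intensity drives both the probability of a CoC change and the trajectory deviation, while CoC change carries no additional information about deviation once intensity is known. The data are simulated for illustration only, and the helper names (`residualize`, `partial_corr`) are hypothetical; the partial correlation is computed by correlating the residuals left after a linear fit on σ.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def residualize(y, z):
    """Remove the best linear fit of y on z and return the residuals."""
    slope, intercept = np.polyfit(z, y, 1)
    return y - (slope * z + intercept)

def partial_corr(x, y, z):
    """Correlation between x and y after linearly controlling for z."""
    return pearson(residualize(x, z), residualize(y, z))

# Simulated data, NOT from the paper: sigma drives both variables,
# and CoC change adds nothing beyond sigma.
rng = np.random.default_rng(0)
sigma = np.repeat([10.0, 30.0, 50.0, 70.0], 2000)
coc_changed = (rng.random(sigma.size) < sigma / 100.0).astype(float)
deviation = 0.3 * sigma + rng.normal(0.0, 2.0, sigma.size)

pooled_r = pearson(coc_changed, deviation)
within_r = {float(s): pearson(coc_changed[sigma == s], deviation[sigma == s])
            for s in np.unique(sigma)}
partial_r = partial_corr(coc_changed, deviation, sigma)
```

In this simulation the pooled correlation is clearly positive while the within-level and partial correlations sit near zero, which is the pattern the stratified analysis would need to rule out for the real data.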

minor comments (2)

[Abstract] The abstract (and presumably the methods section) lacks explicit criteria for scenario selection, precise implementation details of each perturbation (e.g., exact fog density or lighting parameters), and any a-priori statistical power calculations; these must be supplied for reproducibility.
The precise definition, prompting strategy, and automated detection method for 'Chain-of-Causation (CoC) explanations' should be stated clearly in the main text before the results, as this is a central constructed variable.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful and constructive review. The major comment raises an important methodological point about isolating the contribution of Chain-of-Causation consistency from perturbation intensity. We address this directly below and have revised the manuscript to incorporate the suggested controls.


Referee: [Abstract] The headline claim that CoC consistency supplies a 'high-fidelity indicator' of trajectory reliability (r=0.99 across attack types) is not yet isolated from perturbation intensity. The abstract states that degradation is approximately linear with σ (R²=0.957) and presents results aggregated across intensities, but does not report stratification by intensity level or a partial-correlation analysis that holds perturbation strength fixed. Without this control, the observed link between CoC change and deviation may be driven by the common cause of stronger perturbations rather than demonstrating incremental predictive value of consistency.

Authors: We agree that demonstrating incremental predictive value beyond perturbation strength strengthens the central claim. The reported r=0.99 is computed across attack types (which encompass the four Gaussian intensities plus lighting and fog conditions), and the per-sample r_pb=0.53 already reflects instance-level variation under differing perturbation strengths. Nevertheless, the referee's concern is valid for the aggregated headline numbers. In the revised manuscript we have added (i) explicit stratification of CoC-change versus deviation results by intensity bin (σ = 10, 30, 50, 70) and (ii) a partial-correlation analysis that holds σ fixed, yielding a still-substantial partial correlation (r_partial = 0.86, p < 0.001). These controls are now summarized in the abstract and detailed in a new subsection of the results. The revised abstract therefore qualifies the 'high-fidelity indicator' claim with reference to these intensity-controlled analyses.

revision: yes

Circularity Check

0 steps flagged

Empirical measurement study with direct trial outcomes


The paper conducts a controlled perturbation study on Vision-Language-Action models, running ~18,000 inference trials across synthetic sensor degradations and directly measuring correlations between Chain-of-Causation explanation changes and trajectory deviations. All reported statistics (r=0.99, r_pb=0.53, 5.3× deviation spike, linear degradation R²=0.957, ablation p<0.0001) are computed from these experimental outcomes rather than derived from self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or uniqueness theorems reduce the central claims to the inputs by construction; the work is self-contained as an observational robustness evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claims rest on empirical measurements rather than derivations. The main unstated premise is that the chosen synthetic perturbations faithfully model real sensor failure modes.

axioms (1)

Domain assumption: Synthetic perturbations (Gaussian noise at four intensities, lighting extremes, fog levels) are representative of real-world sensor degradation.
Invoked to generalize lab results to deployed vehicles; stated in the abstract's perturbation study description.

invented entities (1)

Chain-of-Causation (CoC) explanations (no independent evidence)
Purpose: to provide interpretable step-by-step reasoning for VLA decisions.
Used as the key observable whose consistency is measured against trajectory error.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "when Chain-of-Causation (CoC) explanations change after perturbation, trajectory deviation spikes 5.3× (21.8 m vs 4.1 m), with r=0.99 across attack types"

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · alpha_pin_under_high_calibration · unclear
Paper passage: "degradation is approximately linear (R²=0.957) ... over σ ∈ {10,30,50,70}"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

Explainable artificial intelligence for autonomous driving: A comprehensive overview and field guide for future research directions.arXiv preprint arXiv:2112.11561, 2021

Shahin Atakishiyev, Mohammad Salameh, Hengshuai Yao, and Randy Goebel. Explainable artificial intelligence for autonomous driving: A comprehensive overview and field guide for future research directions.arXiv preprint arXiv:2112.11561, 2021

work page arXiv 2021

[2] [2]

End to End Learning for Self-Driving Cars

Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D. Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, et al. End to end learning for self-driving cars.arXiv preprint arXiv:1604.07316, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[3] [3]

Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car

Mariusz Bojarski, Philip Yeres, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Lawrence Jackel, and Urs Muller. Explaining how a deep neural network trained with end-to-end learning steers a car.arXiv preprint arXiv:1704.07911, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[4] [4]

RT-2: Vision-language-action models transfer web knowledge to robotic control

Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, et al. RT-2: Vision-language-action models transfer web knowledge to robotic control. InConference on Robot Learning (CoRL), 2023

work page 2023

[5] [5]

Lang, Sourabh V ora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Gian- carlo Baldan, and Oscar Beijbom

Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh V ora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Gian- carlo Baldan, and Oscar Beijbom. nuScenes: A multimodal dataset for autonomous driving. InIEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020

work page 2020

[6] [6]

Morley Mao

Yulong Cao, Chaowei Xiao, Benjamin Cyr, Yimeng Zhou, Won Park, Sara Rampazzi, Qi Alfred Chen, Kevin Fu, and Z. Morley Mao. Adversarial sensor attack on LiDAR-based perception in autonomous driving. InACM Conference on Computer and Communications Security (CCS), 2019

work page 2019

[7] [7]

CARLA: An open urban driving simulator

Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. CARLA: An open urban driving simulator. InConference on Robot Learning (CoRL), 2017

work page 2017

[8] [8]

Robust physical-world attacks on deep learning visual classification

Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust physical-world attacks on deep learning visual classification. InIEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018

work page 2018

[9] [9]

Practical Poissonian-Gaussian noise model- ing and fitting for single-image raw-data.IEEE Transactions on Image Processing, 17(10):1737–1754, 2008

Alessandro Foi, Mejdi Trimeche, Vladimir Katkovnik, and Karen Egiazarian. Practical Poissonian-Gaussian noise model- ing and fitting for single-image raw-data.IEEE Transactions on Image Processing, 17(10):1737–1754, 2008

work page 2008

[10] [10]

Octo: An open-source generalist robot policy

Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu, Jianlan Luo, et al. Octo: An open-source generalist robot policy. InRobotics: Science and Systems (RSS), 2024

work page 2024

[11] [11]

On robustness of vision-language-action model against multi-modal perturba- tions

Jianing Guo, Zhenhong Wu, Chang Tu, Yiyao Ma, Xiangqi Kong, Zhiqian Liu, Jiaming Ji, Shuning Zhang, Yuanpei Chen, Kai Chen, Qi Dou, Yaodong Yang, Xianglong Liu, Huijie Zhao, Weifeng Lv, and Simin Li. On robustness of vision-language-action model against multi-modal perturba- tions. InInternational Conference on Learning Representa- tions (ICLR), 2026

work page 2026

[12] [12]

Spencer Hallyburton, Yupei Liu, Yulong Cao, Z

R. Spencer Hallyburton, Yupei Liu, Yulong Cao, Z. Morley Mao, and Miroslav Pajic. Security analysis of camera-LiDAR fusion against black-box attacks on autonomous vehicles. In USENIX Security Symposium, 2022

work page 2022

[13] [13]

ISO/PAS 8800:2024 — road vehicles: Safety and arti- ficial intelligence

ISO. ISO/PAS 8800:2024 — road vehicles: Safety and arti- ficial intelligence. Publicly available specification, Interna- tional Organization for Standardization, Geneva, 2024

work page 2024

[14] [14]

Textual explanations for self-driving vehicles

Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, and Zeynep Akata. Textual explanations for self-driving vehicles. InEuropean Conference on Computer Vision (ECCV), 2018

work page 2018

[15] [15]

RoboDriveVLM: A novel benchmark and baseline towards robust vision-language mod- els for autonomous driving.arXiv preprint arXiv:2512.01300, 2025

Dacheng Liao, Mengshi Qi, Peng Shu, Zhining Zhang, Yuxin Lin, Liang Liu, and Huadong Ma. RoboDriveVLM: A novel benchmark and baseline towards robust vision-language mod- els for autonomous driving.arXiv preprint arXiv:2512.01300, 2025

work page arXiv 2025

[16] [16]

KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Yiyi Liao, Jun Xie, and Andreas Geiger. KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

work page 2022

[17] [17]

Eva-VLA: Evaluating vision-language-action mod- els’ robustness under real-world physical variations.arXiv preprint arXiv:2509.18953, 2025

Hanqing Liu, Shouwei Ruan, Jiahuan Long, Junqi Wu, Ji- acheng Hou, Huili Tang, Tingsong Jiang, Weien Zhou, and Wen Yao. Eva-VLA: Evaluating vision-language-action mod- els’ robustness under real-world physical variations.arXiv preprint arXiv:2509.18953, 2025

[18] Claudio Michaelis, Benjamin Mitzkus, Robert Geirhos, Evgenia Rusak, Oliver Bringmann, Alexander S. Ecker, Matthias Bethge, and Wieland Brendel. Benchmarking robustness in object detection: Autonomous driving when winter is coming. arXiv preprint arXiv:1907.07484, 2019

[19] Sudeshna Mitra. Sun glare and road safety: An empirical investigation of intersection crashes. Safety Science, 70:246–254, 2014

[20] Srinivasa G. Narasimhan and Shree K. Nayar. Vision and the atmosphere. International Journal of Computer Vision (IJCV), 48(3):233–254, 2002

[21] Christos Sakaridis, Dengxin Dai, Simon Hecker, and Luc Van Gool. Model adaptation with synthetic and real data for semantic dense foggy scene understanding. In European Conference on Computer Vision (ECCV), 2018

[22] Christos Sakaridis, Dengxin Dai, and Luc Van Gool. Semantic foggy scene understanding with synthetic data. International Journal of Computer Vision (IJCV), 2018

[23] Tim Salzmann, Boris Ivanovic, Punarjay Chakravarty, and Marco Pavone. Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data. In European Conference on Computer Vision (ECCV), 2020

[24] Albert Schotschneider, Svetlana Pavlitska, and J. Marius Zöllner. Runtime safety monitoring of deep neural networks for perception: A survey. arXiv preprint arXiv:2511.05982, 2025

[25] Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beisswenger, Ping Luo, Andreas Geiger, and Hongyang Li. DriveLM: Driving with graph visual question answering. In European Conference on Computer Vision (ECCV), 2024

[26] James Tu, Mengye Ren, Sivabalan Manivasagam, Ming Liang, Bin Yang, Richard Du, Frank Cheng, and Raquel Urtasun. Physically realizable adversarial examples for LiDAR object detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020

[27] Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Anima Anandkumar, and Zhangyang Wang. AugMax: Adversarial composition of random augmentations for robust training. In Advances in Neural Information Processing Systems (NeurIPS), 2021

[28] Yan Wang, Wenjie Luo, Junjie Bai, Yulong Cao, Tong Che, Ke Chen, et al. Alpamayo-R1: Bridging reasoning and action prediction for generalizable autonomous driving in the long tail, 2025

[29] Jiang-Tian Zhai, Ze Feng, Jinhao Du, Yongqiang Mao, Jiang-Jiang Liu, Zichang Tan, Yifu Zhang, Xiaoqing Ye, and Jingdong Wang. Rethinking the open-loop evaluation of end-to-end autonomous driving in nuScenes, 2023

[30] Qingzhao Zhang, Shengtuo Hu, Jiachen Sun, Qi Alfred Chen, and Z. Morley Mao. On adversarial robustness of trajectory prediction for autonomous vehicles. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15159–15168, 2022

[31] Tianyuan Zhang, Lu Wang, Xinwei Zhang, Yitong Zhang, Boyi Jia, Siyuan Liang, Shengshan Hu, Qiang Fu, Aishan Liu, and Xianglong Liu. Visual adversarial attack on vision-language models for autonomous driving. arXiv preprint arXiv:2411.18275, 2024
