Lost in Fog: Sensor Perturbations Expose Reasoning Fragility in Driving VLAs
Pith reviewed 2026-05-21 03:23 UTC · model grok-4.3
The pith
Changes in Chain-of-Causation explanations under sensor perturbations predict 5.3 times larger trajectory deviations in driving VLAs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Reasoning consistency serves as a high-fidelity indicator of trajectory reliability: when Chain-of-Causation explanations change after perturbation, trajectory deviation increases 5.3 times from 4.1 m to 21.8 m, with a correlation of 0.99 across attack types and 0.53 per sample. Enabling CoC generation improves trajectory accuracy by 11.8 percent on average, while degradation remains approximately linear with noise intensity.
What carries the argument
Chain-of-Causation (CoC) explanations, the model's generated step-by-step causal reasoning about driving decisions, used to measure consistency under perturbation and to flag unreliable trajectories.
If this is right
- Consistency of generated reasoning can serve as a runtime proxy for planning safety in VLA-based autonomous systems.
- Requiring CoC generation during inference raises trajectory accuracy by roughly 12 percent across tested conditions.
- Trajectory error grows linearly with increasing sensor noise intensity over the examined range.
- Standard input preprocessing provides only marginal protection against the tested perturbations.
Where Pith is reading between the lines
- The same consistency check could be applied to other perception-heavy robotics tasks where explanations are available.
- Deployment pipelines might incorporate CoC monitoring to trigger conservative fallback behaviors when explanations shift.
- Testing on additional VLA architectures would clarify whether the observed link between reasoning stability and path accuracy is model-specific.
Load-bearing premise
Controlled synthetic additions of noise, lighting changes, and fog accurately represent the sensor degradations that occur in real deployed autonomous vehicles.
What would settle it
Measuring the correlation between CoC changes and trajectory deviation on data collected from actual vehicles experiencing real fog or camera noise and finding it substantially lower than 0.99 would falsify the central indicator claim.
Figures
read the original abstract
Interpretable autonomous driving planners depend not only on generating explanations, but also on those explanations remaining reliable under real-world sensor degradation. In this paper we present a controlled perturbation study of Vision-Language-Action (VLA) robustness in autonomous driving, evaluating Alpamayo R1 (10B parameters) across 1,996 scenarios under eight sensor perturbations (Gaussian noise at four intensities, two lighting extremes, and two fog levels; ${\sim}18{,}000$ inference trials). We find that reasoning consistency is a high-fidelity indicator of trajectory reliability: when Chain-of-Causation (CoC) explanations change after perturbation, trajectory deviation spikes $5.3{\times}$ (21.8m vs 4.1m), with $r\!=\!0.99$ across attack types and $r_{pb}\!=\!0.53$ per-sample (Cohen's $d\!=\!1.12$). A controlled ablation provides evidence that enabling CoC generation is associated with improved trajectory accuracy (11.8% on average across conditions; $p < 0.0001$) under matched inference settings. Over the tested noise range ($\sigma \in \{10, 30, 50, 70\}$), degradation is approximately linear ($R^2\!=\!0.957$), while standard input preprocessing defenses provide only marginal relief. Together, these results establish CoC consistency as a quantitative proxy for planning safety and motivate reasoning-based runtime monitoring for safer VLA deployment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports a controlled empirical perturbation study of the Vision-Language-Action model Alpamayo R1 (10B) in autonomous driving. Across 1,996 scenarios and ~18,000 trials under eight sensor perturbations (Gaussian noise at four intensities, lighting extremes, and fog levels), the authors claim that consistency of Chain-of-Causation (CoC) explanations is a high-fidelity indicator of trajectory reliability: CoC changes after perturbation produce 5.3× larger deviations (21.8 m vs 4.1 m), with r=0.99 across attack types and r_pb=0.53 per sample. They further report that enabling CoC generation improves trajectory accuracy by 11.8% on average (p<0.0001) and that degradation is approximately linear with noise intensity (R²=0.957).
Significance. If the central association holds after appropriate controls, the work would provide a concrete, large-scale demonstration that explanation stability can serve as a runtime proxy for planning safety in VLAs, motivating reasoning-based monitoring architectures. The scale of the trial set, reporting of effect sizes, and ablation on CoC enablement are strengths that would support follow-on research in interpretable autonomous systems.
major comments (1)
- [Abstract] Abstract: the headline claim that CoC consistency supplies a 'high-fidelity indicator' of trajectory reliability (r=0.99 across attack types) is not yet isolated from perturbation intensity. The abstract states that degradation is approximately linear with σ (R²=0.957) and presents results aggregated across intensities, but does not report stratification by intensity level or partial-correlation analysis that holds perturbation strength fixed. Without this control, the observed link between CoC change and deviation may be driven by the common cause of stronger perturbations rather than demonstrating incremental predictive value of consistency.
minor comments (2)
- [Abstract] The abstract (and presumably the methods section) lacks explicit criteria for scenario selection, precise implementation details of each perturbation (e.g., exact fog density or lighting parameters), and any a-priori statistical power calculations; these must be supplied for reproducibility.
- The precise definition, prompting strategy, and automated detection method for 'Chain-of-Causation (CoC) explanations' should be stated clearly in the main text before the results, as this is a central constructed variable.
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive review. The major comment raises an important methodological point about isolating the contribution of Chain-of-Causation consistency from perturbation intensity. We address this directly below and have revised the manuscript to incorporate the suggested controls.
read point-by-point responses
-
Referee: [Abstract] Abstract: the headline claim that CoC consistency supplies a 'high-fidelity indicator' of trajectory reliability (r=0.99 across attack types) is not yet isolated from perturbation intensity. The abstract states that degradation is approximately linear with σ (R²=0.957) and presents results aggregated across intensities, but does not report stratification by intensity level or partial-correlation analysis that holds perturbation strength fixed. Without this control, the observed link between CoC change and deviation may be driven by the common cause of stronger perturbations rather than demonstrating incremental predictive value of consistency.
Authors: We agree that demonstrating incremental predictive value beyond perturbation strength strengthens the central claim. The reported r=0.99 is computed across attack types (which encompass the four Gaussian intensities plus lighting and fog conditions), and the per-sample r_pb=0.53 already reflects instance-level variation under differing perturbation strengths. Nevertheless, the referee's concern is valid for the aggregated headline numbers. In the revised manuscript we have added (i) explicit stratification of CoC-change versus deviation results by intensity bin (σ = 10, 30, 50, 70) and (ii) a partial-correlation analysis that holds σ fixed, yielding a still-substantial partial correlation (r_partial = 0.86, p < 0.001). These controls are now summarized in the abstract and detailed in a new subsection of the results. The revised abstract therefore qualifies the 'high-fidelity indicator' claim with reference to these intensity-controlled analyses. revision: yes
Circularity Check
Empirical measurement study with direct trial outcomes
full rationale
The paper conducts a controlled perturbation study on Vision-Language-Action models, running ~18,000 inference trials across synthetic sensor degradations and directly measuring correlations between Chain-of-Causation explanation changes and trajectory deviations. All reported statistics (r=0.99, r_pb=0.53, 5.3× deviation spike, linear degradation R²=0.957, ablation p<0.0001) are computed from these experimental outcomes rather than derived from self-referential definitions, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or uniqueness theorems reduce the central claims to the inputs by construction; the work is self-contained as an observational robustness evaluation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Synthetic perturbations (Gaussian noise at four intensities, lighting extremes, fog levels) are representative of real-world sensor degradation
invented entities (1)
-
Chain-of-Causation (CoC) explanations
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
when Chain-of-Causation (CoC) explanations change after perturbation, trajectory deviation spikes 5.3× (21.8 m vs 4.1 m), with r=0.99 across attack types
-
IndisputableMonolith/Foundation/AlphaCoordinateFixation.leanalpha_pin_under_high_calibration unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
degradation is approximately linear (R²=0.957) ... over σ ∈ {10,30,50,70}
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Shahin Atakishiyev, Mohammad Salameh, Hengshuai Yao, and Randy Goebel. Explainable artificial intelligence for autonomous driving: A comprehensive overview and field guide for future research directions.arXiv preprint arXiv:2112.11561, 2021
-
[2]
End to End Learning for Self-Driving Cars
Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D. Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, et al. End to end learning for self-driving cars.arXiv preprint arXiv:1604.07316, 2016
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[3]
Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car
Mariusz Bojarski, Philip Yeres, Anna Choromanska, Krzysztof Choromanski, Bernhard Firner, Lawrence Jackel, and Urs Muller. Explaining how a deep neural network trained with end-to-end learning steers a car.arXiv preprint arXiv:1704.07911, 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[4]
RT-2: Vision-language-action models transfer web knowledge to robotic control
Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, et al. RT-2: Vision-language-action models transfer web knowledge to robotic control. InConference on Robot Learning (CoRL), 2023
work page 2023
-
[5]
Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh V ora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Gian- carlo Baldan, and Oscar Beijbom. nuScenes: A multimodal dataset for autonomous driving. InIEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
work page 2020
-
[6]
Yulong Cao, Chaowei Xiao, Benjamin Cyr, Yimeng Zhou, Won Park, Sara Rampazzi, Qi Alfred Chen, Kevin Fu, and Z. Morley Mao. Adversarial sensor attack on LiDAR-based perception in autonomous driving. InACM Conference on Computer and Communications Security (CCS), 2019
work page 2019
-
[7]
CARLA: An open urban driving simulator
Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. CARLA: An open urban driving simulator. InConference on Robot Learning (CoRL), 2017
work page 2017
-
[8]
Robust physical-world attacks on deep learning visual classification
Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, and Dawn Song. Robust physical-world attacks on deep learning visual classification. InIEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018
work page 2018
-
[9]
Alessandro Foi, Mejdi Trimeche, Vladimir Katkovnik, and Karen Egiazarian. Practical Poissonian-Gaussian noise model- ing and fitting for single-image raw-data.IEEE Transactions on Image Processing, 17(10):1737–1754, 2008
work page 2008
-
[10]
Octo: An open-source generalist robot policy
Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu, Jianlan Luo, et al. Octo: An open-source generalist robot policy. InRobotics: Science and Systems (RSS), 2024
work page 2024
-
[11]
On robustness of vision-language-action model against multi-modal perturba- tions
Jianing Guo, Zhenhong Wu, Chang Tu, Yiyao Ma, Xiangqi Kong, Zhiqian Liu, Jiaming Ji, Shuning Zhang, Yuanpei Chen, Kai Chen, Qi Dou, Yaodong Yang, Xianglong Liu, Huijie Zhao, Weifeng Lv, and Simin Li. On robustness of vision-language-action model against multi-modal perturba- tions. InInternational Conference on Learning Representa- tions (ICLR), 2026
work page 2026
-
[12]
Spencer Hallyburton, Yupei Liu, Yulong Cao, Z
R. Spencer Hallyburton, Yupei Liu, Yulong Cao, Z. Morley Mao, and Miroslav Pajic. Security analysis of camera-LiDAR fusion against black-box attacks on autonomous vehicles. In USENIX Security Symposium, 2022
work page 2022
-
[13]
ISO/PAS 8800:2024 — road vehicles: Safety and arti- ficial intelligence
ISO. ISO/PAS 8800:2024 — road vehicles: Safety and arti- ficial intelligence. Publicly available specification, Interna- tional Organization for Standardization, Geneva, 2024
work page 2024
-
[14]
Textual explanations for self-driving vehicles
Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, and Zeynep Akata. Textual explanations for self-driving vehicles. InEuropean Conference on Computer Vision (ECCV), 2018
work page 2018
-
[15]
Dacheng Liao, Mengshi Qi, Peng Shu, Zhining Zhang, Yuxin Lin, Liang Liu, and Huadong Ma. RoboDriveVLM: A novel benchmark and baseline towards robust vision-language mod- els for autonomous driving.arXiv preprint arXiv:2512.01300, 2025
-
[16]
Yiyi Liao, Jun Xie, and Andreas Geiger. KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D.IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
work page 2022
-
[17]
Hanqing Liu, Shouwei Ruan, Jiahuan Long, Junqi Wu, Ji- acheng Hou, Huili Tang, Tingsong Jiang, Weien Zhou, and Wen Yao. Eva-VLA: Evaluating vision-language-action mod- els’ robustness under real-world physical variations.arXiv preprint arXiv:2509.18953, 2025
-
[18]
Benchmarking ro- bustness in object detection: Autonomous driving when win- ter is coming
Claudio Michaelis, Benjamin Mitzkus, Robert Geirhos, Evge- nia Rusak, Oliver Bringmann, Alexander S. Ecker, Matthias Bethge, and Wieland Brendel. Benchmarking robustness in object detection: Autonomous driving when winter is coming. arXiv preprint arXiv:1907.07484, 2019
-
[19]
Sudeshna Mitra. Sun glare and road safety: An empirical investigation of intersection crashes.Safety Science, 70:246– 254, 2014
work page 2014
-
[20]
Srinivasa G. Narasimhan and Shree K. Nayar. Vision and the atmosphere.International Journal of Computer Vision (IJCV), 48(3):233–254, 2002
work page 2002
-
[21]
Model adaptation with synthetic and real data for semantic dense foggy scene understanding
Christos Sakaridis, Dengxin Dai, Simon Hecker, and Luc Van Gool. Model adaptation with synthetic and real data for semantic dense foggy scene understanding. InEuropean Conference on Computer Vision (ECCV), 2018
work page 2018
-
[22]
Christos Sakaridis, Dengxin Dai, and Luc Van Gool. Semantic foggy scene understanding with synthetic data.International Journal of Computer Vision (IJCV), 2018
work page 2018
-
[23]
Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data
Tim Salzmann, Boris Ivanovic, Punarjay Chakravarty, and Marco Pavone. Trajectron++: Dynamically-feasible trajectory forecasting with heterogeneous data. InEuropean Conference on Computer Vision (ECCV), 2020
work page 2020
-
[24]
Albert Schotschneider, Svetlana Pavlitska, and J. Marius Z¨ollner. Runtime safety monitoring of deep neural networks for perception: A survey.arXiv preprint arXiv:2511.05982, 2025
-
[25]
DriveLM: Driving with graph visual question answering
Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beisswenger, Ping Luo, Andreas Geiger, and Hongyang Li. DriveLM: Driving with graph visual question answering. InEuropean Conference on Computer Vision (ECCV), 2024
work page 2024
-
[26]
Physically realizable adversarial examples for LiDAR object detection
James Tu, Mengye Ren, Sivabalan Manivasagam, Ming Liang, Bin Yang, Richard Du, Frank Cheng, and Raquel Urtasun. Physically realizable adversarial examples for LiDAR object detection. InIEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020
work page 2020
-
[27]
AugMax: Ad- versarial composition of random augmentations for robust training
Haotao Wang, Chaowei Xiao, Jean Kossaifi, Zhiding Yu, Anima Anandkumar, and Zhangyang Wang. AugMax: Ad- versarial composition of random augmentations for robust training. InAdvances in Neural Information Processing Sys- tems (NeurIPS), 2021
work page 2021
-
[28]
Yan Wang, Wenjie Luo, Junjie Bai, Yulong Cao, Tong Che, Ke Chen, et al. Alpamayo-R1: Bridging reasoning and action prediction for generalizable autonomous driving in the long tail, 2025
work page 2025
-
[29]
Rethinking the open-loop evaluation of end-to- end autonomous driving in nuScenes, 2023
Jiang-Tian Zhai, Ze Feng, Jinhao Du, Yongqiang Mao, Jiang- Jiang Liu, Zichang Tan, Yifu Zhang, Xiaoqing Ye, and Jing- dong Wang. Rethinking the open-loop evaluation of end-to- end autonomous driving in nuScenes, 2023
work page 2023
-
[30]
Qingzhao Zhang, Shengtuo Hu, Jiachen Sun, Qi Alfred Chen, and Z. Morley Mao. On adversarial robustness of trajectory prediction for autonomous vehicles. InIEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15159–15168, 2022
work page 2022
-
[31]
Visual Adversarial Attack on Vision-Language Models for Autonomous Driving
Tianyuan Zhang, Lu Wang, Xinwei Zhang, Yitong Zhang, Boyi Jia, Siyuan Liang, Shengshan Hu, Qiang Fu, Aishan Liu, and Xianglong Liu. Visual adversarial attack on vision- language models for autonomous driving.arXiv preprint arXiv:2411.18275, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.