From Research to Practice: An Interactive Rapid Review of Autonomous Driving System Testing in Industry
Pith reviewed 2026-05-09 18:49 UTC · model grok-4.3
The pith
An interactive review with industry practitioners reveals that research on testing end-to-end autonomous driving systems often overlooks practical constraints.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Through an interactive rapid review involving 21 practitioners, the study identifies 12 challenges in autonomous driving system (ADS) testing and prioritizes two related to end-to-end systems. Analysis of 17 papers reveals that research emphasizes generating critical scenarios, but these approaches often fail to account for practical constraints in industry such as regulatory requirements, legacy systems, and specific operational contexts. The core finding is a persistent disconnect between research and industrial practice, which calls for more industry-relevant research.
What carries the argument
The interactive rapid review process, which integrates practitioner input both to identify challenges and to evaluate how applicable ADS testing research is in practice.
Load-bearing premise
The assumption that the views of 21 practitioners from one automotive company and the selection of 17 studies adequately represent the broader industry's testing challenges and the full research landscape.
What would settle it
A survey of practitioners from additional companies revealing different top priorities for ADS testing challenges, or a larger review finding that many of the 17 studies are already adapted for industrial use.
Original abstract
Autonomous driving systems (ADS) are increasingly deployed in real traffic, yet testing remains fundamentally challenging due to open environments, complex scenarios, and the lack of established processes and metrics. Despite extensive research, a gap persists between academic advances and their applicability in industrial practice. To address this, we conduct an interactive rapid review in collaboration with 21 practitioners from a leading automotive company. Practitioners identified 12 key challenges in ADS testing, and prioritised two as the most critical issues, namely approaches to and completeness of testing for End-to-End (E2E) ADS. We analyzed 17 research studies relevant to these two challenges, most of which focus on generating critical testing scenarios, and subsequently assessed their relevance and applicability in practice. Our study provides the first practitioner-driven review and evaluation of current ADS testing research, reveals practical challenges in ADS testing, offers rapid insights for practitioners, and highlights the need for more context-aware, industry-relevant solutions to bridge the gap between research and practice.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper conducts an interactive rapid review of autonomous driving system (ADS) testing in collaboration with 21 practitioners from one leading automotive company. Practitioners identified 12 challenges and prioritized two: the approaches to, and the completeness of, testing for End-to-End (E2E) ADS. The authors then analyzed 17 relevant research studies (mostly on scenario generation) and assessed their practical relevance and applicability. They claim to provide the first practitioner-driven evaluation of this research, one that reveals industry challenges and highlights the need for more context-aware solutions.
Significance. If the central synthesis holds after addressing scope limitations, the work offers a valuable practitioner perspective on the research-practice gap in safety-critical ADS testing, which is a strength for software engineering venues focused on empirical methods and industry collaboration. The interactive approach with practitioners is a positive element, but the narrow sample and opaque selection process limit its broader utility as a generalizable review.
major comments (2)
- [Methods (practitioner collaboration and literature analysis)] The methods description (practitioner collaboration and literature analysis sections) provides no details on the search strategy, databases, inclusion/exclusion criteria, or screening process used to identify and select the 17 relevant studies. This directly weakens the validity of the relevance/applicability assessments and the synthesis of findings on E2E ADS testing challenges.
- [Practitioner input and results sections] The practitioner sample is restricted to 21 individuals from a single automotive company. This assumption of representativeness underpins the identification of the 12 challenges, the prioritization of E2E ADS testing issues, and the claims of revealing 'practical challenges in ADS testing' and offering 'rapid insights for practitioners'; automotive firms vary substantially in architectures, standards, and testing contexts, so the gap-bridging conclusions rest on untested external validity.
minor comments (2)
- [Abstract] The abstract's claim of being 'the first practitioner-driven review' would benefit from a short qualification or reference to prior ADS testing reviews to avoid overstatement.
- [Throughout the manuscript] Ensure all acronyms (ADS, E2E) are defined on first use and that the applicability assessment criteria are explicitly listed for reader evaluation.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to enhance transparency and appropriately scope our claims.
Point-by-point responses
- Referee: The methods description (practitioner collaboration and literature analysis sections) provides no details on the search strategy, databases, inclusion/exclusion criteria, or screening process used to identify and select the 17 relevant studies. This directly weakens the validity of the relevance/applicability assessments and the synthesis of findings on E2E ADS testing challenges.
  Authors: We agree that the current manuscript lacks sufficient detail on the literature selection process. As this is a rapid review driven by the two practitioner-prioritized challenges rather than a comprehensive systematic review, the 17 studies were identified through targeted searches for relevance to E2E ADS testing approaches and completeness. We will add a new subsection to the Methods section explicitly describing the search strategy (including keywords such as 'end-to-end autonomous driving testing' and 'scenario generation for ADS'), databases (IEEE Xplore, ACM Digital Library, Google Scholar), inclusion/exclusion criteria (e.g., peer-reviewed studies from 2018 onward focusing on E2E systems, excluding purely simulation-only works without testing implications), and the two-stage screening process. This will allow readers to assess the validity of our relevance and applicability evaluations.
  Revision: yes
- Referee: The practitioner sample is restricted to 21 individuals from a single automotive company. This assumption of representativeness underpins the identification of the 12 challenges, the prioritization of E2E ADS testing issues, and the claims of revealing 'practical challenges in ADS testing' and offering 'rapid insights for practitioners'; automotive firms vary substantially in architectures, standards, and testing contexts, so the gap-bridging conclusions rest on untested external validity.
  Authors: We accept that the single-company sample limits generalizability and do not assert that the identified challenges or their prioritization apply universally across the automotive sector. The study is framed as an in-depth interactive rapid review with one leading company, which provides unique access to industrial perspectives often unavailable in public literature. We will revise the manuscript by adding an explicit Limitations section (or expanding Threats to Validity) that discusses the single-company scope, rephrases broader claims (e.g., changing 'reveals practical challenges in ADS testing' to 'reveals practical challenges in ADS testing within the context of the collaborating company'), and positions the work as a foundation for future multi-company studies rather than a definitive industry-wide synthesis.
  Revision: partial
Circularity Check
No circularity: qualitative synthesis of practitioner input and literature
The paper performs an interactive rapid review: practitioners from one firm identify 12 challenges and prioritize two, after which the authors select and assess 17 studies for relevance. No equations, parameters, predictions, or derivations exist. No self-citations are invoked as load-bearing premises, and the central claims (revealing challenges, assessing applicability) are direct outputs of the described process rather than reductions to prior self-referential results. The single-company sample raises external-validity concerns but does not create circularity by construction.