MDrive: Benchmarking Closed-Loop Cooperative Driving for End-to-End Multi-agent Systems
Pith reviewed 2026-05-12 03:52 UTC · model grok-4.3
The pith
Multi-agent V2X systems outperform single-agent driving in closed-loop tests, but perception sharing and negotiation show clear limits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MDrive establishes that multi-agent cooperative systems generally outperform single-agent counterparts across 225 scenarios. Perception sharing improves what agents see but does not reliably produce better planning outputs. Negotiation helps planning in many cases yet reduces performance when traffic is complex and dense. The benchmark uses NHTSA pre-crash typologies and real-world V2X datasets to create closed-loop evaluations that capture interactive driving behavior.
What carries the argument
The MDrive benchmark, which runs end-to-end multi-agent driving systems in closed-loop simulation with perception sharing and negotiation protocols across 225 interactive scenarios.
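To make the shape of such an evaluation concrete, here is a minimal sketch of one closed-loop step with perception sharing and negotiation toggles. All names (`Agent`, `share_perception`, `negotiate`, `closed_loop_step`) are hypothetical illustrations, not the MDrive toolbox API; the negotiation rule is a toy stand-in.

```python
# Hypothetical sketch of a closed-loop multi-agent evaluation step, in the
# spirit of the loop MDrive describes; none of these names come from the
# MDrive toolbox.
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    observed: set = field(default_factory=set)  # object ids this agent perceives
    plan: str = "keep_lane"

def share_perception(agents):
    """V2X perception sharing: each agent's view becomes the union of all views."""
    merged = set().union(*(a.observed for a in agents))
    for a in agents:
        a.observed = merged

def negotiate(agents):
    """Toy negotiation rule: if two agents both plan to merge, the later one yields."""
    mergers = [a for a in agents if a.plan == "merge_left"]
    for a in mergers[1:]:
        a.plan = "yield"

def closed_loop_step(agents, use_sharing=True, use_negotiation=True):
    # Ablation flags mirror the benchmark's single-agent vs. cooperative settings.
    if use_sharing:
        share_perception(agents)
    if use_negotiation:
        negotiate(agents)
    return {a.name: a.plan for a in agents}

ego = Agent("ego", {"ped_1"}, plan="merge_left")
cav = Agent("cav", {"truck_7"}, plan="merge_left")
print(closed_loop_step([ego, cav]))
```

Disabling the flags recovers a single-agent baseline on the same scenario, which is the comparison the benchmark results rest on.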
If this is right
- Cooperative driving methods can be pursued for measurable gains over isolated agents in interactive conditions.
- Systems must address the disconnect between improved perception and downstream planning decisions.
- Negotiation mechanisms require safeguards to prevent performance drops in high-density traffic.
- The provided scenario generation and Real2Sim tools enable consistent reproduction and extension of the evaluation.
Where Pith is reading between the lines
- Better data-fusion techniques between shared perception and planning modules could close the observed performance gap.
- Testing the same systems on physical vehicle fleets would show whether simulation findings hold under real sensor noise and latency.
- The benchmark's structure could guide development of communication standards that prioritize planning-relevant information over raw sensor data.
Load-bearing premise
The chosen 225 scenarios are diverse and representative enough to support broad claims about the benefits and drawbacks of multi-agent cooperation in real driving.
What would settle it
A controlled experiment showing that single-agent systems match or exceed multi-agent performance across a new set of closed-loop scenarios with similar density and interaction would undermine the general superiority result.
Figures
Original abstract
Vehicle-to-Everything (V2X) communication has emerged as a promising paradigm for autonomous driving, enabling connected agents to share complementary perception information and negotiate with each other to benefit the final planning. Existing V2X benchmarks, however, fall short in two ways: (i) open-loop evaluations fail to capture the inherently closed-loop nature of driving, leading to evaluation gaps, and (ii) current closed-loop evaluations lack behavioral and interactive diversity to reflect real-world driving. Thus, it is still unclear the extent of benefits of multi-agent systems for closed-loop driving. In this paper, we introduce MDrive, a closed-loop cooperative driving benchmark comprising 225 scenarios grounded in both NHTSA pre-crash typologies and real-world V2X datasets. Our benchmark results demonstrate that multi-agent systems are generally better than single-agent counterparts. However, current multi-agent systems still face two important challenges: (i) perception sharing enhances perceptions, but doesn't always translate to better planning; (ii) negotiation improves planning performance but harms it in complex and dense traffic scenarios. MDrive further provides an open-source toolbox for scenario generation, Real2Sim conversion, and human-in-the-loop simulation. Together, MDrive establishes a reproducible foundation for evaluating and improving the generalization and robustness of cooperative driving systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces MDrive, a closed-loop cooperative driving benchmark consisting of 225 scenarios grounded in NHTSA pre-crash typologies and real-world V2X datasets. It reports that multi-agent systems generally outperform single-agent counterparts in closed-loop settings, while identifying two challenges: (i) perception sharing improves perception accuracy but does not always translate to better planning, and (ii) negotiation improves planning performance but degrades it in complex and dense traffic scenarios. The work also releases an open-source toolbox for scenario generation, Real2Sim conversion, and human-in-the-loop simulation.
Significance. If the empirical results hold under closer scrutiny, MDrive fills a clear gap in V2X evaluation by moving beyond open-loop setups to closed-loop interactive driving with behavioral diversity. The explicit identification of perception-sharing and negotiation failure modes supplies concrete, actionable limitations for the community. The open-source toolbox for scenario generation and Real2Sim conversion is a genuine strength that directly supports reproducibility and future extensions.
Major comments (2)
- [Abstract and results section] Abstract and results section: the claims that 'multi-agent systems are generally better than single-agent counterparts' and that the two specific challenges exist rest on comparative results, yet no details are supplied on the precise metrics (e.g., collision rate, progress, comfort), the single-agent and multi-agent baselines, statistical tests, or the protocol for selecting and executing the 225 scenarios. Without these, it is impossible to verify whether the evidence supports the stated general conclusions.
- [Benchmark construction section] Benchmark construction section: no quantitative coverage analysis (distribution over agent density, interaction complexity, or edge-case frequency) is provided for the 225 scenarios. This is load-bearing for the central claim that the observed perception-sharing and negotiation harms are intrinsic limits rather than potential artifacts of under-represented dense or highly interactive regimes.
Minor comments (2)
- Add explicit definitions and measurement procedures for 'complex and dense traffic scenarios' in the results analysis so that the second challenge can be reproduced and tested by others.
- The toolbox release is welcome; include a README that documents the exact command sequence to regenerate the 225 scenarios used in the reported experiments.
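The first minor comment asks for an operational definition of "complex and dense traffic." One concrete (purely hypothetical) way to operationalize it is thresholds on agent count and interacting pairs; the threshold values below are illustrative assumptions, not MDrive's definitions.

```python
# Illustrative (hypothetical) operationalization of "complex and dense
# traffic"; thresholds are assumptions, not values from the paper.
def is_dense(num_agents, density_threshold=8):
    return num_agents >= density_threshold

def is_complex(interacting_pairs, pair_threshold=4):
    return interacting_pairs >= pair_threshold

def scenario_regime(num_agents, interacting_pairs):
    """Label a scenario by the regime used in the results analysis."""
    dense = is_dense(num_agents)
    complicated = is_complex(interacting_pairs)
    if dense and complicated:
        return "complex-dense"
    if dense or complicated:
        return "intermediate"
    return "sparse"

print(scenario_regime(12, 6))
print(scenario_regime(3, 1))
```

Publishing such a rule alongside the benchmark would let others test whether negotiation degrades specifically in the "complex-dense" regime.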
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our MDrive benchmark paper. The comments highlight opportunities to improve clarity and verifiability of our empirical claims, and we will incorporate revisions accordingly while preserving the core contributions of the closed-loop evaluation and open-source toolbox.
Point-by-point responses
-
Referee: [Abstract and results section] Abstract and results section: the claims that 'multi-agent systems are generally better than single-agent counterparts' and that the two specific challenges exist rest on comparative results, yet no details are supplied on the precise metrics (e.g., collision rate, progress, comfort), the single-agent and multi-agent baselines, statistical tests, or the protocol for selecting and executing the 225 scenarios. Without these, it is impossible to verify whether the evidence supports the stated general conclusions.
Authors: We agree that additional explicit details in the abstract and results section would strengthen verifiability of the claims. The manuscript describes the evaluation metrics (collision rate, progress, and comfort) and baselines in the experimental setup, with single-agent systems using standard end-to-end planners and multi-agent systems incorporating perception sharing and negotiation; scenario selection follows NHTSA pre-crash typologies matched to real-world V2X data. However, to directly address the concern, we will expand the results section with a summary table of all metrics and baselines, a clear description of the execution protocol, and reporting of statistical significance (paired t-tests with p-values) across the 225 scenarios. We will also update the abstract to reference these elements concisely. revision: yes
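The paired significance test the authors propose can be sketched as follows: compute per-scenario score differences between the multi-agent and single-agent system and form the paired t statistic. The scores below are synthetic and the metric is an unnamed composite; this is a sketch of the test, not the paper's actual numbers.

```python
# Sketch of a paired t-test across scenarios, as proposed in the rebuttal.
# Scores are synthetic; in the benchmark each entry would be a per-scenario
# driving score for the same scenario under both systems.
import math
import statistics

def paired_t(scores_a, scores_b):
    """Paired t statistic for per-scenario differences (a - b)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)           # sample std. dev. of differences
    return statistics.mean(diffs) / (sd / math.sqrt(n))

multi  = [0.82, 0.75, 0.90, 0.68, 0.77, 0.85]  # synthetic multi-agent scores
single = [0.78, 0.70, 0.88, 0.66, 0.71, 0.80]  # synthetic single-agent scores
t = paired_t(multi, single)
print(round(t, 2))
```

A large positive t across all 225 scenarios (with the corresponding p-value) is what would substantiate the "generally better" claim.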
-
Referee: [Benchmark construction section] Benchmark construction section: no quantitative coverage analysis (distribution over agent density, interaction complexity, or edge-case frequency) is provided for the 225 scenarios. This is load-bearing for the central claim that the observed perception-sharing and negotiation harms are intrinsic limits rather than potential artifacts of under-represented dense or highly interactive regimes.
Authors: We acknowledge that a quantitative coverage analysis is important to substantiate that the identified challenges (perception-sharing not always improving planning, and negotiation harming performance in dense traffic) reflect intrinsic limits rather than sampling artifacts. The current manuscript grounds the 225 scenarios in NHTSA typologies and real-world V2X datasets to promote diversity, but does not include explicit distributions. In the revision, we will add a new subsection with quantitative analysis, including histograms and statistics on agent density, interaction complexity (e.g., number of interacting pairs), and edge-case frequency, to demonstrate coverage and support the generalizability of our findings on the two challenges. revision: yes
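The promised coverage analysis amounts to binning the 225 scenarios along axes such as agent density and reporting the distribution. A minimal sketch, with synthetic scenario data and illustrative bucket boundaries:

```python
# Sketch of a scenario-coverage histogram over agent density.
# Agent counts and bucket boundaries are synthetic/illustrative.
from collections import Counter

def density_bucket(num_agents):
    if num_agents <= 4:
        return "low (<=4)"
    if num_agents <= 8:
        return "medium (5-8)"
    return "high (>8)"

scenario_agent_counts = [2, 3, 5, 7, 9, 12, 4, 6, 10, 3]  # synthetic
coverage = Counter(density_bucket(n) for n in scenario_agent_counts)
print(dict(coverage))
```

If the "high" bucket turned out to be thin, the negotiation-harm finding would need the caveat the referee raises.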
Circularity Check
No significant circularity in empirical benchmark
Full rationale
The paper introduces an empirical benchmark (MDrive) consisting of 225 scenarios grounded in external NHTSA pre-crash typologies and real-world V2X datasets, then reports comparative simulation results between multi-agent and single-agent systems. No equations, fitted parameters, derivations, or self-referential definitions appear in the provided text. Claims about multi-agent benefits and specific challenges rest directly on the benchmark outputs rather than reducing to the paper's own inputs by construction. The work is self-contained as a benchmark release with no load-bearing self-citation chains or ansatz smuggling.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Closed-loop simulation of multi-agent driving accurately reflects real-world interactive behavior for the purpose of benchmarking.
Reference graph
Works this paper leans on
- [1] Jiaru Zhong, Jiahao Wang, Jiahui Xu, Xiaofan Li, Zaiqing Nie, and Haibao Yu. CoopTrack: Exploring end-to-end learning for efficient cooperative sequential perception. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 26954–26965, 2025.
- [2] Runsheng Xu, Hao Xiang, Xin Xia, Xu Han, Jinlong Li, and Jiaqi Ma. OPV2V: An open benchmark dataset and fusion pipeline for perception with vehicle-to-vehicle communication. In 2022 International Conference on Robotics and Automation (ICRA), pages 2583–2589. IEEE, 2022.
- [3] Yiming Li, Dekun Ma, Ziyan An, Zixun Wang, Yiqi Zhong, Siheng Chen, and Chen Feng. V2X-Sim: Multi-agent collaborative perception dataset and benchmark for autonomous driving. IEEE Robotics and Automation Letters, 7(4):10914–10921, 2022.
- [4] Rui Song, Chenwei Liang, Hu Cao, Zhiran Yan, Walter Zimmer, Markus Gross, Andreas Festag, and Alois Knoll. Collaborative semantic occupancy prediction with hybrid feature fusion in connected automated vehicles. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 17996–18006, 2024.
- [5] Zewei Zhou, Hao Xiang, Zhaoliang Zheng, Seth Z Zhao, Mingyue Lei, Yun Zhang, Tianhui Cai, Xinyi Liu, Johnson Liu, Maheswari Bajji, et al. V2XPnP: Vehicle-to-everything spatio-temporal fusion for multi-agent perception and prediction. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025.
- [6] Walter Zimmer, Gerhard Arya Wardana, Suren Sritharan, Xingcheng Zhou, Rui Song, and Alois C. Knoll. TUMTraf V2X cooperative perception dataset. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE/CVF, 2024.
- [7] Hao Xiang, Zhaoliang Zheng, Xin Xia, Seth Z. Zhao, Letian Gao, Zewei Zhou, Tianhui Cai, Yun Zhang, and Jiaqi Ma. V2X-ReaLO: An open online framework and dataset for cooperative perception in reality, 2025. URL https://arxiv.org/abs/2503.10034.
- [8] Seth Z Zhao, Huizhi Zhang, Zhaowei Li, Juntong Peng, Anthony Chui, Zewei Zhou, Zonglin Meng, Hao Xiang, Zhiyu Huang, Fujia Wang, et al. QuantV2X: A fully quantized multi-agent system for cooperative perception. arXiv preprint arXiv:2509.03704, 2025.
- [9] Seth Z Zhao, Hao Xiang, Chenfeng Xu, Xin Xia, Bolei Zhou, and Jiaqi Ma. Coopre: Cooperative pretraining for V2X cooperative perception. arXiv preprint arXiv:2408.11241, 2024.
- [10] Zewei Zhou, Seth Z Zhao, Tianhui Cai, Zhiyu Huang, Bolei Zhou, and Jiaqi Ma. TurboTrain: Towards efficient and balanced multi-task learning for multi-agent perception and prediction. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4391–4402, 2025.
- [11] Haibao Yu, Wenxian Yang, Jiaru Zhong, Zhenwei Yang, Siqi Fan, Ping Luo, and Zaiqing Nie. End-to-end autonomous driving through V2X cooperation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 9598–9606, 2025.
- [12] Mingyue Lei, Zewei Zhou, Hongchen Li, Jia Hu, and Jiaqi Ma. CooperRisk: A driving risk quantification pipeline with multi-agent cooperative perception and prediction. arXiv preprint arXiv:2506.15868, 2025.
- [13] Bingyi Liu, Jian Teng, Hongfei Xue, Enshu Wang, Chuanhui Zhu, Pu Wang, and Libing Wu. mmCooper: A multi-agent multi-stage communication-efficient and collaboration-robust cooperative perception framework. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 28396–28406, 2025.
- [14] Yun Zhang, Zhaoliang Zheng, Johnson Liu, Zhiyu Huang, Zewei Zhou, Zonglin Meng, Tianhui Cai, and Jiaqi Ma. MIC-BEV: Multi-infrastructure camera bird's-eye-view transformer with relation-aware fusion for 3D object detection. arXiv preprint arXiv:2510.24688, 2025.
- [15] Mingyue Lei, Zewei Zhou, Hongchen Li, Jiaqi Ma, and Jia Hu. Risk map as middleware: Toward interpretable cooperative end-to-end autonomous driving for risk-aware planning. IEEE Robotics and Automation Letters, 11(1):818–825, 2025.
- [16] Seth Z Zhao, Luobin Wang, Hongwei Ruan, Yuxin Bao, Yilan Chen, Ziyang Leng, Abhijit Ravichandran, Honglin He, Zewei Zhou, Xu Han, et al. BridgeSim: Unveiling the OL-CL gap in end-to-end autonomous driving. arXiv preprint arXiv:2604.10856, 2026.
- [17] Peter Karkus, Maximilian Igl, Yuxiao Chen, Kashyap Chitta, Jef Packer, Bertrand Douillard, Ran Tian, Alexander Naumann, Guillermo Garcia-Cobo, Shuhan Tan, et al. Beyond behavior cloning in autonomous driving: a survey of closed-loop training techniques. Authorea Preprints.
- [18] Zhiqi Li, Zhiding Yu, Shiyi Lan, Jiahan Li, Jan Kautz, Tong Lu, and Jose M Alvarez. Is ego status all you need for open-loop end-to-end autonomous driving? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14864–14873, 2024.
- [19] Shaocong Jia, Penghan Wu, Li Chen, Jiazhao Jiang, Junchi Yan, and Hongyang Li. Bench2Drive: Towards multi-ability evaluation for end-to-end autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16413–16423, 2024.
- [20] Changxing Liu, Genjia Liu, Zijun Wang, Jinchang Yang, and Siheng Chen. CoLMDriver: LLM-based negotiation benefits cooperative autonomous driving. arXiv preprint arXiv:2503.08683, 2025.
- [21] Jiaxun Cui, Chen Tang, Jarrett Holtz, Janice Nguyen, Alessandro G Allievi, Hang Qiu, and Peter Stone. CoopReflect: Towards natural language communication for cooperative autonomous driving via multiagent learning. In International Conference on Autonomous Agents and Multi-Agent Systems, 2026.
- [22] Wassim G Najm, John D Smith, and Mikio Yanagisawa. Pre-crash scenario typology for crash avoidance research. 2007.
- [23] Long Nguyen, Micha Fauth, Bernhard Jaeger, Daniel Dauner, Maximilian Igl, Andreas Geiger, and Kashyap Chitta. LEAD: Minimizing learner-expert asymmetry in end-to-end driving. arXiv preprint arXiv:2512.20563, 2025.
- [24] Simon Gerstenecker, Andreas Geiger, and Katrin Renz. Fail2Drive: Benchmarking closed-loop driving generalization. arXiv preprint arXiv:2604.08535, 2026.
- [25] Holger Caesar, Juraj Kabzan, Kok Seang Tan, Whye Kit Fong, Eric Wolff, Alex Lang, Luke Fletcher, Oscar Beijbom, and Sammy Omari. nuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles. arXiv preprint arXiv:2106.11810, 2021.
- [26] Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, et al. NAVSIM: Data-driven non-reactive autonomous vehicle simulation and benchmarking. Advances in Neural Information Processing Systems, 37:28706–28719, 2024.
- [27] Wei Cao, Marcel Hallgarten, Tianyu Li, Daniel Dauner, Xunjiang Gu, Caojun Wang, Yakov Miron, Marco Aiello, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, and Kashyap Chitta. Pseudo-simulation for autonomous driving. In Conference on Robot Learning (CoRL), 2025.
- [28] Hongyu Zhou, Longzhong Lin, Jiabao Wang, Yichong Lu, Dongfeng Bai, Bingbing Liu, Yue Wang, Andreas Geiger, and Yiyi Liao. HUGSIM: A real-time, photo-realistic and closed-loop simulator for autonomous driving. arXiv preprint arXiv:2412.01718, 2024.
- [29] Xuemeng Yang, Licheng Wen, Yukai Ma, Jianbiao Mei, Xin Li, Tiantian Wei, Wenjie Lei, Daocheng Fu, Pinlong Cai, Min Dou, Botian Shi, Liang He, Yong Liu, and Yu Qiao. DriveArena: A closed-loop generative simulation platform for autonomous driving. arXiv preprint arXiv:2408.00415, 2024.
- [31] Zhiqi Li, Wenhai Wang, Hongyang Li, Enze Xie, Chonghao Sima, Tong Lu, Qiao Yu, and Jifeng Dai. BEVFormer: Learning bird's-eye-view representation from lidar-camera via spatiotemporal transformers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.
- [32] Alex H Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, and Oscar Beijbom. PointPillars: Fast encoders for object detection from point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12697–12705, 2019.
- [33] Shaoshuai Shi, Li Jiang, Dengxin Dai, and Bernt Schiele. MTR++: Multi-agent motion prediction with symmetric scene modeling and guided intention querying. arXiv preprint arXiv:2306.17770, 2023.
- [34] Zikang Zhou, Zihao Wen, Jianping Wang, Yung-Hui Li, and Yu-Kai Huang. QCNeXt: A next-generation framework for joint multi-agent trajectory prediction. arXiv preprint arXiv:2306.10508, 2023.
- [35] Zewei Zhou, Ziru Yang, Yuanjian Zhang, Yanjun Huang, Hong Chen, and Zhuoping Yu. A comprehensive study of speed prediction in transportation system: From vehicle to traffic. iScience, 25(3), 2022.
- [36] Zewei Zhou, Zhuoping Yu, Lu Xiong, Dequan Zeng, Zhiqiang Fu, Zhuoren Li, and Bo Leng. A reliable path planning method for lane change based on hybrid PSO-IACO algorithm. In 2021 6th International Conference on Transportation Information and Safety (ICTIS), pages 1253–1258. IEEE, 2021.
- [37] Zhiyu Huang, Xinshuo Weng, Maximilian Igl, Yuxiao Chen, Yulong Cao, Boris Ivanovic, Marco Pavone, and Chen Lv. Gen-Drive: Enhancing diffusion generative driving policies with reward modeling and reinforcement learning fine-tuning. arXiv preprint arXiv:2410.05582, 2024.
- [38] Li Chen, Penghao Wu, Kashyap Chitta, Bernhard Jaeger, Andreas Geiger, and Hongyang Li. End-to-end autonomous driving: Challenges and frontiers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.
- [39] Ellington Kirby, Alexandre Boulch, Yihong Xu, Yuan Yin, Gilles Puy, Eloi Zablocki, Andrei Bursuc, Spyros Gidaris, Renaud Marlet, Florent Bartoccioni, Anh-Quan Cao, Nermin Samet, Tuan-Hung Vu, and Matthieu Cord. Driving on registers. Preprint, 2026.
- [40] Lan Feng, Yang Gao, Eloi Zablocki, Quanyi Li, Wuyang Li, Sichao Liu, Matthieu Cord, and Alexandre Alahi. RAP: 3D rasterization augmented end-to-end planning, 2025. URL https://arxiv.org/abs/2510.04333.
- [41] Zewei Zhou, Tianhui Cai, Seth Z. Zhao, Yun Zhang, Zhiyu Huang, Bolei Zhou, and Jiaqi Ma. AutoVLA: A vision-language-action model for end-to-end autonomous driving with adaptive reasoning and reinforcement fine-tuning. Advances in Neural Information Processing Systems (NeurIPS), 2025.
- [42] Zewei Zhou, Ruining Yang, Yiluan Guo, Sherry X Chen, Tao Feng, Kateryna Pistunova, Yishan Shen, Lili Su, Jiaqi Ma, et al. SpanVLA: Efficient action bridging and learning from negative-recovery samples for vision-language-action model. arXiv preprint arXiv:2604.19710, 2026.
- [43] Jialv Zou, Shaoyu Chen, Bencheng Liao, Zhiyu Zheng, Yuehao Song, Lefei Zhang, Qian Zhang, Wenyu Liu, and Xinggang Wang. DiffusionDriveV2: Reinforcement learning-constrained truncated diffusion modeling in end-to-end autonomous driving. arXiv preprint arXiv:2512.07745, 2025.
- [44] Waymo Research. 2025 Waymo Open Dataset challenge: Vision-based end-to-end driving. https://waymo.com/open/challenges/2025/e2e-driving/, 2025. Accessed: 2025-04-25.
- [45] Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng, and Wenjun Zhang. Learning distilled collaboration graph for multi-agent perception. In Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021), 2021.
- [46] Yiming Li, Shunli Ren, Pengxiang Wu, Siheng Chen, Chen Feng, and Wenjun Zhang. Learning distilled collaboration graph for multi-agent perception. Advances in Neural Information Processing Systems, 34:29541–29552, 2021.
- [47] Yue Hu, Shaoheng Fang, Zixing Lei, Yiqi Zhong, and Siheng Chen. Where2comm: Communication-efficient collaborative perception via spatial confidence maps. In Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS '22, Red Hook, NY, USA, 2022. Curran Associates Inc. ISBN 9781713871088.
- [48] Yifan Lu, Yue Hu, Yiqi Zhong, Dequan Wang, Siheng Chen, and Yanfeng Wang. An extensible framework for open heterogeneous collaborative perception. In The Twelfth International Conference on Learning Representations, 2024.
- [49] Hao Xiang, Runsheng Xu, Xin Xia, Zhaoliang Zheng, Bolei Zhou, and Jiaqi Ma. V2XP-ASG: Generating adversarial scenes for vehicle-to-everything perception. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 3584–3591, 2023. doi: 10.1109/ICRA48891.2023.10161384.
- [50] Seth Z Zhao, Hao Xiang, Chenfeng Xu, Xin Xia, Bolei Zhou, and Jiaqi Ma. Coopre: Cooperative pretraining for V2X cooperative perception. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 11765–11772. IEEE, 2025.
- [51] Haibao Yu, Yingjuan Tang, Enze Xie, Jilei Mao, Ping Luo, and Zaiqing Nie. Flow-based feature fusion for vehicle-infrastructure cooperative 3D object detection. Advances in Neural Information Processing Systems, 36, 2024.
- [52] Hsu-kuang Chiu, Ryo Hachiuma, Chien-Yi Wang, Yu-Chiang Frank Wang, Min-Hung Chen, and Stephen F Smith. V2V-GoT: Vehicle-to-vehicle cooperative autonomous driving with multimodal large language models and graph-of-thoughts. arXiv preprint arXiv:2509.18053, 2025.
- [53] Junwei You, Pei Li, Zhuoyu Jiang, Weizhe Tang, Zilin Huang, Rui Gan, Jiaxi Liu, Yan Zhao, Sikai Chen, and Bin Ran. V2X-QA: A comprehensive reasoning dataset and benchmark for multimodal large language models in autonomous driving across ego, infrastructure, and cooperative views. arXiv preprint arXiv:2604.02710, 2026.
- [54] Jingda Wu, Chao Huang, Hailong Huang, Chen Lv, Yuntong Wang, and Fei-Yue Wang. Recent advances in reinforcement learning-based autonomous driving behavior planning: A survey. Transportation Research Part C: Emerging Technologies, 164:104654, 2024.
- [55] Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. CARLA: An open urban driving simulator. In Proceedings of the 1st Annual Conference on Robot Learning, pages 1–16, 2017.
- [56] Quanyi Li, Zhenghao Peng, Lan Feng, Qihang Zhang, Zhenghai Xue, and Bolei Zhou. MetaDrive: Composing diverse driving scenarios for generalizable reinforcement learning. TPAMI, 2022.
- [57] Deepdrive Team. Deepdrive: a simulator that allows anyone with a PC to push the state-of-the-art in self-driving. https://github.com/deepdrive/deepdrive.
- [58] Hao Gao, Shaoyu Chen, Bo Jiang, Bencheng Liao, Yiang Shi, Xiaoyang Guo, Yuechuan Pu, Haoran Yin, Xiangyu Li, Xinbang Zhang, Ying Zhang, Wenyu Liu, Qian Zhang, and Xinggang Wang. RAD: Training an end-to-end driving policy via large-scale 3DGS-based reinforcement learning. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.
- [59] Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan, Yongtao Wang, Deqing Sun, and Ming-Hsuan Yang. DrivingGaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 21634–21643, 2024.
- [60] Rui Song, Chenwei Liang, Yan Xia, Walter Zimmer, Hu Cao, Holger Caesar, Andreas Festag, and Alois Knoll. CoDa-4DGS: Dynamic gaussian splatting with context and deformation awareness for autonomous driving. In IEEE/CVF International Conference on Computer Vision (ICCV). IEEE/CVF, 2025.
- [61] Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, and Sida Peng. Street Gaussians: Modeling dynamic urban scenes with gaussian splatting. In European Conference on Computer Vision, pages 156–173. Springer, 2024.
- [62] Rui Song, Tianhui Cai, Markus Gross, Yun Zhang, Walter Zimmer, Zhiyu Huang, Olaf Wysocki, and Jiaqi Ma. EnerGS: Energy-based gaussian splatting with partial geometric priors. arXiv preprint arXiv:2604.26238, 2026.
- [63] Yurui Chen, Chun Gu, Junzhe Jiang, Xiatian Zhu, and Li Zhang. Periodic vibration gaussian: Dynamic urban scene reconstruction and real-time rendering. International Journal of Computer Vision, 134(3):83, 2026.
- [64] Runsheng Xu, Yi Guo, Xu Han, Xin Xia, Hao Xiang, and Jiaqi Ma. OpenCDA: An open cooperative driving automation framework integrated with co-simulation. In 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), pages 1155–1162. IEEE, 2021.
- [65] Genjia Liu, Yue Hu, Chenxin Xu, Weibo Mao, Junhao Ge, Zhengxiang Huang, Yifan Lu, Yinda Xu, Junkai Xia, Yafei Wang, et al. Towards collaborative autonomous driving: Simulation platform and end-to-end system. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
- [66] Quanyi Li, Zhenghao Peng, Lan Feng, Zhizheng Liu, Chenda Duan, Wenjie Mo, and Bolei Zhou. ScenarioNet: Open-source platform for large-scale traffic scenario simulation and modeling. Advances in Neural Information Processing Systems, 2023.
- [67] Katharina Winter, Mark Azer, and Fabian B Flohr. BEVDriver: Leveraging BEV maps in LLMs for robust closed-loop driving. arXiv preprint arXiv:2503.03074, 2025.
- [68] CARLA Team. CARLA autonomous driving leaderboard. https://leaderboard.carla.org/leaderboard/, 2026. Accessed: 2026-05-01.
- [69] Yihan Hu, Jiazhi Yang, Li Chen, Keyu Li, Chonghao Sima, Xizhou Zhu, Siqi Chai, Senyao Du, Tianwei Lin, Wenhai Wang, Lewei Lu, Xiaosong Jia, Qiang Liu, Jifeng Dai, Yu Qiao, and Hongyang Li. Planning-oriented autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023.
- [70] Bo Jiang, Shaoyu Chen, Qing Xu, Bencheng Liao, Jiajie Chen, Helong Zhou, Qian Zhang, Wenyu Liu, Chang Huang, and Xinggang Wang. VAD: Vectorized scene representation for efficient autonomous driving. ICCV, 2023.
- [71] Penghao Wu, Xiaosong Jia, Li Chen, Junchi Yan, Hongyang Li, and Yu Qiao. Trajectory-guided control prediction for end-to-end autonomous driving: A simple yet strong baseline. Advances in Neural Information Processing Systems, 35:6119–6132, 2022.
- [72] Hao Shao, Yuxuan Hu, Letian Wang, Guanglu Song, Steven L Waslander, Yu Liu, and Hongsheng Li. LMDrive: Closed-loop end-to-end driving with large language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15120–15130, 2024.
- [73] Holger Caesar, Varun Bankiti, Alex H Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom. nuScenes: A multimodal dataset for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11621–11631, 2020.
- [74] Yiheng Li, Seth Z. Zhao, Chenfeng Xu, Chen Tang, Chenran Li, Mingyu Ding, Masayoshi Tomizuka, and Wei Zhan. Pre-training on synthetic driving data for trajectory prediction. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5910–5917, 2024. doi: 10.1109/IROS58592.2024.10802492.
- [75] Qi Chen, Xu Ma, Sihai Tang, Jingda Guo, Qing Yang, and Song Fu. F-Cooper: Feature based cooperative perception for autonomous vehicle edge computing system using 3D point clouds. In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, pages 88–100, 2019.
- [76] Runsheng Xu, Zhengzhong Tu, Hao Xiang, Wei Shao, Bolei Zhou, and Jiaqi Ma. CoBEVT: Cooperative bird's eye view semantic segmentation with sparse transformers. arXiv preprint arXiv:2207.02202, 2022.
- [77] Mingxuan Liu, Honglin He, Elisa Ricci, Wayne Wu, and Bolei Zhou. UrbanVerse: Scaling urban simulation by watching city-tour videos. In The Fourteenth International Conference on Learning Representations, 2026.