Long-SCOPE: Fully Sparse Long-Range Cooperative 3D Perception
Pith reviewed 2026-05-10 16:44 UTC · model grok-4.3
The pith
Long-SCOPE replaces dense maps with geometry-guided sparse queries and learnable association to achieve accurate cooperative 3D perception at 100-150 meter ranges.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Long-SCOPE is a fully sparse framework for long-range cooperative 3D perception. It pairs Geometry-guided Query Generation, which accurately detects small distant objects, with a learnable Context-Aware Association module that robustly matches cooperative queries despite severe positional noise. The framework delivers state-of-the-art performance on the V2X-Seq and Griffin datasets, especially in the 100-150 m range, while maintaining competitive computation and communication costs.
What carries the argument
The Geometry-guided Query Generation module produces sparse queries from geometric priors for distant objects; the learnable Context-Aware Association module matches those queries across vehicles despite alignment errors. Together they enable fully sparse processing in place of dense BEV maps.
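The paper's query-generation mechanism is not spelled out here, but a minimal sketch conveys the idea of placing sparse anchors from geometric priors instead of instantiating a dense grid. The ring/azimuth scheme, ground-plane assumption, and all names below are our illustrative assumptions, not the paper's actual module:

```python
import numpy as np

def geometry_guided_queries(ranges_m, num_azimuths, ground_z=0.0):
    """Sketch: place sparse query anchors on the ground plane at fixed
    range rings and azimuths, rather than filling a dense BEV grid.
    Purely illustrative of geometry-prior-driven query placement."""
    rs = np.asarray(ranges_m, dtype=float)
    az = np.linspace(-np.pi / 3, np.pi / 3, num_azimuths)  # forward 120° FOV
    r_grid, a_grid = np.meshgrid(rs, az, indexing="ij")
    xyz = np.stack(
        [r_grid * np.cos(a_grid),           # forward (x)
         r_grid * np.sin(a_grid),           # lateral (y)
         np.full_like(r_grid, ground_z)],   # height prior
        axis=-1,
    )
    return xyz.reshape(-1, 3)  # (num_rings * num_azimuths, 3) anchor centers

anchors = geometry_guided_queries(ranges_m=np.arange(10, 151, 10), num_azimuths=60)
print(anchors.shape)  # (900, 3): far fewer anchors than a dense 150 m BEV grid
```

The point of the sketch is only that anchor count grows with the number of rings and azimuths chosen, not with the square of the sensing range.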
If this is right
- Cooperative perception can scale to 150 m without quadratic growth in computation or memory.
- Only sparse queries need to be exchanged, keeping communication bandwidth low for real-time use.
- Occlusion handling and extended sensing horizons become practical in multi-vehicle scenarios.
- Performance advantages appear strongest precisely in the long-range regimes where prior methods degrade.
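The scaling claim in the first bullet can be made concrete with a back-of-envelope comparison. The grid resolution, object count, and queries-per-object figures below are our assumed numbers, not values from the paper:

```python
# Dense BEV cost grows quadratically with range; a sparse query set
# grows roughly with the number of visible objects (assumed numbers).

def dense_bev_cells(range_m, resolution_m=0.5):
    side = int(2 * range_m / resolution_m)  # square grid covering ±range
    return side * side

def sparse_query_count(num_objects, queries_per_object=4):
    return num_objects * queries_per_object

for r in (50, 100, 150):
    # 50 m -> 40,000 cells; 150 m -> 360,000 cells (9x), while the
    # sparse query count stays at 240 regardless of range.
    print(r, dense_bev_cells(r), sparse_query_count(60))
```

Tripling the range multiplies dense-map cost by nine but leaves the sparse budget untouched, which is exactly the regime where the bullet claims the advantage appears.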
Where Pith is reading between the lines
- The same sparse-query design could reduce bandwidth further in larger fleets by sharing only selected detections rather than full feature volumes.
- Adding temporal context to the association module might extend the approach to consistent tracking across multiple time steps at distance.
- The framework points toward perception architectures that remain efficient when the number of cooperating agents grows beyond pairs.
Load-bearing premise
The Geometry-guided Query Generation and learnable Context-Aware Association modules will remain accurate and efficient when real-world V2X observation and alignment errors are larger or differently distributed than those in the V2X-Seq and Griffin datasets.
What would settle it
A controlled test on data with substantially larger positional noise or on a new V2X dataset with different error statistics where Long-SCOPE falls below dense baselines would show the modules do not generalize as claimed.
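For intuition on why a context-aware cost could survive the positional noise such a test would inject, here is a generic toy sketch. Brute-force minimum-cost matching stands in for the paper's learnable association module, and all noise magnitudes and weights are our assumptions:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Five ego-side queries, and the same objects seen by a cooperator whose
# reported positions carry a 2 m alignment error (illustrative magnitude).
ego_pos = rng.uniform(0, 150, size=(5, 2))
coop_pos = ego_pos + rng.normal(0.0, 2.0, size=(5, 2))

# Context features: true pairs share nearly identical 16-d descriptors.
ego_feat = rng.normal(size=(5, 16))
coop_feat = ego_feat + rng.normal(0.0, 0.1, size=(5, 16))

# The association cost blends geometric distance with feature
# dissimilarity, so context disambiguates when positions are unreliable.
d_pos = np.linalg.norm(ego_pos[:, None] - coop_pos[None, :], axis=-1)
d_feat = np.linalg.norm(ego_feat[:, None] - coop_feat[None, :], axis=-1)
cost = 0.1 * d_pos + d_feat

# Brute-force optimal assignment (5! = 120 permutations).
best = min(itertools.permutations(range(5)),
           key=lambda p: cost[np.arange(5), list(p)].sum())
print(best)  # recovers the identity matching (0, 1, 2, 3, 4)
```

A falsifying experiment would scale the positional noise until a purely geometric cost mismatches, then check whether the context term keeps the assignment correct.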
read the original abstract
Cooperative 3D perception via Vehicle-to-Everything communication is a promising paradigm for enhancing autonomous driving, offering extended sensing horizons and occlusion resolution. However, the practical deployment of existing methods is hindered at long distances by two critical bottlenecks: the quadratic computational scaling of dense BEV representations and the fragility of feature association mechanisms under significant observation and alignment errors. To overcome these limitations, we introduce Long-SCOPE, a fully sparse framework designed for robust long-distance cooperative 3D perception. Our method features two novel components: a Geometry-guided Query Generation module to accurately detect small, distant objects, and a learnable Context-Aware Association module that robustly matches cooperative queries despite severe positional noise. Experiments on the V2X-Seq and Griffin datasets validate that Long-SCOPE achieves state-of-the-art performance, particularly in challenging 100-150 m long-range settings, while maintaining highly competitive computation and communication costs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Long-SCOPE, a fully sparse framework for long-range cooperative 3D perception via V2X communication. It proposes two novel modules—a Geometry-guided Query Generation module for detecting small distant objects and a learnable Context-Aware Association module for robust query matching under positional noise—to address quadratic scaling in dense BEV representations and fragile feature association at long distances. Experiments on the V2X-Seq and Griffin datasets are reported to achieve state-of-the-art performance particularly in the 100-150 m range while maintaining competitive computation and communication costs.
Significance. If the central claims hold under broader conditions, the work could meaningfully advance practical deployment of cooperative perception systems by mitigating key scalability and robustness bottlenecks at extended ranges, with potential benefits for occlusion handling and sensing horizons in autonomous driving.
major comments (2)
- [Experiments] Experiments section: the SOTA claims on V2X-Seq and Griffin lack reported error bars, ablation studies isolating the Geometry-guided Query Generation and Context-Aware Association modules, and failure-case analysis. Without these, it is difficult to verify that the long-range gains are attributable to the proposed components rather than dataset-specific tuning or post-hoc choices.
- [Method and Experiments] Method and Experiments sections: the robustness claim for the learnable Context-Aware Association module under 'severe positional noise' is load-bearing for the central long-range performance argument, yet the evaluation uses only the error statistics present in V2X-Seq and Griffin. No stress tests with higher-magnitude, differently correlated, or non-Gaussian noise (e.g., larger calibration drift or timestamp asynchrony) are described, leaving the transferability to real-world V2X deployments unverified.
minor comments (1)
- [Abstract] Abstract: the specific quantitative metrics (e.g., mAP@100-150m, communication volume in bytes) underlying the 'state-of-the-art' and 'highly competitive costs' statements could be stated explicitly to allow immediate comparison with prior work.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and will revise the manuscript accordingly to improve experimental rigor and robustness validation.
read point-by-point responses
Referee: [Experiments] Experiments section: the SOTA claims on V2X-Seq and Griffin lack reported error bars, ablation studies isolating the Geometry-guided Query Generation and Context-Aware Association modules, and failure-case analysis. Without these, it is difficult to verify that the long-range gains are attributable to the proposed components rather than dataset-specific tuning or post-hoc choices.
Authors: We agree that error bars, isolating ablations, and failure-case analysis would strengthen verifiability of the long-range gains. In the revised manuscript we will report error bars over multiple runs with varied random seeds for all main results on V2X-Seq and Griffin. We will expand the ablation table to isolate the individual contributions of the Geometry-guided Query Generation module and the Context-Aware Association module. We will also add a failure-case analysis subsection discussing representative underperformance scenarios at long range.
Revision: yes
Referee: [Method and Experiments] Method and Experiments sections: the robustness claim for the learnable Context-Aware Association module under 'severe positional noise' is load-bearing for the central long-range performance argument, yet the evaluation uses only the error statistics present in V2X-Seq and Griffin. No stress tests with higher-magnitude, differently correlated, or non-Gaussian noise (e.g., larger calibration drift or timestamp asynchrony) are described, leaving the transferability to real-world V2X deployments unverified.
Authors: We acknowledge that the current experiments rely on the noise statistics native to V2X-Seq and Griffin. While these datasets already embed realistic calibration and synchronization errors, we will add explicit stress-test experiments in the revision. These will synthetically increase noise magnitude, introduce correlated errors, and apply non-Gaussian distributions to simulate more extreme calibration drift and timestamp asynchrony, thereby providing stronger evidence for transferability.
Revision: yes
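The stress-test protocol promised in the rebuttal could be sketched as a pose-perturbation helper. Function and parameter names are ours, and Student-t is one possible heavy-tailed choice, not the authors' specification:

```python
import numpy as np

def perturb_poses(poses, scale=1.0, corr=0.0, heavy_tail=False, seed=0):
    """Sketch of the rebuttal's stress tests: inflate noise magnitude
    (scale), correlate errors across agents (corr), and optionally swap
    in a heavy-tailed Student-t distribution (heavy_tail)."""
    rng = np.random.default_rng(seed)
    draw = (lambda: rng.standard_t(3, size=poses.shape)) if heavy_tail \
        else (lambda: rng.normal(size=poses.shape))
    shared = draw()  # common error component (e.g. shared calibration drift)
    indep = draw()   # per-agent independent error
    # corr blends one shared error vector into every agent's perturbation;
    # corr=1 gives identical errors for all agents, corr=0 fully independent.
    noise = scale * (corr * shared[0] + np.sqrt(1.0 - corr ** 2) * indep)
    return poses + noise

poses = np.zeros((4, 3))  # four agents, (x, y, yaw) — toy values
print(perturb_poses(poses, scale=2.0, corr=1.0, heavy_tail=True).shape)
```

Sweeping `scale`, `corr`, and `heavy_tail` over a grid and re-evaluating long-range AP would produce exactly the robustness curves the referee asks for.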
Circularity Check
No circularity; empirical ML architecture with external dataset validation
full rationale
The paper introduces a sparse neural architecture (Geometry-guided Query Generation + learnable Context-Aware Association) for long-range V2X 3D perception and reports experimental results on the public V2X-Seq and Griffin benchmarks. No equations, derivations, or first-principles predictions appear; performance numbers are obtained by training and testing on held-out data rather than by fitting a parameter and relabeling the fit as a prediction. No self-citation chain is required to justify the core modules, and the evaluation uses independent external datasets. This is the standard non-circular case for an empirical CV paper.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: supervised training on existing V2X datasets produces models that generalize to real-world long-range cooperative scenarios.