CORP: A Multi-Modal Dataset for Campus-Oriented Roadside Perception Tasks
Pith reviewed 2026-05-24 02:16 UTC · model grok-4.3
The pith
CORP is the first public benchmark dataset for multi-modal roadside perception in campus settings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors propose CORP as the first public benchmark dataset tailored for multi-modal roadside perception tasks under campus scenarios. Collected in a university campus, CORP consists of over 205k images plus 102k point clouds captured from 18 cameras and 9 LiDAR sensors with different configurations mounted on roadside utility poles to provide diverse viewpoints. The annotations encompass multi-dimensional information beyond 2D and 3D bounding boxes, providing extra support for 3D seamless tracking and instance segmentation with unique IDs and pixel masks for identifying targets, to enhance the understanding of objects and their behaviors distributed across the campus premises.
What carries the argument
The CORP dataset, built from synchronized multi-modal sensor streams on utility poles together with extended labels for tracking and segmentation.
If this is right
- Researchers can train and benchmark multi-modal fusion methods on synchronized campus image and point cloud streams from varied viewpoints.
- Algorithms for 3D object tracking can exploit the unique IDs across frames to maintain identities through campus scenes.
- Instance segmentation models gain access to pixel masks that link 2D and 3D annotations for the same targets.
- Perception systems for intelligent transportation can be evaluated on residential-area challenges such as pedestrian and cyclist behaviors near campus buildings.
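As a minimal sketch of the tracking use case in the list above, the snippet below groups per-frame annotations by track ID to recover trajectories. The record layout (`frame`, `track_id`, `center`) and the sample values are hypothetical stand-ins for illustration, not CORP's actual annotation schema.

```python
from collections import defaultdict

# Hypothetical per-frame annotation records; CORP's real schema may differ.
annotations = [
    {"frame": 0, "track_id": 7, "center": (1.0, 2.0, 0.0)},
    {"frame": 1, "track_id": 7, "center": (1.5, 2.0, 0.0)},
    {"frame": 1, "track_id": 9, "center": (4.0, 0.5, 0.0)},
    {"frame": 2, "track_id": 7, "center": (2.0, 2.1, 0.0)},
]


def build_trajectories(records):
    """Group 3D box centers by track ID, ordered by frame index."""
    tracks = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["frame"]):
        tracks[rec["track_id"]].append((rec["frame"], rec["center"]))
    return dict(tracks)


def track_length_in_frames(track):
    """Number of frames spanned by a track, inclusive of both ends."""
    frames = [frame for frame, _ in track]
    return frames[-1] - frames[0] + 1


trajectories = build_trajectories(annotations)
lengths = {tid: track_length_in_frames(t) for tid, t in trajectories.items()}
```

Keying on the unique ID keeps identity association out of the model under test, so a tracker's output can be compared directly against these ground-truth trajectories.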
Where Pith is reading between the lines
- The multi-view utility-pole setup could be replicated in other non-arterial environments such as parks to test whether the same annotation style generalizes.
- Comparison experiments between CORP and urban datasets would quantify how much viewpoint and scene-type differences affect current model performance.
- The dataset's scale and sensor diversity make it suitable for studying long-term object re-identification across repeated campus routes.
Load-bearing premise
Campus scenarios differ from urban arterial roads in ways that existing datasets do not address.
What would settle it
Demonstration that perception models trained solely on existing urban roadside datasets reach equivalent accuracy on campus tasks without retraining or new labels.
Original abstract
Numerous roadside perception datasets have been introduced to propel advancements in autonomous driving and intelligent transportation systems research and development. However, it has been observed that the majority of their concentrates is on urban arterial roads, inadvertently overlooking residential areas such as parks and campuses that exhibit entirely distinct characteristics. In light of this gap, we propose CORP, which stands as the first public benchmark dataset tailored for multi-modal roadside perception tasks under campus scenarios. Collected in a university campus, CORP consists of over 205k images plus 102k point clouds captured from 18 cameras and 9 LiDAR sensors. These sensors with different configurations are mounted on roadside utility poles to provide diverse viewpoints within the campus region. The annotations of CORP encompass multi-dimensional information beyond 2D and 3D bounding boxes, providing extra support for 3D seamless tracking and instance segmentation with unique IDs and pixel masks for identifying targets, to enhance the understanding of objects and their behaviors distributed across the campus premises. Unlike other roadside datasets about urban traffic, CORP extends the spectrum to highlight the challenges for multi-modal perception in campuses and other residential areas.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CORP, claimed as the first public multi-modal roadside perception benchmark for campus scenarios. It comprises over 205k images and 102k point clouds captured by 18 cameras and 9 LiDARs mounted on utility poles, with annotations extending beyond 2D/3D bounding boxes to include unique IDs for 3D tracking and pixel masks for instance segmentation. The central claim is that existing datasets focus on urban arterial roads while overlooking distinct characteristics of residential/campus areas, and that CORP fills this gap with its sensor diversity and multi-dimensional labels.
Significance. If the novelty and distinctness claims hold, CORP would offer a useful addition to the literature by providing data from an underrepresented environment (university campus) with rich annotations supporting tracking and segmentation tasks. The multi-view, multi-modal sensor configuration is a concrete strength for perception research.
major comments (2)
- [Abstract] Abstract: The assertion that CORP 'stands as the first public benchmark dataset tailored for multi-modal roadside perception tasks under campus scenarios' and that campuses 'exhibit entirely distinct characteristics' is load-bearing for the contribution but is not supported by any quantitative comparisons (object density, trajectory statistics, scene diversity metrics, or similar) to prior roadside datasets; without these, the gap-filling claim cannot be evaluated.
- [Data annotation / labeling sections] Annotation description (full text, data collection and labeling sections): No details are provided on annotation validation procedures, quality control, or metrics such as inter-annotator agreement; this directly affects the claim that the 'multi-dimensional information' and 'unique IDs and pixel masks' meaningfully enhance understanding of objects and behaviors.
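The quantitative comparison requested in the first major comment could be approached along the following lines. This is a rough sketch under stated assumptions: the `(frame_id, class_name)` label format and the sample values are hypothetical, and mean objects per frame and Shannon entropy of the class distribution are only two possible choices of density and diversity metrics, not ones the paper reports.

```python
import math
from collections import Counter

# Hypothetical (frame_id, class_name) labels; not CORP's actual format.
labels = [
    (0, "pedestrian"), (0, "pedestrian"), (0, "cyclist"),
    (1, "pedestrian"), (1, "car"),
    (2, "pedestrian"),
]


def objects_per_frame(records):
    """Mean number of labeled objects per frame that has at least one label."""
    per_frame = Counter(frame for frame, _ in records)
    return sum(per_frame.values()) / len(per_frame)


def class_entropy(records):
    """Shannon entropy (in bits) of the object class distribution."""
    counts = Counter(cls for _, cls in records)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


density = objects_per_frame(labels)
entropy = class_entropy(labels)
```

Running the same two functions on a campus dataset and on an arterial-road dataset would give directly comparable numbers, provided both are mapped to a shared class taxonomy first.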
minor comments (2)
- [Abstract] Abstract: Typo/grammar: 'the majority of their concentrates is' should be rephrased, for example as 'the majority of them concentrate' or 'most of their focus is'.
- [Abstract] Abstract: The phrasing 'Unlike other roadside datasets about urban traffic' is imprecise; consider 'Unlike other roadside datasets focused on urban traffic'.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below and commit to revisions that directly respond to the concerns raised.
Point-by-point responses
- Referee: [Abstract] Abstract: The assertion that CORP 'stands as the first public benchmark dataset tailored for multi-modal roadside perception tasks under campus scenarios' and that campuses 'exhibit entirely distinct characteristics' is load-bearing for the contribution but is not supported by any quantitative comparisons (object density, trajectory statistics, scene diversity metrics, or similar) to prior roadside datasets; without these, the gap-filling claim cannot be evaluated.
  Authors: We agree that the distinctness claim would be strengthened by quantitative evidence. In the revised manuscript we will insert a new comparison subsection (or table) reporting concrete metrics (object density per frame, average trajectory duration, number of unique object classes per scene, and scene entropy measures) computed on CORP versus representative prior roadside datasets focused on arterial roads. This addition will allow readers to evaluate the claimed gap directly.
  Revision: yes
- Referee: [Data annotation / labeling sections] Annotation description (full text, data collection and labeling sections): No details are provided on annotation validation procedures, quality control, or metrics such as inter-annotator agreement; this directly affects the claim that the 'multi-dimensional information' and 'unique IDs and pixel masks' meaningfully enhance understanding of objects and behaviors.
  Authors: We accept that the absence of quality-control details weakens the annotation claims. The revised manuscript will add a concise subsection under Data Annotation that describes the multi-stage validation workflow (initial labeling followed by independent review by two additional annotators), the resolution protocol for disagreements, and the computed inter-annotator agreement scores (e.g., IoU for boxes and masks, ID consistency for tracking).
  Revision: yes
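To make the inter-annotator agreement score mentioned in the second response concrete, here is a minimal sketch that computes mean IoU between two annotators' 2D boxes. The box format `(x1, y1, x2, y2)`, the object-ID keying, and the sample values are assumptions for illustration; the rebuttal does not specify how agreement would actually be computed.

```python
def box_iou(a, b):
    """IoU of two axis-aligned 2D boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_agreement(boxes_a, boxes_b):
    """Mean IoU over objects labeled by both annotators, keyed by object ID."""
    shared = boxes_a.keys() & boxes_b.keys()
    if not shared:
        return 0.0
    return sum(box_iou(boxes_a[k], boxes_b[k]) for k in shared) / len(shared)


# Hypothetical labels from two annotators for the same frame.
annotator_a = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
annotator_b = {1: (0, 0, 10, 10), 2: (25, 20, 35, 30)}
score = mean_agreement(annotator_a, annotator_b)
```

Objects labeled by only one annotator are ignored here; a fuller protocol would likely count them as disagreements and report them separately.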
Circularity Check
No significant circularity
The paper introduces a new multi-modal roadside perception dataset (CORP) collected on a university campus. It contains no equations, derivations, fitted parameters, predictions, or uniqueness theorems. The central claim—that CORP is the first public benchmark for campus scenarios because prior datasets focus on urban arterial roads—is presented as an empirical observation rather than a derived result. No load-bearing step reduces by construction to the paper's own inputs, self-citations, or ansatzes. The contribution is self-contained as a data-collection and annotation effort.