TSBOW -- Traffic Surveillance Benchmark for Occluded Vehicles Under Various Weather Conditions

Chi Dai Tran; Duong Khac Vu; Duong Nguyen-Ngoc Tran; Huy-Hung Nguyen; Hyung-Joon Jeon; Hyung-Min Jeon; Jae Wook Jeon; Long Hoang Pham; Ngoc Doan-Minh Huynh; Quoc Pham-Nam Ho

arxiv: 2602.05414 · v2 · pith:XO7D4KV3new · submitted 2026-02-05 · 💻 cs.CV

Ngoc Doan-Minh Huynh, Duong Nguyen-Ngoc Tran, Long Hoang Pham, Tai Huu-Phuong Tran, Hyung-Joon Jeon, Huy-Hung Nguyen, Duong Khac Vu, Hyung-Min Jeon, Son Hong Phan, Quoc Pham-Nam Ho, Chi Dai Tran, Trinh Le Ba Khanh, Jae Wook Jeon

Pith reviewed 2026-05-21 14:09 UTC · model grok-4.3

classification 💻 cs.CV

keywords occluded vehicle detection · traffic surveillance · extreme weather · object detection dataset · intelligent transportation systems · CCTV video · urban traffic monitoring · computer vision benchmark

The pith

TSBOW introduces a dataset of over 32 hours of real urban traffic video with annotations for occluded vehicles across extreme weather.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents the Traffic Surveillance Benchmark for Occluded vehicles under various Weather conditions, or TSBOW, as a large real-world collection meant to support better vehicle detection when objects are partly hidden or weather is severe. It draws from more than 32 hours of CCTV footage in busy cities and supplies over 48,000 hand-labeled frames plus millions of additional semi-labeled ones across eight traffic categories. The work argues that prior datasets stop at mild haze, rain, or snow and therefore miss the stronger degradations now common with global warming. By including varied road layouts, camera angles, and participant scales, the dataset is positioned to help train systems that keep traffic flowing safely even when visibility drops.

Core claim

The TSBOW dataset supplies more than 32 hours of real-world urban traffic footage, over 48,000 manually annotated frames, and 3.2 million semi-labeled frames covering eight classes from large vehicles to pedestrians, collected under the full range of annual weather conditions to improve occluded vehicle detection.

What carries the argument

The TSBOW dataset, a collection of real-world CCTV traffic videos with bounding-box annotations for occluded participants under diverse weather.

If this is right

Object detectors can be retrained to maintain performance when vehicles are partially hidden or when visibility is reduced by heavy weather.
CCTV monitoring systems for intelligent transportation can operate more reliably across all seasons.
Researchers gain a testbed for measuring how different weather types and occlusion levels affect detection of cars, trucks, bikes, and pedestrians.
New algorithms that adapt to signal degradation or recover missing parts of vehicles can be developed and compared on the same data.
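
One way to use the dataset as such a testbed is to break a detector's recall down by weather condition. The sketch below is a minimal illustration, not the paper's evaluation protocol: the frame dictionary keys (`weather`, `gt`, `pred`), the (x1, y1, x2, y2) box format, the greedy matching rule, and the 0.5 IoU threshold are all assumptions made for this example, and the sample boxes are invented.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_by_condition(frames, iou_thr=0.5):
    """Return {weather: recall} using greedy one-to-one box matching.

    `frames` is an iterable of dicts with the hypothetical keys
    'weather' (str), 'gt' (list of boxes) and 'pred' (list of boxes).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for frame in frames:
        unused = list(frame["pred"])  # each prediction may match only once
        for gt_box in frame["gt"]:
            totals[frame["weather"]] += 1
            best = max(unused, key=lambda p: iou(gt_box, p), default=None)
            if best is not None and iou(gt_box, best) >= iou_thr:
                hits[frame["weather"]] += 1
                unused.remove(best)
    return {w: hits[w] / totals[w] for w in totals}


# Invented example: one vehicle found in clear weather, one missed in snow.
example_frames = [
    {"weather": "sunny", "gt": [(0, 0, 10, 10)], "pred": [(1, 1, 10, 10)]},
    {"weather": "snow", "gt": [(0, 0, 10, 10), (20, 20, 30, 30)],
     "pred": [(0, 0, 10, 10)]},
]
print(recall_by_condition(example_frames))
```

Grouping by an occlusion level instead of weather would only require swapping the grouping key, provided the annotations carry such a field.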

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The dataset could be paired with weather simulation tools to create even larger training sets without new collection effort.
It offers a way to quantify how much each weather factor hurts current detectors and to guide targeted improvements.
Cities could use the data to decide where to add or upgrade cameras for year-round traffic oversight.

Load-bearing premise

That real urban footage gathered under extreme weather and occlusions will provide the variations needed to train detectors that generalize beyond what lighter-weather datasets already allow.

What would settle it

An experiment in which object detectors trained on TSBOW show no accuracy gain on occluded vehicles during heavy rain, snow, or fog compared with detectors trained only on earlier public datasets.
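
A hedged sketch of how the outcome of such an experiment could be scored: given per-clip scores (for example AP on occluded vehicles in heavy weather) for a baseline detector and a TSBOW-trained detector, a paired bootstrap gives an interval for the mean gain. The paper does not prescribe this procedure; the function name, the per-clip pairing, the 95% percentile interval, and the numbers in the usage lines are assumptions for illustration only.

```python
import random


def bootstrap_gain(baseline, candidate, n_resamples=2000, seed=0):
    """Paired bootstrap over evaluation clips.

    baseline[i] and candidate[i] are the scores of the two detectors on
    clip i. Returns (mean_gain, ci_low, ci_high), where the interval is
    a 95% percentile interval of the resampled mean difference.
    """
    if len(baseline) != len(candidate) or not baseline:
        raise ValueError("need two equal-length, non-empty score lists")
    diffs = [c - b for b, c in zip(baseline, candidate)]
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    low = means[int(0.025 * n_resamples)]
    high = means[int(0.975 * n_resamples) - 1]
    return sum(diffs) / n, low, high


# Placeholder scores, not results from the paper.
baseline_scores = [0.40, 0.35, 0.50, 0.45]
tsbow_scores = [0.55, 0.50, 0.58, 0.60]
gain, ci_low, ci_high = bootstrap_gain(baseline_scores, tsbow_scores)
print(f"mean gain {gain:.3f}, 95% interval [{ci_low:.3f}, {ci_high:.3f}]")
```

An interval that straddles zero would correspond to the "no accuracy gain" outcome described above; an interval clearly above zero would count against it.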

Figures

Figures reproduced from arXiv: 2602.05414 by Chi Dai Tran, Duong Khac Vu, Duong Nguyen-Ngoc Tran, Huy-Hung Nguyen, Hyung-Joon Jeon, Hyung-Min Jeon, Jae Wook Jeon, Long Hoang Pham, Ngoc Doan-Minh Huynh, Quoc Pham-Nam Ho, Son Hong Phan, Tai Huu-Phuong Tran, Trinh Le Ba Khanh.

**Figure 1.** Scenes from the TSBOW dataset, comprising 198 videos recorded across four distinct scenarios spanning all seasons (sunny/cloudy, haze/fog, rain, snow) over a year. The dataset emphasizes adverse weather conditions and densely populated urban areas with heavy traffic, addressing significant challenges in image degradation and vehicle occlusion. For object detection task, most existing traffic surveillance b…

**Figure 2.** Detailed overview of the data collection and annotation pipeline. The process commences with the recording and categorization of videos during the data collection phase. Subsequently, the videos are preprocessed and allocated to a team of annotators for manual labeling. Next, a state-of-the-art model is fine-tuned to automatically annotate the remaining frames. The resulting annotations are then verified a…

**Figure 3.** Suwon recording locations in TSBOW dataset.

**Figure 4.** An example of detecting vehicles in heavy snow by using the YOLOv8x and YOLO11x models.

**Figure 5.** An example of road types (RT) and scales (S).

**Figure 6.** Visualization of annotated instances of different classes in TSBOW dataset.

**Figure 7.** Challenges by Weather Conditions in TSBOW.

**Figure 8.** Selected scenes for comparison with other datasets.

**Figure 1.** Comparison with other datasets about weather conditions and scales.

**Figure 2.** Comparison weather conditions and disaster.

**Figure 3.** Car accident in the TSBOW dataset.

2.3 Train-Test Split. Videos capture intervals between red-light phases, each lasting approximately two minutes. The three-minute training splits include vehicle flows and queues. The extended five-minute testing splits capture more flows for comprehensive evaluation. Also, varied CCTVs provide diverse viewpoints, mitigating over-optimistic results from similarities. 2.4 Diff…

**Figure 4.** Examples of Road Types and Scales in the TSBOW dataset.

**Figure 5.** Intersections under different viewpoints in diverse weather conditions.

**Figure 6.** Model performances under different weather conditions.

**Figure 7.** Model performances when training on different datasets.

Original abstract

Global warming has intensified the frequency and severity of extreme weather events, which degrade CCTV signal and video quality while disrupting traffic flow, thereby increasing traffic accident rates. Existing datasets, often limited to light haze, rain, and snow, fail to capture extreme weather conditions. To address this gap, this study introduces the Traffic Surveillance Benchmark for Occluded vehicles under various Weather conditions (TSBOW), a comprehensive dataset designed to enhance occluded vehicle detection across diverse annual weather scenarios. Comprising over 32 hours of real-world traffic data from densely populated urban areas, TSBOW includes more than 48,000 manually annotated and 3.2 million semi-labeled frames; bounding boxes spanning eight traffic participant classes from large vehicles to micromobility devices and pedestrians. We establish an object detection benchmark for TSBOW, highlighting challenges posed by occlusions and adverse weather. With its varied road types, scales, and viewpoints, TSBOW serves as a critical resource for advancing Intelligent Transportation Systems. Our findings underscore the potential of CCTV-based traffic monitoring, pave the way for new research and applications. The TSBOW dataset is publicly available at: https://github.com/SKKUAutoLab/TSBOW.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

TSBOW releases a sizable real-world traffic video dataset aimed at occluded vehicles in varied weather, but the abstract gives little quantitative backing for how extreme those conditions are compared to prior work.

Hi,

This paper's main offering is TSBOW, a dataset with more than 32 hours of real-world urban traffic footage focused on occluded vehicles across different weather conditions. It includes over 48,000 manually annotated frames and 3.2 million semi-labeled ones, with bounding boxes for eight classes of traffic participants. They do a decent job highlighting the practical challenges for object detection in these scenarios and make the data available publicly. That could be helpful for people working on robust CCTV-based monitoring systems, especially as weather extremes become more common. The collection from densely populated areas with varied roads and viewpoints is a plus for real-world applicability.

The soft spot is that the description of the weather conditions stays qualitative. There are no numbers on things like visibility or rainfall intensity, and no tables comparing severity to previous datasets that covered only lighter conditions. Without those, it's difficult to judge how much new ground this actually breaks. The paper also skips details on annotation guidelines or quality checks.

This is the kind of thing that might interest researchers in applied computer vision for transportation. Someone looking for new data to train or evaluate models under adverse conditions could find it relevant. I think it should go to peer review so the methods and any benchmark results can be properly examined.

Best regards,

Referee Report

2 major / 2 minor

Summary. The paper introduces the TSBOW dataset for traffic surveillance, comprising over 32 hours of real-world urban CCTV footage under diverse weather conditions with occlusions. It includes more than 48,000 manually annotated frames and 3.2 million semi-labeled frames across eight traffic participant classes (from large vehicles to pedestrians and micromobility devices). The authors establish an object detection benchmark highlighting challenges from occlusions and adverse weather, and release the dataset publicly to support advances in intelligent transportation systems.

Significance. A dataset that demonstrably extends coverage to extreme weather conditions with reliable annotations and public availability would be a useful addition to the field, enabling better evaluation of robust detection methods for real-world traffic monitoring where existing resources are limited to milder conditions. The scale and multi-class coverage are positive aspects if the collection and labeling quality can be verified.

major comments (2)

[Dataset description] Dataset description section: The central claim that TSBOW fills a gap by capturing extreme weather (beyond light haze, rain, and snow in prior datasets) relies on categorical descriptions without quantitative severity metrics such as visibility ranges, precipitation rates, or side-by-side comparison tables. This leaves the comprehensiveness assertion unsupported by objective evidence.
[Annotation and collection protocol] Annotation and collection protocol subsection: No details are supplied on the data collection protocol (camera specs, locations, time spans), inter-annotator agreement, or validation steps for the 48,000 manual annotations. These omissions are load-bearing for the reliability of the established benchmark and the claim of a comprehensive resource.

minor comments (2)

[Abstract] The abstract mentions 'diverse annual weather scenarios' but does not list the specific weather categories used; adding an explicit enumeration would improve clarity.
[Benchmark results] Figure captions for the benchmark results could more explicitly note the evaluation metrics and any baseline methods compared.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript introducing the TSBOW dataset. We address each major comment in turn below, indicating where revisions will be made to improve the presentation of the dataset and its supporting details.

Referee: [Dataset description] Dataset description section: The central claim that TSBOW fills a gap by capturing extreme weather (beyond light haze, rain, and snow in prior datasets) relies on categorical descriptions without quantitative severity metrics such as visibility ranges, precipitation rates, or side-by-side comparison tables. This leaves the comprehensiveness assertion unsupported by objective evidence.

Authors: We agree that the current description would benefit from quantitative support. In the revised version, we will augment the Dataset description section with available quantitative severity metrics drawn from the collection period, such as measured visibility ranges and precipitation rates for the captured conditions. We will also insert a side-by-side comparison table against prior datasets to make the extension to more extreme weather conditions explicit and evidence-based. revision: yes
Referee: [Annotation and collection protocol] Annotation and collection protocol subsection: No details are supplied on the data collection protocol (camera specs, locations, time spans), inter-annotator agreement, or validation steps for the 48,000 manual annotations. These omissions are load-bearing for the reliability of the established benchmark and the claim of a comprehensive resource.

Authors: We concur that these protocol details are essential for establishing benchmark reliability. We will expand the Annotation and collection protocol subsection to include the camera specifications (model, resolution, and frame rate), the specific urban locations and time spans of footage collection, inter-annotator agreement statistics (e.g., percentage overlap and Cohen’s kappa), and the multi-stage validation process applied to the 48,000 manual annotations. revision: yes
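
For readers unfamiliar with the agreement statistic named in this response, the following is a minimal sketch of Cohen's kappa for two annotators' class labels. It assumes the two annotators' boxes have already been paired one-to-one so that each position in the lists refers to the same object; the class names and labels are invented for the example and are not taken from the TSBOW annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of objects given the same class.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal class frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four paired boxes.
annotator_1 = ["car", "car", "bus", "truck"]
annotator_2 = ["car", "bus", "bus", "truck"]
print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

Kappa only covers class agreement; agreement on box geometry would need a separate overlap measure such as IoU between the paired boxes.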

Circularity Check

0 steps flagged

No circularity in dataset introduction paper

This manuscript is a dataset release paper whose central contribution is the collection, annotation, and public release of the TSBOW traffic surveillance data (32 hours of urban footage, 48k manual + 3.2M semi-labeled frames, eight classes). No derivation chain, equations, fitted parameters, or predictive modeling appears in the provided text; the claims about filling gaps in extreme-weather coverage rest on the empirical description of the collection process itself rather than any self-referential definition, fitted-input prediction, or load-bearing self-citation. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

As a dataset paper the work rests on standard computer-vision data practices rather than new mathematical axioms or fitted parameters.

axioms (1)

domain assumption: Urban traffic footage collected under real conditions adequately represents the range of occlusion and weather challenges for general traffic surveillance.
The motivation and utility claims depend on this representativeness without further justification in the abstract.
