arxiv: 2602.05414 · v1 · submitted 2026-02-05 · 💻 cs.CV

Recognition: 2 theorem links

· Lean Theorem

TSBOW: Traffic Surveillance Benchmark for Occluded Vehicles Under Various Weather Conditions

Ngoc Doan-Minh Huynh , Duong Nguyen-Ngoc Tran , Long Hoang Pham , Tai Huu-Phuong Tran , Hyung-Joon Jeon , Huy-Hung Nguyen , Duong Khac Vu , Hyung-Min Jeon

show 5 more authors

Son Hong Phan Quoc Pham-Nam Ho Chi Dai Tran Trinh Le Ba Khanh Jae Wook Jeon

Authors on Pith no claims yet

Pith reviewed 2026-05-16 07:03 UTC · model grok-4.3

classification 💻 cs.CV

keywords traffic surveillanceoccluded vehiclesweather conditionsbenchmark datasetobject detectionCCTVintelligent transportation

0 comments

The pith

The TSBOW dataset supplies 32 hours of real urban traffic video to benchmark detection of occluded vehicles in extreme weather.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces TSBOW as a new benchmark dataset to improve occluded vehicle detection in various weather conditions. It consists of over 32 hours of CCTV footage from dense urban areas, including more than 48,000 manually annotated frames and 3.2 million semi-labeled frames covering eight classes of traffic participants. The dataset addresses limitations in existing resources that only cover light weather issues and aims to support better intelligent transportation systems by providing diverse road types, scales, and viewpoints. If successful, it would allow models to handle real-world challenges from occlusions and adverse weather like heavy rain or snow, potentially lowering accident rates.

Core claim

The Traffic Surveillance Benchmark for Occluded vehicles under various Weather conditions (TSBOW) is a comprehensive dataset with over 32 hours of real-world traffic data, more than 48,000 manually annotated and 3.2 million semi-labeled frames, bounding boxes for eight traffic participant classes, designed to enhance occluded vehicle detection across diverse annual weather scenarios and establish an object detection benchmark.

What carries the argument

The TSBOW dataset itself, which provides annotated CCTV traffic footage under various weather to serve as a testbed for object detection algorithms.

If this is right

Improved object detection models for traffic monitoring in bad weather.
Better performance of intelligent transportation systems during extreme weather events.
New research directions for handling occlusions in vehicle detection.
Public availability enables widespread testing and comparison of algorithms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Future work could combine TSBOW with synthetic data generation for even more extreme conditions.
Testing on this dataset might highlight the need for weather-specific preprocessing in detection pipelines.
The emphasis on micromobility and pedestrians opens uses in broader urban monitoring systems.

Load-bearing premise

The 32 hours of urban footage and manual annotations accurately capture the full range of real-world occlusion and extreme-weather challenges without significant collection or labeling biases.

What would settle it

If state-of-the-art detectors trained on TSBOW show no improvement over those trained on prior datasets when tested on new extreme weather videos collected independently.

Figures

Figures reproduced from arXiv: 2602.05414 by Chi Dai Tran, Duong Khac Vu, Duong Nguyen-Ngoc Tran, Huy-Hung Nguyen, Hyung-Joon Jeon, Hyung-Min Jeon, Jae Wook Jeon, Long Hoang Pham, Ngoc Doan-Minh Huynh, Quoc Pham-Nam Ho, Son Hong Phan, Tai Huu-Phuong Tran, Trinh Le Ba Khanh.

**Figure 1.** Figure 1: Scenes from the TSBOW dataset, comprising 198 videos recorded across four distinct scenarios spanning all seasons (sunny/cloudy, haze/fog, rain, snow) over a year. The dataset emphasizes adverse weather conditions and densely populated urban areas with heavy traffic, addressing significant challenges in image degradation and vehicle occlusion. For object detection task, most existing traffic surveillance b… view at source ↗

**Figure 2.** Figure 2: Detailed overview of the data collection and annotation pipeline. The process commences with the recording and categorization of videos during the data collection phase. Subsequently, the videos are preprocessed and allocated to a team of annotators for manual labeling. Next, a state-of-the-art model is fine-tuned to automatically annotate the remaining frames. The resulting annotations are then verified a… view at source ↗

**Figure 3.** Figure 3: Suwon recording locations in TSBOW dataset. [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: An example of detecting vehicles in heavy snow by using the YOLOv8x and YOLO11x models [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗

**Figure 5.** Figure 5: An example of road types (RT) and scales (S) [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

**Figure 6.** Figure 6: Visualization of annotated instances of different classes in TSBOW dataset. [PITH_FULL_IMAGE:figures/full_fig_p005_6.png] view at source ↗

**Figure 7.** Figure 7: Challenges by Weather Conditions in TSBOW. [PITH_FULL_IMAGE:figures/full_fig_p006_7.png] view at source ↗

**Figure 8.** Figure 8: Selected scenes for comparison with other datasets [PITH_FULL_IMAGE:figures/full_fig_p009_8.png] view at source ↗

**Figure 1.** Figure 1: Comparison with other datasets about weather conditions and scales [PITH_FULL_IMAGE:figures/full_fig_p014_1.png] view at source ↗

**Figure 2.** Figure 2: Comparison weather conditions and disaster [PITH_FULL_IMAGE:figures/full_fig_p015_2.png] view at source ↗

**Figure 3.** Figure 3: Car accident in the TSBOW dataset 2.3 Train-Test Split Videos capture intervals between red-light phases, each lasts approximately two minutes. The three-minute training splits include vehicle flows and queues. The extended five-minute testing splits capture more flows for comprehensive evaluation. Also, varied CCTVs provide diverse viewpoints, mitigating over-optimistic results from similarities. 2.4 Diff… view at source ↗

**Figure 4.** Figure 4: Examples of Road Types and Scales in the TSBOW dataset [PITH_FULL_IMAGE:figures/full_fig_p016_4.png] view at source ↗

**Figure 5.** Figure 5: Intersections under different viewpoints in diverse weather conditions [PITH_FULL_IMAGE:figures/full_fig_p016_5.png] view at source ↗

**Figure 6.** Figure 6: Model performances under different weather conditions [PITH_FULL_IMAGE:figures/full_fig_p018_6.png] view at source ↗

**Figure 7.** Figure 7: Model performances when training on different datasets [PITH_FULL_IMAGE:figures/full_fig_p019_7.png] view at source ↗

read the original abstract

Global warming has intensified the frequency and severity of extreme weather events, which degrade CCTV signal and video quality while disrupting traffic flow, thereby increasing traffic accident rates. Existing datasets, often limited to light haze, rain, and snow, fail to capture extreme weather conditions. To address this gap, this study introduces the Traffic Surveillance Benchmark for Occluded vehicles under various Weather conditions (TSBOW), a comprehensive dataset designed to enhance occluded vehicle detection across diverse annual weather scenarios. Comprising over 32 hours of real-world traffic data from densely populated urban areas, TSBOW includes more than 48,000 manually annotated and 3.2 million semi-labeled frames; bounding boxes spanning eight traffic participant classes from large vehicles to micromobility devices and pedestrians. We establish an object detection benchmark for TSBOW, highlighting challenges posed by occlusions and adverse weather. With its varied road types, scales, and viewpoints, TSBOW serves as a critical resource for advancing Intelligent Transportation Systems. Our findings underscore the potential of CCTV-based traffic monitoring, pave the way for new research and applications. The TSBOW dataset is publicly available at: https://github.com/SKKUAutoLab/TSBOW.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

TSBOW is a useful new dataset release for occluded vehicles in tougher weather, but the paper is light on validation details.

read the letter

TSBOW is a useful new dataset release for occluded vehicles in tougher weather, but the paper is light on validation details. The core contribution is the collection of over 32 hours of real urban CCTV footage with annotations across eight classes that include micromobility devices and pedestrians, plus a benchmark setup to expose detection failures under occlusion and adverse conditions. This directly targets the gap the abstract notes in existing traffic datasets that stop at light haze, rain, or snow. Releasing the data publicly and including scale from 48k manual plus 3.2M semi-labeled frames gives the field something concrete to work with for intelligent transportation systems research. The motivation around global warming effects on CCTV and traffic flow is straightforward and relevant. The soft spots sit in the missing mechanics. The writeup supplies no breakdown of weather-condition distribution, occlusion severity statistics, annotation protocols, inter-annotator agreement, or checks that the chosen sites and time windows actually hit rare extremes without bias. The semi-labeling step is described only in passing, so its error characteristics stay unknown. These omissions make it difficult to judge how representative the 32 hours really are. This paper is mainly for researchers building or testing object detectors for traffic surveillance who need harder real-world cases. A reader focused on robustness under weather and occlusion would extract value from the data release itself even if the supporting text stays brief. It deserves peer review because the dataset idea addresses a practical need and the release can be refined with fuller documentation on collection and quality.

Referee Report

3 major / 1 minor

Summary. The paper introduces the TSBOW dataset, a collection of over 32 hours of real-world urban CCTV traffic footage containing more than 48,000 manually annotated frames and 3.2 million semi-labeled frames across eight traffic participant classes. It is positioned as filling a gap in existing datasets by covering occluded vehicles under diverse and extreme weather conditions, and the authors establish an object detection benchmark to demonstrate the challenges posed by occlusions and adverse weather.

Significance. If the dataset collection, annotation quality, and coverage of extreme weather are rigorously validated, TSBOW would provide a valuable new resource for developing robust detection models in intelligent transportation systems, addressing limitations of prior datasets restricted to mild conditions.

major comments (3)

[Abstract] Abstract: the claim that TSBOW captures 'extreme weather conditions' and 'diverse annual weather scenarios' is unsupported because the text supplies no quantitative distribution of weather types, occlusion severity levels, or site/time sampling strategy.
[Dataset construction] Dataset construction section: no annotation protocol, inter-annotator agreement statistics, or quality-control procedures are described for the 48k manual labels or the 3.2M semi-labeled frames, which is load-bearing for any claim that the benchmark reliably measures real-world performance.
[Benchmark] Benchmark results: the manuscript states that an object detection benchmark is established but reports neither baseline mAP scores, specific model architectures tested, nor how the semi-labeled data was incorporated into training or evaluation, preventing assessment of the dataset's utility.

minor comments (1)

The GitHub link is given but the manuscript should include a brief data-card summary (format, splits, license) directly in the text for readers who cannot immediately access the repository.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We appreciate the opportunity to clarify and strengthen the presentation of the TSBOW dataset. Below we address each major comment point by point.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that TSBOW captures 'extreme weather conditions' and 'diverse annual weather scenarios' is unsupported because the text supplies no quantitative distribution of weather types, occlusion severity levels, or site/time sampling strategy.

Authors: We agree that the abstract would benefit from quantitative support for these claims. In the revised version, we will expand the abstract slightly and add a dedicated subsection in the dataset description that provides the distribution of weather conditions (e.g., percentages for heavy rain, fog, snow, etc.), occlusion severity levels, and details on the site and time sampling strategy used for data collection. revision: yes
Referee: [Dataset construction] Dataset construction section: no annotation protocol, inter-annotator agreement statistics, or quality-control procedures are described for the 48k manual labels or the 3.2M semi-labeled frames, which is load-bearing for any claim that the benchmark reliably measures real-world performance.

Authors: We acknowledge that the manuscript lacks a detailed description of the annotation process. We will add a new subsection detailing the annotation protocol, including the guidelines provided to annotators, the tools used, inter-annotator agreement metrics (such as Cohen's kappa or IoU thresholds for agreement), and quality control procedures including review by experts and consistency checks. For the semi-labeled frames, we will clarify the semi-supervised labeling approach and any validation steps. revision: yes
Referee: [Benchmark] Benchmark results: the manuscript states that an object detection benchmark is established but reports neither baseline mAP scores, specific model architectures tested, nor how the semi-labeled data was incorporated into training or evaluation, preventing assessment of the dataset's utility.

Authors: The current manuscript introduces the benchmark by describing the evaluation setup and highlighting the challenges, but does not include numerical baseline results. To address this, we will include baseline experiments using standard object detection models such as Faster R-CNN and YOLOv8, reporting mAP scores under different weather and occlusion conditions. We will also specify how the semi-labeled data is used (e.g., for training with pseudo-labels or as additional unlabeled data) and provide details on the train/validation/test splits. revision: yes

Circularity Check

0 steps flagged

Dataset release paper exhibits no circularity in derivations

full rationale

The paper introduces a new benchmark dataset TSBOW consisting of real-world traffic footage and annotations. There are no equations, predictions, or derivations claimed that could reduce to self-referential inputs. The contribution is empirical data collection and benchmarking, which is self-contained without circular logic.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper introduces no free parameters, mathematical axioms, or invented entities; it consists of empirical video collection and standard annotation practices.

pith-pipeline@v0.9.0 · 5581 in / 1114 out tokens · 41607 ms · 2026-05-16T07:03:57.061225+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages · 1 internal anchor

[1]

Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L

doi:10.1109/CVPR.2009.5206848. Zhengxia Zou, Keyan Chen, Zhenwei Shi, Yuhong Guo, and Jieping Ye. Object detection in 20 years: A survey. Proceedings of the IEEE, 111(3):257–276,

work page doi:10.1109/cvpr.2009.5206848 2009
[2]

Lawrence Zitnick

Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft coco: Common objects in context. In David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars, editors,Computer Vision – ECCV 2014, pages 740–755, Cham,

work page 2014
[3]

Lvis: A dataset for large vocabulary instance segmentation

Agrim Gupta, Piotr Dollár, and Ross Girshick. Lvis: A dataset for large vocabulary instance segmentation. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5351–5359,

work page 2019
[4]

Daquan Zhou, Kai Wang, Jianyang Gu, Xiangyu Peng, Dongze Lian, Yifan Zhang, Yang You, and Jiashi Feng

doi:10.1109/CVPR.2019.00550. Daquan Zhou, Kai Wang, Jianyang Gu, Xiangyu Peng, Dongze Lian, Yifan Zhang, Yang You, and Jiashi Feng. Dataset quantization. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 17205–17216,

work page doi:10.1109/cvpr.2019.00550 2019
[5]

doi:https://doi.org/10.1016/j.cviu.2020.102907

ISSN 1077-3142. doi:https://doi.org/10.1016/j.cviu.2020.102907. URL https://www.sciencedirect.com/science/article/pii/S1077314220300035. Zhihong Sun, Guoheng Wei, Wei Fu, Mang Ye, Kui Jiang, Chao Liang, Tingting Zhu, Tao He, and Mithun Mukherjee. Multiple pedestrian tracking under occlusion: A survey and outlook.IEEE Transactions on Circuits and Systems f...

work page doi:10.1016/j.cviu.2020.102907 2020
[6]

doi:10.1145/3664647.3681153

ISBN 9798400706868. doi:10.1145/3664647.3681153. URL https://doi.org/10.1145/3664647.3681153. 11 TSBOW: Traffic Surveillance Benchmark for Occluded Vehicles Under Various Weather Conditions Aayush Atul Verma, Bharatesh Chakravarthi, Arpitsinh Vaghela, Hua Wei, and Yezhou Yang. Etram: Event-based traffic monitoring dataset. InProceedings - 2024 IEEE/CVF Co...

work page doi:10.1145/3664647.3681153 2024
[7]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp

doi:10.1109/CVPR52733.2024.02136. Mark Weber, Jun Xie, Maxwell D. Collins, Yukun Zhu, Paul V oigtlaender, Hartwig Adam, Bradley Green, Andreas Geiger, Bastian Leibe, Daniel Cremers, Aljosa Osep, Laura Leal-Taixé, and Liang-Chieh Chen. STEP: segmenting and tracking every pixel.CoRR, abs/2102.11859,

work page doi:10.1109/cvpr52733.2024.02136 2024
[8]

Haiji Liang and Ruize Han

URL https://arxiv.org/abs/2102.11859. Haiji Liang and Ruize Han. Ovt-b: A new large-scale benchmark for open-vocabulary multi-object tracking. In A. Glober- son, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors,Advances in Neural Information Processing Systems, volume 37, pages 14849–14863. Curran Associates, Inc.,

work page arXiv
[9]

cc/paper_files/paper/2024/file/1adeeac24ce6168e20bcee85645720e9-Paper-Datasets_and_Benchmarks_Track.pdf

URL https://proceedings.neurips. cc/paper_files/paper/2024/file/1adeeac24ce6168e20bcee85645720e9-Paper-Datasets_and_Benchmarks_Track.pdf. Carlos A. Diaz-Ruiz, Youya Xia, Yurong You, Jose Nino, Junan Chen, Josephine Monica, Xiangyu Chen, Katie Luo, Yan Wang, Marc Emond, Wei-Lun Chao, Bharath Hariharan, Kilian Q. Weinberger, and Mark Campbell. Ithaca365: Da...

work page 2024
[10]

URL https://proceedings.neurips.cc/paper_files/paper/2022/file/ 58168e8a92994655d6da3939e7cc0918-Paper-Datasets_and_Benchmarks.pdf. Canon. Canon eos 500d. https://www.canon.co.uk/for_home/product_finder/cameras/digital_slr/eos_550d/,

work page 2022
[11]

doi:10.1109/TITS.2018.2872502

ISSN 1524-9050. doi:10.1109/TITS.2018.2872502. Ross Girshick. Fast r-cnn. InProceedings of the IEEE international conference on computer vision, pages 1440–1448,

work page doi:10.1109/tits.2018.2872502 2018
[12]

YOLOv8 to YOLO11: A Comprehensive Architecture In-depth Comparative Review

ISSN 2075-1702. doi:10.3390/machines11070677. URL https://www.mdpi.com/2075-1702/11/7/677. Priyanto Hidayatullah, Nurjannah Syakrani, Muhammad Rizqi Sholahuddin, Trisna Gelar, and Refdinal Tubagus. Yolov8 to yolo11: A comprehensive architecture in-depth comparative review.arXiv preprint arXiv:2501.13400,

work page internal anchor Pith review Pith/arXiv arXiv doi:10.3390/machines11070677 2075
[13]

Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, and Mubarak Shah

URL https://arxiv.org/abs/2502.14740. Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, and Mubarak Shah. Transformers in vision: A survey.ACM computing surveys (CSUR), 54(10s):1–41,

work page arXiv
[14]

Disaster

13 TSBOW: Traffic Surveillance Benchmark for Occluded Vehicles Under Various Weather Conditions Figure 1: Comparison with other datasets about weather conditions and scales Supplementary Materials 1 Related Works Fig. 1 compares weather conditions and scales across the TSBOW dataset and other datasets, including UA VDT, UA-DETRAC, and AAURainSnow. UA-DETR...

work page 2023