
arxiv: 2602.05414 · v2 · pith:XO7D4KV3new · submitted 2026-02-05 · 💻 cs.CV

TSBOW -- Traffic Surveillance Benchmark for Occluded Vehicles Under Various Weather Conditions

Pith reviewed 2026-05-21 14:09 UTC · model grok-4.3

classification 💻 cs.CV
keywords occluded vehicle detection · traffic surveillance · extreme weather · object detection dataset · intelligent transportation systems · CCTV video · urban traffic monitoring · computer vision benchmark

The pith

TSBOW introduces a dataset of over 32 hours of real urban traffic video with annotations for occluded vehicles across extreme weather.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents the Traffic Surveillance Benchmark for Occluded vehicles under various Weather conditions, or TSBOW, as a large real-world collection meant to support better vehicle detection when objects are partly hidden or weather is severe. It draws from more than 32 hours of CCTV footage in busy cities and supplies over 48,000 hand-labeled frames plus millions of additional semi-labeled ones across eight traffic categories. The work argues that prior datasets stop at mild haze, rain, or snow and therefore miss the stronger degradations now common with global warming. By including varied road layouts, camera angles, and participant scales, the dataset is positioned to help train systems that keep traffic flowing safely even when visibility drops.

Core claim

The TSBOW dataset supplies more than 32 hours of real-world urban traffic footage, over 48,000 manually annotated frames, and 3.2 million semi-labeled frames covering eight classes from large vehicles to pedestrians, collected under the full range of annual weather conditions to improve occluded vehicle detection.

What carries the argument

The TSBOW dataset, a collection of real-world CCTV traffic videos with bounding-box annotations for occluded participants under diverse weather.

If this is right

  • Object detectors can be retrained to maintain performance when vehicles are partially hidden or when visibility is reduced by heavy weather.
  • CCTV monitoring systems for intelligent transportation can operate more reliably across all seasons.
  • Researchers gain a testbed for measuring how different weather types and occlusion levels affect detection of cars, trucks, bikes, and pedestrians.
  • New algorithms that adapt to signal degradation or recover missing parts of vehicles can be developed and compared on the same data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The dataset could be paired with weather simulation tools to create even larger training sets without new collection effort.
  • It offers a way to quantify how much each weather factor hurts current detectors and to guide targeted improvements.
  • Cities could use the data to decide where to add or upgrade cameras for year-round traffic oversight.

Load-bearing premise

That real urban footage gathered under extreme weather and occlusions will provide the variations needed to train detectors that generalize beyond what lighter-weather datasets already allow.

What would settle it

An experiment in which object detectors trained on TSBOW show no accuracy gain on occluded vehicles during heavy rain, snow, or fog compared with detectors trained only on earlier public datasets.
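
One way such a comparison could be scored is sketched below: per-weather recall for two detectors on the same test frames, followed by the difference between them. This is a minimal illustration, not the paper's evaluation protocol. The function names, the 0.5 IoU threshold, the greedy matching rule, and the toy boxes are all assumptions made for the example.

```python
from collections import defaultdict


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_by_weather(frames, iou_threshold=0.5):
    """Return {weather: recall} for frames of (weather, gt_boxes, pred_boxes).

    Each ground-truth box is greedily matched to the unused prediction with
    the highest IoU. The threshold of 0.5 is an assumed, commonly used value.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for weather, gt_boxes, pred_boxes in frames:
        unused = list(pred_boxes)
        for gt in gt_boxes:
            totals[weather] += 1
            best = max(unused, key=lambda p: iou(gt, p), default=None)
            if best is not None and iou(gt, best) >= iou_threshold:
                hits[weather] += 1
                unused.remove(best)
    return {w: hits[w] / totals[w] for w in totals}


# Toy ground truth shared by both detectors (hypothetical boxes).
gt_snow = [(0, 0, 10, 10), (20, 20, 30, 30)]
gt_rain = [(0, 0, 10, 10)]

# Hypothetical outputs of a detector trained only on earlier datasets.
baseline = recall_by_weather([
    ("snow", gt_snow, [(0, 0, 10, 10)]),
    ("rain", gt_rain, [(1, 1, 10, 10)]),
])

# Hypothetical outputs of a detector trained on TSBOW.
tsbow_trained = recall_by_weather([
    ("snow", gt_snow, [(0, 0, 10, 10), (21, 21, 30, 30)]),
    ("rain", gt_rain, [(1, 1, 10, 10)]),
])

# Per-weather gain; the settling outcome would be no gain in any condition.
gain = {w: tsbow_trained[w] - baseline[w] for w in baseline}
print(gain)
```

A real test would use the detectors' actual outputs on held-out heavy rain, snow, and fog footage, and would more likely report average precision than plain recall.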

Figures

Figures reproduced from arXiv: 2602.05414 by Chi Dai Tran, Duong Khac Vu, Duong Nguyen-Ngoc Tran, Huy-Hung Nguyen, Hyung-Joon Jeon, Hyung-Min Jeon, Jae Wook Jeon, Long Hoang Pham, Ngoc Doan-Minh Huynh, Quoc Pham-Nam Ho, Son Hong Phan, Tai Huu-Phuong Tran, Trinh Le Ba Khanh.

Figure 1: Scenes from the TSBOW dataset, comprising 198 videos recorded across four distinct scenarios spanning all seasons (sunny/cloudy, haze/fog, rain, snow) over a year. The dataset emphasizes adverse weather conditions and densely populated urban areas with heavy traffic, addressing significant challenges in image degradation and vehicle occlusion. For object detection task, most existing traffic surveillance b…
Figure 2: Detailed overview of the data collection and annotation pipeline. The process commences with the recording and categorization of videos during the data collection phase. Subsequently, the videos are preprocessed and allocated to a team of annotators for manual labeling. Next, a state-of-the-art model is fine-tuned to automatically annotate the remaining frames. The resulting annotations are then verified a…
Figure 3: Suwon recording locations in TSBOW dataset.
Figure 4: An example of detecting vehicles in heavy snow by using the YOLOv8x and YOLO11x models
Figure 5: An example of road types (RT) and scales (S)
Figure 6: Visualization of annotated instances of different classes in TSBOW dataset.
Figure 7: Challenges by Weather Conditions in TSBOW.
Figure 8: Selected scenes for comparison with other datasets
Supplementary Figure 1: Comparison with other datasets about weather conditions and scales
Supplementary Figure 2: Comparison weather conditions and disaster
Supplementary Figure 3: Car accident in the TSBOW dataset
2.3 Train-Test Split: Videos capture intervals between red-light phases, each lasts approximately two minutes. The three-minute training splits include vehicle flows and queues. The extended five-minute testing splits capture more flows for comprehensive evaluation. Also, varied CCTVs provide diverse viewpoints, mitigating over-optimistic results from similarities. 2.4 Diff…
Supplementary Figure 4: Examples of Road Types and Scales in the TSBOW dataset
Supplementary Figure 5: Intersections under different viewpoints in diverse weather conditions
Supplementary Figure 6: Model performances under different weather conditions
Supplementary Figure 7: Model performances when training on different datasets
Abstract

Global warming has intensified the frequency and severity of extreme weather events, which degrade CCTV signal and video quality while disrupting traffic flow, thereby increasing traffic accident rates. Existing datasets, often limited to light haze, rain, and snow, fail to capture extreme weather conditions. To address this gap, this study introduces the Traffic Surveillance Benchmark for Occluded vehicles under various Weather conditions (TSBOW), a comprehensive dataset designed to enhance occluded vehicle detection across diverse annual weather scenarios. Comprising over 32 hours of real-world traffic data from densely populated urban areas, TSBOW includes more than 48,000 manually annotated frames and 3.2 million semi-labeled frames, with bounding boxes spanning eight traffic participant classes from large vehicles to micromobility devices and pedestrians. We establish an object detection benchmark for TSBOW, highlighting challenges posed by occlusions and adverse weather. With its varied road types, scales, and viewpoints, TSBOW serves as a critical resource for advancing Intelligent Transportation Systems. Our findings underscore the potential of CCTV-based traffic monitoring and pave the way for new research and applications. The TSBOW dataset is publicly available at: https://github.com/SKKUAutoLab/TSBOW.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces the TSBOW dataset for traffic surveillance, comprising over 32 hours of real-world urban CCTV footage under diverse weather conditions with occlusions. It includes more than 48,000 manually annotated frames and 3.2 million semi-labeled frames across eight traffic participant classes (from large vehicles to pedestrians and micromobility devices). The authors establish an object detection benchmark highlighting challenges from occlusions and adverse weather, and release the dataset publicly to support advances in intelligent transportation systems.

Significance. A dataset that demonstrably extends coverage to extreme weather conditions with reliable annotations and public availability would be a useful addition to the field, enabling better evaluation of robust detection methods for real-world traffic monitoring where existing resources are limited to milder conditions. The scale and multi-class coverage are positive aspects if the collection and labeling quality can be verified.

major comments (2)
  1. [Dataset description] Dataset description section: The central claim that TSBOW fills a gap by capturing extreme weather (beyond light haze, rain, and snow in prior datasets) relies on categorical descriptions without quantitative severity metrics such as visibility ranges, precipitation rates, or side-by-side comparison tables. This leaves the comprehensiveness assertion unsupported by objective evidence.
  2. [Annotation and collection protocol] Annotation and collection protocol subsection: No details are supplied on the data collection protocol (camera specs, locations, time spans), inter-annotator agreement, or validation steps for the 48,000 manual annotations. These omissions are load-bearing for the reliability of the established benchmark and the claim of a comprehensive resource.
minor comments (2)
  1. [Abstract] The abstract mentions 'diverse annual weather scenarios' but does not list the specific weather categories used; adding an explicit enumeration would improve clarity.
  2. [Benchmark results] Figure captions for the benchmark results could more explicitly note the evaluation metrics and any baseline methods compared.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript introducing the TSBOW dataset. We address each major comment in turn below, indicating where revisions will be made to improve the presentation of the dataset and its supporting details.

Point-by-point responses
  1. Referee: [Dataset description] Dataset description section: The central claim that TSBOW fills a gap by capturing extreme weather (beyond light haze, rain, and snow in prior datasets) relies on categorical descriptions without quantitative severity metrics such as visibility ranges, precipitation rates, or side-by-side comparison tables. This leaves the comprehensiveness assertion unsupported by objective evidence.

    Authors: We agree that the current description would benefit from quantitative support. In the revised version, we will augment the Dataset description section with available quantitative severity metrics drawn from the collection period, such as measured visibility ranges and precipitation rates for the captured conditions. We will also insert a side-by-side comparison table against prior datasets to make the extension to more extreme weather conditions explicit and evidence-based. revision: yes

  2. Referee: [Annotation and collection protocol] Annotation and collection protocol subsection: No details are supplied on the data collection protocol (camera specs, locations, time spans), inter-annotator agreement, or validation steps for the 48,000 manual annotations. These omissions are load-bearing for the reliability of the established benchmark and the claim of a comprehensive resource.

    Authors: We concur that these protocol details are essential for establishing benchmark reliability. We will expand the Annotation and collection protocol subsection to include the camera specifications (model, resolution, and frame rate), the specific urban locations and time spans of footage collection, inter-annotator agreement statistics (e.g., percentage overlap and Cohen’s kappa), and the multi-stage validation process applied to the 48,000 manual annotations. revision: yes
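
For readers unfamiliar with the agreement statistics named in this response, the sketch below computes raw percentage agreement and Cohen's kappa for two annotators who assigned class labels to the same set of boxes. It is an illustration only: the function name, the class names, and the toy labels are assumptions, and nothing here reflects the authors' actual annotation data or tooling.

```python
from collections import Counter


def agreement(labels_a, labels_b):
    """Return (observed agreement, Cohen's kappa) for two label sequences.

    labels_a[i] and labels_b[i] are the class labels that annotators A and B
    assigned to the same item i.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)

    # Fraction of items on which the two annotators chose the same class.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Agreement expected by chance, from each annotator's class frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    # If both annotators used one identical class throughout, kappa is
    # formally undefined; returning 1.0 here is a convention of this sketch.
    if expected == 1.0:
        return observed, 1.0
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical labels for four boxes (class names are illustrative).
annotator_a = ["car", "car", "truck", "pedestrian"]
annotator_b = ["car", "truck", "truck", "pedestrian"]

observed, kappa = agreement(annotator_a, annotator_b)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.3f}")
```

Kappa is lower than raw agreement because it discounts matches that would occur by chance given how often each annotator uses each class. For bounding boxes, the two annotators' boxes would first need to be paired, for example by overlap, before class labels can be compared this way.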

Circularity Check

0 steps flagged

No circularity in dataset introduction paper

full rationale

This manuscript is a dataset release paper whose central contribution is the collection, annotation, and public release of the TSBOW traffic surveillance data (32 hours of urban footage, 48k manual + 3.2M semi-labeled frames, eight classes). No derivation chain, equations, fitted parameters, or predictive modeling appears in the provided text; the claims about filling gaps in extreme-weather coverage rest on the empirical description of the collection process itself rather than any self-referential definition, fitted-input prediction, or load-bearing self-citation. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

As a dataset paper the work rests on standard computer-vision data practices rather than new mathematical axioms or fitted parameters.

axioms (1)
  • Domain assumption: urban traffic footage collected under real conditions adequately represents the range of occlusion and weather challenges for general traffic surveillance.
    The motivation and utility claims depend on this representativeness without further justification in the abstract.


