pith. machine review for the scientific record.

arxiv: 2604.01044 · v2 · submitted 2026-04-01 · 💻 cs.CV

Recognition: no theorem link

A global dataset of continuous urban dashcam driving

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 22:27 UTC · model grok-4.3

classification 💻 cs.CV
keywords: dashcam dataset · urban driving · computer vision · object detection · global dataset · YouTube videos · traffic scenes · cross-domain robustness

The pith

CROWD supplies more than 20,000 hours of routine urban dashcam video from 238 countries and territories.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CROWD, a collection of minute-long, unedited front-facing dashcam segments drawn from ordinary YouTube videos. The segments focus on everyday city driving and deliberately exclude crashes, aftermath footage, or heavily edited content. Coverage spans 7,103 places across all six inhabited continents, with manual labels for time of day and vehicle type plus automated detections and tracks for 80 common object classes. Researchers receive only video identifiers and segment boundaries, so the underlying videos stay on YouTube while annotations remain reproducible. The scale and curation choices aim to support testing of computer-vision models under varied real-world conditions rather than incident-specific scenes.
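Because only identifiers and boundaries are distributed, consuming the dataset amounts to resolving segment records back to their source videos. A minimal sketch, assuming a hypothetical record schema (the field names here are illustrative, not the release's actual format):

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """Hypothetical segment record; field names are illustrative,
    not the release's documented schema."""
    video_id: str      # YouTube video identifier
    start_s: float     # segment start within the video, seconds
    end_s: float       # segment end, seconds
    time_of_day: str   # manual label: "day" or "night"
    vehicle_type: str  # manual label, e.g. "car"

def watch_url(seg: Segment) -> str:
    """Resolve a segment record back to its source video on YouTube,
    seeking to the segment start."""
    return f"https://www.youtube.com/watch?v={seg.video_id}&t={int(seg.start_s)}s"

seg = Segment("a1b2c3d4e5f", 120.0, 180.0, "day", "car")
print(watch_url(seg))  # https://www.youtube.com/watch?v=a1b2c3d4e5f&t=120s
```

The derived annotations then attach to the `(video_id, start_s, end_s)` key, so experiments stay reproducible even though no video files change hands.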

Core claim

The authors assembled CROWD (City Road Observations With Dashcams) from 42,032 publicly available YouTube videos, yielding 51,753 manually screened segments that total 20,275.56 hours. These temporally contiguous clips represent routine urban driving in 238 countries and territories, each annotated at the segment level for day versus night and vehicle type, together with YOLOv11x detections and BoT-SORT tracks for all 80 MS-COCO classes.

What carries the argument

The CROWD dataset of manually curated, temporally contiguous urban dashcam segments screened from YouTube videos and augmented with manual labels plus machine-generated object detections and tracks.

If this is right

  • Models can be benchmarked for robustness across continents using pre-computed detections rather than requiring new data collection.
  • Studies of traffic interactions gain access to large-scale, unedited sequences that avoid contamination from crash-centric videos.
  • Reproducible experiments become possible worldwide because only video IDs and timestamps are distributed.
  • Geographic variation in driving scenes can be examined at the scale of thousands of distinct inhabited places.
  • Cross-domain transfer tests gain a ready-made split by continent, time of day, and vehicle type.
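The last point can be made concrete: given segment-level metadata, the splits fall out of a single group-by. A sketch with toy records (field names and values are assumptions, not the release's schema):

```python
from collections import defaultdict

# Illustrative segment metadata standing in for the release.
segments = [
    {"video_id": "a1b2c3d4e5f", "continent": "Asia",   "time_of_day": "day",   "vehicle": "car"},
    {"video_id": "f5e4d3c2b1a", "continent": "Asia",   "time_of_day": "night", "vehicle": "car"},
    {"video_id": "0123456789a", "continent": "Europe", "time_of_day": "day",   "vehicle": "bus"},
]

def make_splits(records, key):
    """Partition segment records into evaluation splits by one metadata key."""
    splits = defaultdict(list)
    for rec in records:
        splits[rec[key]].append(rec["video_id"])
    return dict(splits)

by_continent = make_splits(segments, "continent")
print(by_continent["Asia"])    # ['a1b2c3d4e5f', 'f5e4d3c2b1a']
print(by_continent["Europe"])  # ['0123456789a']
```

The same function applied with `"time_of_day"` or `"vehicle"` yields the other two split axes the review mentions.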

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the geographic spread holds, models trained on CROWD may reduce performance gaps in regions underrepresented in current driving datasets.
  • The exclusion of crashes narrows the dataset's direct use for safety-critical detection but makes it complementary to incident-focused collections.
  • Future extensions could add weather or road-condition labels to the existing segment metadata without altering the core release format.

Load-bearing premise

The manual curation process from available YouTube videos produces a representative sample of routine urban driving without significant geographic or content-selection biases.

What would settle it

A breakdown showing that more than 70 percent of the segments originate from five or fewer countries, or that a substantial fraction contains edited or incident-focused content, would falsify the claim of broad routine coverage.
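That check is mechanical once per-segment country labels are in hand. A sketch of the proposed falsification test on toy data (the country mix is invented for illustration):

```python
from collections import Counter

def top_k_share(countries, k=5):
    """Fraction of segments contributed by the k most frequent countries."""
    counts = Counter(countries)
    top = sum(n for _, n in counts.most_common(k))
    return top / len(countries)

# Toy metadata: 18 segments across 10 countries.
countries = (["US"] * 4 + ["JP"] * 3 + ["DE"] * 2 + ["FR"] * 2 + ["IN"] * 2
             + ["KE", "BR", "NG", "VN", "PE"])
share = top_k_share(countries, k=5)
print(round(share, 3))  # 0.722
print(share > 0.70)     # True -> broad-coverage claim would be falsified here
```

Run over the actual release metadata, a result above 0.70 would settle the question one way; a result well below it, the other.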

Figures

Figures reproduced from arXiv: 2604.01044 by Md Shadab Alam, Olena Bazilinska, Pavlo Bazilinskyy.

Figure 1: Geographic coverage of CROWD across 7,103 localities worldwide.
Figure 2: Upload date distribution of YouTube videos contributing at …
Original abstract

We introduce CROWD (City Road Observations With Dashcams), a manually curated dataset of ordinary, minute scale, temporally contiguous, unedited, front facing urban dashcam segments screened and segmented from publicly available YouTube videos. CROWD is designed to support cross-domain robustness and interaction analysis by prioritising routine driving and explicitly excluding crashes, crash aftermath, and other edited or incident-focused content. The release contains 51,753 segment records spanning 20,275.56 hours (42,032 videos), covering 7,103 named inhabited places in 238 countries and territories across all six inhabited continents (Africa, Asia, Europe, North America, South America and Oceania), with segment level manual labels for time of day (day or night) and vehicle type. To lower the barrier for benchmarking, we provide per-segment CSV files of machine-generated detections for all 80 MS-COCO classes produced with YOLOv11x, together with segment-local multi-object tracks (BoT-SORT); e.g. person, bicycle, motorcycle, car, bus, truck, traffic light, stop sign, etc. CROWD is distributed as video identifiers with segment boundaries and derived annotations, enabling reproducible research without redistributing the underlying videos.
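The per-segment CSV files of detections described in the abstract can be consumed with nothing beyond the standard library. A sketch, assuming hypothetical column names (the release's documented schema may differ):

```python
import csv
import io

# Hypothetical per-segment detection CSV; column names are illustrative.
raw = """frame,class_name,confidence,x1,y1,x2,y2,track_id
0,car,0.91,100,200,180,260,1
0,person,0.78,300,210,320,270,2
1,car,0.93,104,201,184,262,1
"""

def detections_by_class(text, wanted):
    """Load machine-generated detections for one segment and keep one class."""
    rows = csv.DictReader(io.StringIO(text))
    return [r for r in rows if r["class_name"] == wanted]

cars = detections_by_class(raw, "car")
print(len(cars))                       # 2
print({r["track_id"] for r in cars})   # {'1'}
```

Grouping rows by `track_id` recovers the segment-local BoT-SORT tracks; filtering by `class_name` covers any of the 80 MS-COCO classes.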

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces CROWD (City Road Observations With Dashcams), a manually curated dataset of 51,753 temporally contiguous, unedited urban dashcam segments extracted from publicly available YouTube videos. The release spans 20,275.56 hours across 7,103 named places in 238 countries and territories, with manual labels for time of day and vehicle type plus pre-computed YOLOv11x detections and BoT-SORT tracks for all 80 MS-COCO classes. The dataset is positioned to support cross-domain robustness and interaction analysis by prioritizing routine driving and excluding crashes or edited content; it is distributed via video identifiers and segment boundaries rather than raw video files.

Significance. If the curation criteria and coverage claims hold, the dataset fills a notable gap by supplying a large-scale, globally distributed collection of ordinary urban driving footage with derived annotations that lower the barrier to benchmarking. The emphasis on minute-scale contiguous segments and explicit exclusion of incident content distinguishes it from existing dashcam resources and could enable more representative evaluations of robustness across geographies and conditions.

major comments (1)
  1. [Abstract] Abstract and dataset description: the central claim that the 51,753 segments constitute a representative sample of routine urban driving across 238 countries rests on the manual screening and segmentation process from YouTube; however, no quantitative validation (e.g., normalized coverage statistics by country population, urban density, or video-upload demographics) is provided to assess potential selection biases from video availability or curator decisions.
minor comments (2)
  1. [Dataset Release] Dataset release section: a summary table or supplementary figure showing segment counts per continent, top countries, and time-of-day split would improve transparency of the claimed global coverage.
  2. [Annotations] Annotations paragraph: clarify the exact criteria and any inter-annotator agreement metrics used for the manual time-of-day and vehicle-type labels to allow users to gauge label reliability.
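The summary table requested in the first minor comment is cheap to produce from segment metadata; a sketch with toy records standing in for the real release:

```python
from collections import Counter

# Toy (continent, time_of_day) pairs standing in for the release metadata.
segments = [
    ("Asia", "day"), ("Asia", "night"), ("Asia", "day"),
    ("Europe", "day"), ("Africa", "night"),
]

per_continent = Counter(c for c, _ in segments)
per_tod = Counter(t for _, t in segments)

print(dict(per_continent))  # {'Asia': 3, 'Europe': 1, 'Africa': 1}
print(dict(per_tod))        # {'day': 3, 'night': 2}
```

`Counter.most_common()` on the country field would give the "top countries" column the referee asks for.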

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their positive assessment and recommendation for minor revision. We address the single major comment below.

Point-by-point responses
  1. Referee: [Abstract] Abstract and dataset description: the central claim that the 51,753 segments constitute a representative sample of routine urban driving across 238 countries rests on the manual screening and segmentation process from YouTube; however, no quantitative validation (e.g., normalized coverage statistics by country population, urban density, or video-upload demographics) is provided to assess potential selection biases from video availability or curator decisions.

    Authors: We agree that the manuscript should more precisely distinguish broad geographic coverage from statistical representativeness. The dataset is assembled from publicly available YouTube videos, which carry inherent biases in uploader demographics, regional upload rates, and content popularity; our manual screening adds an additional layer of selection. We did not supply normalized coverage statistics (e.g., segments per capita or per urban km) because no compatible external benchmark of total routine urban driving footage per country exists. In the revised version we will (1) edit the abstract and introduction to state explicitly that CROWD provides extensive but not necessarily representative coverage across 238 countries, and (2) add a short subsection under Limitations that discusses these YouTube-derived biases and advises users on appropriate interpretation. These changes clarify scope without requiring new external data. revision: yes

Circularity Check

0 steps flagged

No circularity: dataset release paper with purely descriptive claims

full rationale

This is a data-release paper introducing the CROWD dataset of dashcam segments curated from YouTube videos. The central claims concern the existence, scale, coverage, and annotation properties of the released data (51,753 segments, 20,275 hours, 238 countries, manual labels for time-of-day and vehicle type, plus YOLO detections). No mathematical derivations, equations, predictions, fitted parameters, or uniqueness theorems appear. No self-citations are used to justify load-bearing premises, and the curation process is presented as an empirical fact rather than derived from prior results. The paper is self-contained against external benchmarks; any concerns about geographic or selection bias belong to correctness or representativeness, not circularity in a derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a dataset introduction paper containing no free parameters, mathematical axioms, or postulated entities; the contribution rests entirely on the curation process and annotations described.

pith-pipeline@v0.9.0 · 5520 in / 1020 out tokens · 54428 ms · 2026-05-13T22:27:29.719878+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · 3 internal anchors
