A City-Scale Dataset of Traffic Flows, Travel Times, and Urban Context
Pith reviewed 2026-05-20 23:41 UTC · model grok-4.3
The pith
A Padua traffic dataset integrates AVI flows, travel times, and urban context over three months.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a multi-source traffic dataset derived from Automatic Vehicle Identification recordings in Padua, Italy, spanning from February 2026 to April 2026. The dataset combines traffic volume time series, aggregated at 10-minute intervals, with time-varying trajectory-based flow statistics including transition probability matrices, average travel times, and flow residuals. To enrich the traffic measurements with urban contextual information, we integrate Points of Interest, demographic data, meteorological variables, and road infrastructure data. All components are accessible through a Python class that loads temporal and contextual data using a spatio-temporal graph representation.
What carries the argument
A Python class that loads temporal and contextual data through a spatio-temporal graph representation integrating traffic volumes, flow statistics, and urban context layers.
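To make the idea of a single spatio-temporal data object concrete, here is a minimal sketch of such a container. It is not the authors' class: the name SpatioTemporalGraph, its fields, and its methods are hypothetical, and the values are toy data chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class SpatioTemporalGraph:
    """Hypothetical container; NOT the class shipped with the dataset."""

    start: datetime                      # timestamp of the first time bin
    step: timedelta                      # bin width (10 minutes in the paper)
    adjacency: dict[str, set[str]] = field(default_factory=dict)
    volumes: dict[str, list[int]] = field(default_factory=dict)
    context: dict[str, dict[str, float]] = field(default_factory=dict)

    def neighbors(self, node: str) -> set[str]:
        """Spatial query: nodes adjacent to `node`."""
        return self.adjacency.get(node, set())

    def time_index(self, when: datetime) -> int:
        """Map a timestamp to the index of the bin that contains it."""
        return (when - self.start) // self.step

    def volume_at(self, node: str, when: datetime) -> int:
        """Temporal query: volume at one node in one bin."""
        return self.volumes[node][self.time_index(when)]

    def snapshot(self, when: datetime) -> dict[str, int]:
        """Volumes of all nodes in the bin that contains `when`."""
        i = self.time_index(when)
        return {node: series[i] for node, series in self.volumes.items()}


# Toy data: two sensors, three 10-minute bins, one static context feature.
g = SpatioTemporalGraph(
    start=datetime(2026, 2, 1, 0, 0),
    step=timedelta(minutes=10),
    adjacency={"A": {"B"}, "B": {"A"}},
    volumes={"A": [12, 15, 9], "B": [7, 8, 11]},
    context={"A": {"poi_count": 4.0}, "B": {"poi_count": 1.0}},
)
```

Keeping adjacency, time series, and static context in one object is what lets a spatial query and a temporal query share the same node identifiers.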
If this is right
- Analyses can directly link specific points of interest and demographic profiles to observed transition probabilities and travel times.
- Models of congestion can incorporate meteorological and infrastructure variables as time-varying covariates.
- The 10-minute resolution supports studies of short-term flow residuals and their relation to daily routines.
- The graph structure allows consistent querying of both spatial adjacency and temporal evolution within the same data object.
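The 10-minute resolution mentioned above can be illustrated with a short sketch that bins raw detections into per-sensor counts. The input layout (pairs of sensor id and timestamp) and the function names are assumptions made for this example; the paper's actual preprocessing may differ.

```python
from collections import Counter
from datetime import datetime, timedelta

BIN = timedelta(minutes=10)


def floor_to_bin(ts: datetime, width: timedelta = BIN) -> datetime:
    """Round a timestamp down to the start of its bin (bins aligned to midnight)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + ((ts - midnight) // width) * width


def bin_volumes(detections):
    """Count detections per (sensor_id, bin_start).

    `detections` is an iterable of (sensor_id, timestamp) pairs;
    this layout is hypothetical, not the dataset's schema.
    """
    counts = Counter()
    for sensor_id, ts in detections:
        counts[(sensor_id, floor_to_bin(ts))] += 1
    return counts


detections = [
    ("A", datetime(2026, 2, 2, 8, 3)),
    ("A", datetime(2026, 2, 2, 8, 9, 59)),
    ("A", datetime(2026, 2, 2, 8, 10)),
    ("B", datetime(2026, 2, 2, 8, 3)),
]
counts = bin_volumes(detections)
```

A detection at exactly 08:10:00 falls in the bin that starts at 08:10, so bins are half-open intervals closed on the left.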
Where Pith is reading between the lines
- Planners in other mid-sized European cities could adapt the same multi-source fusion approach to create comparable local datasets.
- Machine-learning predictors trained on the integrated graph might generalize better than models using traffic counts alone.
- Longer-term extensions could test whether the same validation patterns hold when the dataset is updated with post-2026 records.
Load-bearing premise
The Automatic Vehicle Identification recordings and derived flow statistics accurately represent overall city traffic without major coverage gaps or sensor biases that would distort the reported volumes, transition matrices, and travel times.
What would settle it
Independent vehicle counts collected by manual observers or alternative sensors at a sample of road segments during rush hours, compared against the dataset's reported volumes and flows, would show large systematic discrepancies if the recordings fail to represent city-wide traffic.
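One way to quantify such a comparison is sketched below: a signed mean relative error (systematic bias) and a mean absolute percentage error over the road segments that have both a dataset volume and an independent count. The function name and the dictionary layout are assumptions for illustration, and the numbers are invented toy values, not measurements.

```python
def compare_counts(dataset, reference):
    """Compare dataset volumes with independent counts.

    Both arguments map a segment id to a vehicle count for the same
    time window. Returns (bias, mape), where bias is the mean signed
    relative error and mape the mean absolute relative error, computed
    over segments present in both inputs with a positive reference.
    """
    keys = [k for k in dataset if k in reference and reference[k] > 0]
    if not keys:
        raise ValueError("no overlapping segments with a positive reference count")
    rel = [(dataset[k] - reference[k]) / reference[k] for k in keys]
    bias = sum(rel) / len(rel)
    mape = sum(abs(r) for r in rel) / len(rel)
    return bias, mape


# Toy values only: s3 has no reference count and s4 has no dataset volume.
dataset = {"s1": 90, "s2": 110, "s3": 50}
reference = {"s1": 100, "s2": 100, "s4": 10}
bias, mape = compare_counts(dataset, reference)
```

A bias close to zero combined with a large MAPE would point to noisy but unbiased coverage, while a bias far from zero would indicate the kind of systematic discrepancy described above.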
Original abstract
We present a multi-source traffic dataset derived from Automatic Vehicle Identification (AVI) recordings in Padua, Italy, spanning from February 2026 to April 2026. The dataset combines traffic volume time series, aggregated at 10-minute intervals, with time-varying trajectory-based flow statistics including transition probability matrices, average travel times, and flow residuals. To enrich the traffic measurements with urban contextual information, we integrate Points of Interest (POIs), demographic data, meteorological variables, and road infrastructure data. All components are accessible through a Python class that loads temporal and contextual data exploiting a spatio-temporal graph representation. Validation analyses confirm that the dataset captures expected traffic patterns, such as morning and evening rush hours, as well as weekdays vs. weekend days traffic routines.
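The descriptive validation mentioned in the abstract (rush hours, weekday versus weekend routines) amounts to comparing mean volume by hour of day across day types. The sketch below shows one plausible way to compute such profiles; the function name and input layout are assumptions, and the sample values are invented.

```python
from collections import defaultdict
from datetime import datetime


def hourly_profiles(series):
    """Mean volume per (day_type, hour).

    `series` is an iterable of (timestamp, volume) pairs; day_type is
    "weekend" for Saturday and Sunday and "weekday" otherwise.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ts, volume in series:
        day_type = "weekend" if ts.weekday() >= 5 else "weekday"
        totals[(day_type, ts.hour)] += volume
        counts[(day_type, ts.hour)] += 1
    return {key: totals[key] / counts[key] for key in totals}


sample = [
    (datetime(2026, 2, 2, 8, 0), 120),   # Monday
    (datetime(2026, 2, 2, 8, 10), 140),  # Monday
    (datetime(2026, 2, 7, 8, 0), 40),    # Saturday
]
profiles = hourly_profiles(sample)
```

Plotting the weekday and weekend profiles side by side is the usual way to check for morning and evening peaks by eye.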
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a multi-source traffic dataset for Padua, Italy, derived from Automatic Vehicle Identification (AVI) recordings spanning February to April 2026. It provides 10-minute aggregated traffic volume time series along with trajectory-based statistics including transition probability matrices, average travel times, and flow residuals. These are integrated with contextual layers consisting of Points of Interest, demographic data, meteorological variables, and road infrastructure. Access is facilitated by a Python class that exploits a spatio-temporal graph representation. Validation analyses are reported to reproduce expected traffic patterns such as morning and evening rush hours as well as weekday versus weekend differences.
Significance. If the AVI-derived statistics accurately represent city-wide traffic, the dataset would constitute a useful open resource for urban mobility research by combining high-resolution flow data with rich contextual information and a convenient graph-based access interface. The reproduction of standard diurnal and weekly patterns provides basic evidence of utility for exploratory studies. However, the absence of coverage and bias metrics limits the strength of the city-scale claim and reduces immediate applicability for quantitative modeling.
major comments (1)
- Validation section (and abstract): the reported analyses confirm expected rush-hour and weekday-weekend patterns but supply no quantitative metrics, error bars, sensor coverage fractions, or comparisons against independent counts. This directly affects the central claim that the dataset furnishes a reliable city-scale representation, because unquantified gaps or biases in AVI recordings would systematically distort volumes, transition matrices, and travel times even while preserving high-level temporal signatures.
minor comments (1)
- Abstract and data-access description: the term 'flow residuals' is introduced without a concise definition or formula; adding one sentence would improve clarity for readers unfamiliar with the derivation.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the validation of our dataset. We address the major comment below.
-
Referee: Validation section (and abstract): the reported analyses confirm expected rush-hour and weekday-weekend patterns but supply no quantitative metrics, error bars, sensor coverage fractions, or comparisons against independent counts. This directly affects the central claim that the dataset furnishes a reliable city-scale representation, because unquantified gaps or biases in AVI recordings would systematically distort volumes, transition matrices, and travel times even while preserving high-level temporal signatures.
Authors: We agree that the validation analyses in the current manuscript are primarily descriptive and do not include quantitative metrics, error bars, sensor coverage fractions, or comparisons to independent counts. This limitation does weaken the strength of the city-scale representation claim, as unquantified biases in the AVI data could affect the derived statistics. In the revised manuscript we will add a dedicated subsection to the validation section reporting sensor coverage (number and spatial distribution of AVI points, estimated fraction of total network traffic captured based on road class), along with basic quantitative descriptors such as standard deviations and coefficients of variation for the 10-minute flow time series during peak periods. We will also update the abstract to reference these additions and more explicitly note the data-source limitations. However, we do not have access to independent traffic counts from other sensors or manual surveys, so direct bias comparisons against external benchmarks cannot be performed.
revision: partial
- Direct quantitative comparisons against independent traffic counts from external sources, as no such auxiliary datasets were available to the authors.
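The descriptors promised in the rebuttal (standard deviation and coefficient of variation of the 10-minute flows during peak periods) could be computed along the following lines. This is a sketch under stated assumptions: the peak window of 07:00 to 09:59 and the use of the population standard deviation are choices made here, not values taken from the paper.

```python
from datetime import datetime
from statistics import mean, pstdev


def peak_variability(series, peak_hours=range(7, 10)):
    """Return (mean, std, cv) of volumes whose timestamp falls in peak_hours.

    `series` is an iterable of (timestamp, volume) pairs. The default
    peak window (hours 7, 8, 9) is an assumption for illustration.
    """
    values = [v for ts, v in series if ts.hour in peak_hours]
    if not values:
        raise ValueError("no observations inside the peak window")
    m = mean(values)
    s = pstdev(values)  # population standard deviation
    cv = s / m if m else float("nan")
    return m, s, cv


series = [
    (datetime(2026, 2, 2, 8, 0), 10),
    (datetime(2026, 2, 2, 8, 10), 20),
    (datetime(2026, 2, 2, 12, 0), 100),  # outside the peak window, ignored
]
m, s, cv = peak_variability(series)
```

The coefficient of variation is dimensionless, which makes it easier to compare variability across sensors with very different volumes.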
Circularity Check
No circularity: dataset release with no derivations or predictions
This paper is a data release describing the collection and packaging of AVI-derived traffic volumes, transition matrices, travel times, and urban context variables for Padua. No mathematical derivations, fitted parameters, predictions, or self-referential computations are claimed or present. Validation consists only of confirming expected diurnal and weekly patterns at a descriptive level, which does not reduce to any input by construction or rely on self-citation chains. The work is self-contained as an empirical dataset contribution without load-bearing steps that could be circular.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: AVI sensor recordings provide representative samples of vehicle movements across the monitored road network
- domain assumption: Integration of POI, demographic, meteorological, and infrastructure data can be performed without introducing systematic alignment errors
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
Paper passage: Validation analyses confirm that the dataset captures expected traffic patterns, such as morning and evening rush hours, as well as weekdays vs. weekend days traffic routines.
-
IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
Paper passage: Transition probability matrix $p_\tau(s_i, s_j) = C_\tau(s_i, s_j) / O_\tau(s_i)$
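The transition probability quoted in the passage above divides a pairwise count by a per-origin total for a time window. The sketch below reads C as the number of observed transitions from one sensor to another and O as the total outflow of the origin sensor; that reading, the function name, and the fallback of using row sums when no outflow is supplied are all assumptions, since the page does not define the symbols.

```python
def transition_probabilities(counts, outflow=None):
    """Compute p(origin, dest) = C(origin, dest) / O(origin) for one time window.

    counts:  {(origin, dest): C} observed transitions in the window.
    outflow: {origin: O} total departures per origin. If omitted, O is
             taken as the row sum of `counts` (an assumption of this sketch).
    Origins with zero outflow are skipped to avoid division by zero.
    """
    if outflow is None:
        outflow = {}
        for (origin, _dest), c in counts.items():
            outflow[origin] = outflow.get(origin, 0) + c
    probs = {}
    for (origin, dest), c in counts.items():
        total = outflow.get(origin, 0)
        if total > 0:
            probs[(origin, dest)] = c / total
    return probs


# Toy counts for a single window.
counts = {("A", "B"): 30, ("A", "C"): 10, ("B", "A"): 5}
p_rows = transition_probabilities(counts)
p_given = transition_probabilities(counts, outflow={"A": 50, "B": 5})
```

With row sums as the denominator each origin's probabilities add up to one; with an externally supplied outflow they can add up to less, the remainder being vehicles that were not seen again at any sensor.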
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Author Contributions
R. Cappi conducted the analyses, developing the code for data collection, pre-processing, a...