Towards Inclusive Mobility Modeling: Characterizing and Evaluating Elderly Trajectory Patterns in Urban Systems
Pith reviewed 2026-07-01 05:56 UTC · model grok-4.3
The pith
Models trained on the full Citi Bike population overestimate elderly step length by 4.5 percent and dwell time by 8.9 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Elderly riders exhibit structurally distinct mobility signatures with localized activity spaces of 958 m versus 1,189 m for young riders, lower mobility entropy of 1.82 versus 4.15, and asymmetric off-peak temporal patterns. Models trained on majority-dominated data systematically misrepresent elderly behavior on spatial metrics; the Markov model trained on the full population overestimates elderly step length by 4.5 percent and dwell time by 8.9 percent, whereas the elderly-specific model achieves substantially lower errors across most metrics. Higher-capability models do not necessarily improve subgroup-level fidelity when demographic data remain limited.
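The review does not say how the paper defines activity space or mobility entropy. As a rough sketch only, the block below assumes the activity space is a radius of gyration in metres and the entropy is a Shannon entropy (in bits) over station visit frequencies. The function names and the input layout are hypothetical, not taken from the paper.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, metres


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def radius_of_gyration_m(points):
    """Assumed activity-space measure: RMS distance of visits from their centroid.

    The centroid is a plain mean of latitudes and longitudes, which is only
    a reasonable approximation over a city-sized area.
    """
    n = len(points)
    centroid = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    return math.sqrt(sum(haversine_m(p, centroid) ** 2 for p in points) / n)


def visit_entropy_bits(station_ids):
    """Assumed mobility entropy: Shannon entropy of station visit frequencies."""
    counts = Counter(station_ids)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under these assumptions, a rider who keeps returning to a few nearby stations has both a small radius and a low entropy, which is the pattern the review reports for elderly riders.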
What carries the argument
A comparison of a first-order Markov chain and a Qwen3-4B model fine-tuned with QLoRA, each trained under three demographic regimes (full population, young riders only, elderly riders only) and evaluated on fidelity to held-out elderly trajectory metrics.
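To make the three training regimes concrete, here is a minimal sketch of a first-order Markov chain fitted separately to each regime. It assumes a trajectory is an ordered list of station identifiers. The toy data, group labels, and function names are illustrative and do not come from the paper.

```python
import random
from collections import Counter, defaultdict


def fit_markov(trajectories):
    """Estimate first-order transition probabilities from station sequences."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for cur, nxt in zip(traj, traj[1:]):
            counts[cur][nxt] += 1
    return {
        state: {nxt: c / sum(nxts.values()) for nxt, c in nxts.items()}
        for state, nxts in counts.items()
    }


def sample_trajectory(model, start, length, rng):
    """Sample a synthetic sequence; stops early at a state with no outgoing data."""
    traj = [start]
    while len(traj) < length and traj[-1] in model:
        nxts = model[traj[-1]]
        traj.append(rng.choices(list(nxts), weights=list(nxts.values()))[0])
    return traj


# Hypothetical toy trajectories, grouped by an assumed demographic label.
toy = {
    "young": [["A", "B", "C", "A"], ["A", "C", "D"]],
    "elderly": [["A", "B", "A", "B"]],
}
regimes = {
    "full": toy["young"] + toy["elderly"],
    "young": toy["young"],
    "elderly": toy["elderly"],
}
models = {name: fit_markov(trajs) for name, trajs in regimes.items()}
synthetic = sample_trajectory(models["elderly"], "A", 4, random.Random(0))
```

In this toy case the elderly-only model only ever moves between A and B, while the full-population model also sends some probability mass from A to C. That is a small-scale version of the dilution effect the review describes.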
If this is right
- Relying on majority-dominated training data produces biased synthetic trajectories for elderly mobility.
- Elderly-specific training yields lower errors on step length, dwell time, and entropy metrics.
- Higher model capacity alone does not guarantee better fidelity for underrepresented subgroups when demographic samples are small.
- Demographic segmentation improves representation in downstream urban planning applications.
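The overestimation figures quoted above read as signed relative errors of a synthetic metric against the real elderly value, although the review does not state the exact formula. The sketch below assumes that definition. The metric names and all numbers are made up, chosen only so that the output reproduces the 4.5 percent and 8.9 percent figures.

```python
def signed_relative_error_pct(synthetic_mean, real_mean):
    """Positive values mean the synthetic data overestimates the real metric."""
    if real_mean == 0:
        raise ValueError("real_mean must be non-zero")
    return 100.0 * (synthetic_mean - real_mean) / real_mean


def error_table(real_metrics, synthetic_by_regime):
    """Signed relative error per training regime and per metric."""
    return {
        regime: {
            name: signed_relative_error_pct(value, real_metrics[name])
            for name, value in metrics.items()
        }
        for regime, metrics in synthetic_by_regime.items()
    }


# Hypothetical values, not results from the paper.
real_elderly = {"step_length_m": 1000.0, "dwell_time_min": 20.0}
synthetic = {
    "full": {"step_length_m": 1045.0, "dwell_time_min": 21.78},
    "elderly": {"step_length_m": 1005.0, "dwell_time_min": 20.2},
}
table = error_table(real_elderly, synthetic)
```

Keeping the sign makes it possible to tell overestimation from underestimation, which an absolute error would hide.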
Where Pith is reading between the lines
- Similar segmentation biases likely appear in other public transit or ride-hailing datasets that lack age labels.
- Collecting explicit age or proxy variables at the point of data capture would allow direct validation of the segmentation method.
- Urban simulation tools used for infrastructure planning could incorporate separate elderly modules to reduce planning errors for aging populations.
Load-bearing premise
That elderly riders can be reliably segmented from the Citi Bike dataset without recorded ages and that observed pattern differences are caused by age rather than trip purpose, location, or data artifacts.
What would settle it
A dataset containing verified rider ages that shows no statistically significant difference in spatial metric errors between full-population and elderly-only trained models when both are tested on the same elderly trajectories.
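One way to run such a check, offered as a sketch rather than the paper's procedure, is a paired sign-flip permutation test. It assumes a per-trajectory error is available for each model on the same held-out elderly trajectories. The function name and its parameters are hypothetical.

```python
import random
from statistics import fmean


def paired_sign_flip_test(errors_a, errors_b, n_perm=10_000, seed=0):
    """Two-sided p-value for a mean paired difference of zero.

    errors_a and errors_b are errors of two models on the same trajectories,
    in the same order. Under the null hypothesis the sign of each paired
    difference is exchangeable, so signs are flipped at random.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    observed = abs(fmean(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(fmean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

A large p-value on age-verified data would be the outcome described above. A small one would support the claim that the training regime matters.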
Original abstract
The rapid advance of smart cities increasingly depends on trajectory data mining, yet underrepresented demographic groups, particularly the elderly, are often sparsely represented in public mobility datasets. This underrepresentation can introduce systematic bias into mobility modeling and downstream urban planning. Using the 2016-2020 Jersey City subset of the Citi Bike System Data, this study quantitatively examines how the absence of underrepresented subgroups' mobility signatures affects mobility modeling, using synthetic trajectory generation as a case study. The analysis reveals that elderly riders exhibit a structurally distinct mobility signature, including localized activity spaces (958 m vs. 1,189 m for young riders), lower mobility entropy (1.82 vs. 4.15), and asymmetric off-peak temporal patterns. To demonstrate that relying on majority-dominated training data yields biased synthetic outcomes, we further evaluate both a first-order Markov chain and a Qwen3-4B model fine-tuned with QLoRA across three demographic training settings: the full population, young riders only, and elderly riders only. Results show that models trained on majority-dominated populations systematically misrepresent elderly mobility behavior, particularly for spatial mobility metrics. The Markov model trained on the full population overestimates elderly step length by 4.5% and dwell time by 8.9%, whereas the elderly-specific model achieves substantially lower errors across most metrics. Comparisons between the Markov and LLM-based frameworks further show that higher-capability models do not necessarily improve subgroup-level fidelity under limited demographic data. These findings underscore the importance of demographic representation in mobility modeling and its downstream applications for underrepresented populations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that elderly riders in the 2016-2020 Jersey City Citi Bike dataset exhibit mobility signatures distinct from those of young riders (activity space 958 m vs. 1,189 m; entropy 1.82 vs. 4.15). It further claims that Markov chain and QLoRA-fine-tuned Qwen3-4B models trained on full-population or young-only data systematically bias synthetic elderly trajectories (e.g., the full-population Markov model overestimates step length by 4.5% and dwell time by 8.9%), while elderly-only training reduces errors. It concludes that demographic underrepresentation introduces bias in mobility modeling.
Significance. If the elderly segmentation is reliable, the concrete numeric gaps and cross-model comparisons provide evidence that majority-dominated training harms subgroup fidelity, with implications for equitable urban systems. The dual use of a simple Markov baseline and an LLM-based generator, plus held-out real elderly trajectories as benchmark, strengthens the evaluation design.
major comments (2)
- [Abstract; Data and Methods section] The method for identifying and segmenting 'elderly' trips within the Citi Bike dataset (which records only start/end stations, duration, and timestamps, with no age or demographic fields) is never described. This is load-bearing for the central claim in the Abstract and for all evaluation results: any unstated proxy (temporal/spatial patterns or external linkage) may capture confounders such as residential location or trip purpose rather than age, so the reported differences (958 m vs. 1,189 m activity space, 4.5% and 8.9% errors) cannot be shown to be age-driven.
- [Evaluation and Results section] No statistical significance tests, confidence intervals, or details on preprocessing/baseline selection are provided for the metric differences or model errors (e.g., step-length and dwell-time biases in the Markov full-population setting). This undermines verification that the observed gaps are systematic rather than artifacts of how the elderly held-out set was constructed.
minor comments (1)
- [Abstract] Model name 'Qwen3-4B' should be cross-checked against official releases for accuracy in the methods description.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will incorporate revisions to strengthen the manuscript.
- Referee: [Abstract; Data and Methods section] The method for identifying and segmenting 'elderly' trips within the Citi Bike dataset (which records only start/end stations, duration, and timestamps, with no age or demographic fields) is never described. This is load-bearing for the central claim in the Abstract and for all evaluation results: any unstated proxy (temporal/spatial patterns or external linkage) may capture confounders such as residential location or trip purpose rather than age, so the reported differences (958 m vs. 1,189 m activity space, 4.5% and 8.9% errors) cannot be shown to be age-driven.
  Authors: We agree that the segmentation procedure is central to the claims and acknowledge that its description was omitted from the Data and Methods section in the current manuscript. In the revision we will insert a complete account of how elderly trips were identified (including any external linkage or proxy rules), together with an explicit discussion of possible confounders and evidence that the observed mobility differences are attributable to age rather than location or purpose. This addition will directly address the load-bearing concern.
  Revision: yes
- Referee: [Evaluation and Results section] No statistical significance tests, confidence intervals, or details on preprocessing/baseline selection are provided for the metric differences or model errors (e.g., step-length and dwell-time biases in the Markov full-population setting). This undermines verification that the observed gaps are systematic rather than artifacts of how the elderly held-out set was constructed.
  Authors: We accept that the absence of statistical tests, confidence intervals, and preprocessing details weakens verifiability. The revised manuscript will add appropriate significance tests (e.g., t-tests or non-parametric equivalents) and 95% confidence intervals for all reported metric differences and model errors. We will also expand the Evaluation section with a full description of the preprocessing pipeline and the criteria used to select the Markov and LLM baselines.
  Revision: yes
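As an illustration of the kind of interval the rebuttal promises, the sketch below computes a percentile bootstrap confidence interval for a difference in group means, for example elderly versus young activity space. It is one possible approach under stated assumptions, not the authors' method. The function name and the defaults are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_ci_mean_diff(a, b, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b), with independent groups."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement at its original size.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(fmean(resample_a) - fmean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

If the resulting interval excludes zero, the reported gap is unlikely to be a sampling artifact. It says nothing about whether the gap is caused by age, which is the separate concern raised in the first comment.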
Circularity Check
No significant circularity; evaluation uses held-out external benchmarks.
The paper trains Markov and QLoRA models on different demographic subsets of the Citi Bike data and evaluates synthetic trajectory metrics (step length, dwell time, activity space, entropy) against held-out real elderly trajectories. This supplies an independent test set rather than deriving errors from parameters fitted on the evaluation data itself. No equations, self-citations, or ansatzes are shown to reduce the reported performance gaps to tautological inputs or prior author results. The segmentation of elderly trips, while potentially reliant on proxies, is a data-processing choice whose validity is external to the derivation chain and does not create self-definitional or fitted-input circularity in the modeling results.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Elderly riders can be identified and isolated within the Citi Bike trip records.
- domain assumption: Differences in activity space and entropy are attributable to age rather than location, purpose, or sampling bias.