From Time Series to State: Situation-Aware Modeling for Air Traffic Flow Prediction
Pith reviewed 2026-05-10 16:14 UTC · model grok-4.3
The pith
Modeling airspace as a dynamic set of aircraft states directly improves air traffic flow predictions over time series methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AeroSense represents the terminal airspace situation as a dynamic set of real-time aircraft states, processes them with masked self-attention to capture inter-aircraft interactions, and applies decoupled heads to model heterogeneous flow dynamics, yielding state-of-the-art predictive performance on real airport data compared with time series baselines.
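To make the "decoupled heads" idea concrete, here is a minimal sketch, not the paper's implementation. It assumes the encoder output is summarized by a masked mean over the real aircraft and that each functional area gets its own linear head. The pooling choice, the sizes `D` and `H`, and the names `head_a` and `head_b` are all hypothetical.

```python
import numpy as np

def masked_mean_pool(states, mask):
    """Average the embeddings of real aircraft, ignoring padded rows."""
    weights = mask.astype(float)[:, None]          # (N, 1)
    count = max(weights.sum(), 1.0)                # avoid division by zero
    return (states * weights).sum(axis=0) / count  # (D,)

def decoupled_heads(pooled, head_a, head_b):
    """Apply two independent linear heads to one shared summary vector."""
    w_a, b_a = head_a
    w_b, b_b = head_b
    return pooled @ w_a + b_a, pooled @ w_b + b_b

rng = np.random.default_rng(0)
D, H = 4, 3  # hypothetical embedding size and forecast horizon
embeddings = rng.normal(size=(5, D))  # 5 slots, the last 2 are padding
mask = np.array([True, True, True, False, False])
head_a = (rng.normal(size=(D, H)), np.zeros(H))  # hypothetical head, area A
head_b = (rng.normal(size=(D, H)), np.zeros(H))  # hypothetical head, area B

pooled = masked_mean_pool(embeddings, mask)
flow_a, flow_b = decoupled_heads(pooled, head_a, head_b)
```

The two heads share the pooled input but no parameters, so each area's flow dynamics can be fitted separately.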
What carries the argument
The situation-aware state representation that ingests a variable number of microscopic aircraft states and applies masked self-attention to model their interactions without prior time-series aggregation.
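As an illustration of how masked self-attention can handle a variable number of aircraft, the sketch below pads the set to a fixed number of slots and hides the padded slots from the softmax. It is a generic single-head scaled dot-product attention, assumed here for illustration. The paper's actual layer layout, head count, and projection sizes are not given on this page, and the code assumes at least one real aircraft is present.

```python
import numpy as np

def masked_self_attention(x, mask, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a padded set.

    x:    (N, D) aircraft state embeddings, padded to N slots
    mask: (N,) boolean, True where the slot holds a real aircraft
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])            # (N, N)
    scores = np.where(mask[None, :], scores, -np.inf)  # hide padded keys
    scores -= scores.max(axis=-1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    out = (weights @ v) * mask[:, None]                # zero padded queries
    return out, weights

rng = np.random.default_rng(0)
D = 4  # hypothetical embedding size
w_q, w_k, w_v = (rng.normal(size=(D, D)) for _ in range(3))
x = rng.normal(size=(5, D))  # 5 slots, the last 2 are padding
mask = np.array([True, True, True, False, False])
out, weights = masked_self_attention(x, mask, w_q, w_k, w_v)
```

Because padded keys receive zero weight, whatever values sit in the padded slots cannot influence the outputs for real aircraft.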
If this is right
- Direct state modeling produces substantially higher predictive fidelity than time series baselines.
- The framework shows greater robustness during peak traffic periods.
- It achieves Pareto-optimal results under dayparting multi-objective evaluation.
- Attention weights supply interpretable visualizations of which aircraft interactions influence flows.
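The Pareto-optimality claim can be read as follows: a model is on the Pareto front if no other model is at least as good in every daypart and strictly better in at least one. The sketch below assumes one error value per daypart with lower being better. The model names and numbers are made up for illustration and are not results from the paper.

```python
def dominates(a, b):
    """True if error vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(scores):
    """Return the names of models that no other model dominates."""
    return sorted(
        name
        for name, errs in scores.items()
        if not any(
            dominates(other, errs)
            for other_name, other in scores.items()
            if other_name != name
        )
    )

# Hypothetical per-daypart errors (morning, midday, evening); illustrative only.
scores = {
    "model_a": (2.0, 3.0, 2.5),
    "model_b": (2.5, 2.8, 2.6),
    "model_c": (2.6, 3.1, 2.7),
}
front = pareto_front(scores)
```

In this toy example `model_c` is dominated by `model_a`, while `model_a` and `model_b` each win in a different daypart and so both remain on the front.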
Where Pith is reading between the lines
- The variable-set input format could transfer to other domains that track moving entities, such as road traffic or maritime vessel flows.
- Attention maps might be used to flag emerging congestion zones in real time for operational decisions.
- Extending the state representation to include simple kinematic constraints could further reduce reliance on learned patterns alone.
Load-bearing premise
That a dynamic collection of real-time aircraft states together with masked self-attention is enough to capture all relevant flow dynamics without explicit temporal aggregation or extra domain features.
What would settle it
A controlled test on the same airport dataset where adding explicit time-series aggregation or additional hand-crafted features produces measurably higher accuracy than the state-only AeroSense model.
Original abstract
Accurate air traffic prediction in the terminal airspace (TA) is pivotal for proactive air traffic management (ATM). However, existing data-driven approaches predominantly rely on time series-based forecasting paradigms, which inherently overlook critical aircraft state information, such as real-time kinematics and proximity to airspace boundaries. To address this limitation, we propose AeroSense, a direct state-to-flow modeling framework for air traffic prediction. Unlike classical time series-based methods that first aggregate aircraft trajectories into macroscopic flow sequences before modeling, AeroSense explicitly represents the real-time airspace situation as a dynamic set of aircraft states, enabling the direct processing of a variable number of aircraft instead of time series as inputs. Specifically, we introduce a situation-aware state representation that enables AeroSense to sense the instantaneous terminal airspace situation directly from microscopic aircraft states. Furthermore, we design a model architecture that incorporates masked self-attention to capture inter-aircraft interactions, together with two decoupled prediction heads to model heterogeneous flow dynamics across two key functional areas of the TA. Extensive experiments on a large-scale real-world airport dataset demonstrate that AeroSense consistently achieves state-of-the-art performance, validating that direct modeling of microscopic aircraft states yields substantially higher predictive fidelity than time series-based baselines. Moreover, the proposed framework exhibits superior robustness during peak traffic periods, achieves Pareto-optimal performance under dayparting multi-objective evaluation, and provides meaningful interpretability through attention-based visualizations.
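The contrast the abstract draws between the two input paradigms can be sketched in a few lines. The first function aggregates boundary-crossing events into per-bin counts, which discards per-aircraft detail. The second keeps one feature vector per aircraft and pads the set to a fixed size with a mask. The bin width, the three-feature layout, and the function names are assumptions for illustration, not the paper's definitions.

```python
from collections import Counter

def flow_series(crossing_times_s, bin_s, n_bins):
    """Time-series view: count boundary crossings per fixed time bin."""
    counts = Counter(int(t // bin_s) for t in crossing_times_s)
    return [counts.get(i, 0) for i in range(n_bins)]

def state_set(snapshot, max_aircraft, n_features):
    """Set view: keep per-aircraft feature vectors, padded to a fixed size.

    Aircraft beyond max_aircraft are dropped in this simple sketch.
    """
    rows = [list(s) for s in snapshot[:max_aircraft]]
    mask = [True] * len(rows)
    while len(rows) < max_aircraft:
        rows.append([0.0] * n_features)  # padding slot
        mask.append(False)
    return rows, mask

# Hypothetical crossing times in seconds, aggregated into 15-minute bins.
series = flow_series([30, 400, 950, 1000, 2600], bin_s=900, n_bins=3)

# Hypothetical snapshot: (x_km, y_km, ground_speed_mps) per aircraft.
snapshot = [(12.0, -3.5, 140.0), (40.2, 8.1, 210.0)]
rows, mask = state_set(snapshot, max_aircraft=4, n_features=3)
```

The series keeps only how many aircraft crossed in each bin, whereas the padded set preserves where each aircraft is and how fast it is moving at the moment of prediction.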
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes AeroSense, a direct state-to-flow modeling framework for air traffic flow prediction in terminal airspace (TA). It represents the airspace as a dynamic set of real-time aircraft states (including kinematics and boundary proximity) rather than aggregated time series, processes variable numbers of aircraft via masked self-attention to capture inter-aircraft interactions, and uses two decoupled prediction heads for heterogeneous flow dynamics in different TA functional areas. The central claim is that extensive experiments on a large-scale real-world airport dataset show AeroSense achieving state-of-the-art performance with superior robustness during peak traffic periods, Pareto-optimal results under dayparting multi-objective evaluation, and improved interpretability via attention visualizations.
Significance. If the experimental comparisons hold, the work offers a meaningful paradigm shift in air traffic management by showing that explicit modeling of microscopic aircraft states can yield higher predictive fidelity than classical time-series aggregation. Strengths include the situation-aware state representation that directly encodes kinematics and boundary effects, the use of masked self-attention for dynamic inter-aircraft relations, and the decoupled heads that address heterogeneous TA regions. The application to a real-world dataset and the reported robustness in peak periods add practical value. The approach is internally consistent with no hidden circularity in the modeling assumptions.
major comments (1)
- [Experimental Evaluation] Experimental section (results and evaluation): The manuscript asserts that AeroSense 'consistently achieves state-of-the-art performance' and exhibits 'superior robustness' but supplies no concrete quantitative metrics (e.g., MAE/RMSE values or improvement margins), baseline model details, statistical significance tests, or ablation results. This leaves the central performance claim resting on unverified assertions rather than verifiable evidence.
minor comments (2)
- [Abstract] Abstract: The description of the dataset as 'large-scale real-world airport dataset' would benefit from additional specificity (e.g., airport name, time span, or number of flights) to allow readers to assess generalizability.
- [Method] Notation: The situation-aware state representation is described clearly in the method but could include an explicit equation or pseudocode for the input feature vector to improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the thorough review and the recommendation for major revision. The feedback on the experimental evaluation is well-taken and has prompted us to strengthen the manuscript by making the quantitative results fully explicit and verifiable. We have revised the paper accordingly and believe the changes directly address the concern while preserving the core contributions.
- Referee: [Experimental Evaluation] Experimental section (results and evaluation): The manuscript asserts that AeroSense 'consistently achieves state-of-the-art performance' and exhibits 'superior robustness' but supplies no concrete quantitative metrics (e.g., MAE/RMSE values or improvement margins), baseline model details, statistical significance tests, or ablation results. This leaves the central performance claim resting on unverified assertions rather than verifiable evidence.
  Authors: We agree that the central performance claims must be supported by explicit, verifiable numbers rather than assertions alone. While the original submission presented results via figures and tables, the main text did not quote the key MAE/RMSE values, improvement margins, or provide sufficient methodological detail on baselines and statistical testing. In the revised manuscript we have added explicit reporting of all MAE and RMSE values (including absolute numbers and relative improvements over each baseline), a complete description of the baseline models and their hyper-parameters, results of statistical significance tests (paired t-tests with p-values reported for all comparisons), and a full set of ablation studies isolating the contributions of the state representation, masked self-attention, and decoupled prediction heads. These additions appear in the expanded Section 4 and updated Tables 2–4, ensuring every claim is now directly traceable to concrete evidence.
  Revision: yes
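For readers who want to see what the quantities named in this exchange look like in practice, the sketch below computes MAE, RMSE, and a paired t statistic on per-sample absolute errors using only the standard library. The data are made up and the helper names are hypothetical. Converting the statistic to a p-value needs a t-distribution CDF (for example from a statistics package), which is left out here.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def paired_t_statistic(errors_a, errors_b):
    """t statistic for the mean of paired differences (n - 1 degrees of freedom).

    Raises ZeroDivisionError if all differences are identical.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Illustrative flow counts only; not data from the paper.
y_true = [10, 12, 9, 15]
pred_a = [11, 12, 8, 14]  # hypothetical model A
pred_b = [13, 10, 9, 18]  # hypothetical model B

abs_err_a = [abs(t - p) for t, p in zip(y_true, pred_a)]
abs_err_b = [abs(t - p) for t, p in zip(y_true, pred_b)]
t_stat = paired_t_statistic(abs_err_a, abs_err_b)
```

A negative statistic here means model A's absolute errors are smaller on average. Whether that difference is significant depends on the sample size and the resulting p-value.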
Circularity Check
No significant circularity; central claim rests on external experimental validation
The paper's core contribution is a modeling framework (AeroSense) that represents airspace as a dynamic set of aircraft states processed via masked self-attention and decoupled heads, explicitly contrasted with time-series aggregation baselines. Validation is provided solely by comparative experiments on a held-out real-world dataset, with no equations or derivations that reduce the claimed performance gains to fitted parameters, self-definitions, or prior self-citations. The situation-aware representation and attention mechanism are presented as architectural choices motivated by domain limitations of time-series methods, not as quantities derived from the target predictions themselves. No load-bearing step equates the output fidelity metric to an input by construction.
Axiom & Free-Parameter Ledger
invented entities (1)
- AeroSense framework: no independent evidence