Graph-based Complexity Forecasts in UK En Route Airspace Using Relevant Aircraft Interactions
Pith reviewed 2026-05-25 05:05 UTC · model grok-4.3
The pith
A graph of standardized route legs plus probabilistic occupancy forecasts predicts controller workload via relevant aircraft pair counts up to 45 minutes ahead.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors construct a graph representation of the London Middle Sector route network and model the probability that each aircraft occupies particular route segments at future times, accounting for arrival time uncertainty. Combined with an adapted filter that identifies relevant aircraft pairs requiring controller intervention, this produces forecasts of the number of such pairs up to 45 minutes ahead. On test data the forecasts correlate with observed relevant interactions at Spearman's ρ = 0.68, exceeding the 0.55 correlation of a baseline traffic-volume predictor.
What carries the argument
Graph representation of standardized route legs combined with probabilistic occupancy predictions that feed an adapted relevant-pair filter algorithm.
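The occupancy step can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the authors' model: each aircraft is assumed to have a planned entry and exit time for a route segment, and a single zero-mean Gaussian timing error is assumed to shift both times. The function names, the `sigma_s` parameter, and the independence assumption behind `expected_pairs` are hypothetical. The paper also filters pairs for relevance and uses historic interaction distributions, both of which are omitted here.

```python
import math
from itertools import combinations


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def occupancy_probability(entry_s: float, exit_s: float,
                          query_s: float, sigma_s: float) -> float:
    """Probability that an aircraft is on a segment at query_s.

    Assumption (hypothetical): one Gaussian timing error with standard
    deviation sigma_s shifts the planned entry and exit times together.
    """
    if sigma_s <= 0:
        # No uncertainty: occupancy is a simple interval check.
        return 1.0 if entry_s <= query_s < exit_s else 0.0
    return (normal_cdf((query_s - entry_s) / sigma_s)
            - normal_cdf((query_s - exit_s) / sigma_s))


def expected_pairs(probabilities: list[float]) -> float:
    """Expected number of aircraft pairs sharing a segment.

    Assumption (hypothetical): occupancies are independent, so each
    pair contributes the product of its two probabilities.
    """
    return sum(p * q for p, q in combinations(probabilities, 2))


# Illustrative planned (entry, exit) times in seconds for one segment.
planned = [(600.0, 900.0), (700.0, 1000.0), (2000.0, 2300.0)]
probs = [occupancy_probability(a, b, query_s=800.0, sigma_s=60.0)
         for a, b in planned]
print(probs, expected_pairs(probs))
```

Summing such expectations over segments at each query time would give a crude pair-count forecast. A single shared timing error is only one of several plausible ways to model arrival-time uncertainty.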
If this is right
- Forecasts available 45 minutes ahead allow earlier sector configuration and rostering adjustments.
- Integration with a live operational data stream supports deployment for group supervisors.
- Higher correlation with actual interactions reduces mismatch between predicted and experienced workload.
- The method extends to other multi-flow airspace sectors after similar filter adaptation.
Where Pith is reading between the lines
- If the pair count reliably tracks workload, it could reduce reliance on purely manual supervisor assessments.
- Adding weather or aircraft performance data to the occupancy model might improve forecast accuracy.
- Testing whether the improved pair detection leads to fewer sector overload events would measure operational value.
Load-bearing premise
The number of relevant aircraft pairs identified by the adapted filter serves as a reliable proxy for actual controller workload.
What would settle it
Running the forecasts against a larger collection of real traffic scenarios with direct workload ratings or intervention logs beyond the original 50 labelled cases.
Original abstract
Effectively managing Air Traffic Control Officer (ATCO) workload is crucial in maintaining operational safety. Group supervisors use tools that estimate upcoming traffic load to aid decision-making. However, industry-standard models can fail to capture the nuances of upcoming air traffic complexity. This study presents a probabilistic approach to forecast the complexity of an airspace sector using the number of relevant aircraft pairs, i.e., those that require monitoring or deconfliction by a controller, as a proxy measure for ATCO workload. We adapted an existing filter algorithm to make it suitable for use in London Middle Sector (LMS), a complex airspace sector with multiple flows of traffic above some of the busiest airports in Europe. Through iterative feedback with ATCOs, the algorithm was refined and extended to handle specific geometric and operational considerations. The updated algorithm outperformed the original, with an F1-score of 0.84 compared to 0.69 on a labelled set of 50 traffic scenarios. To produce forecasts of future numbers of relevant aircraft pairs in the sector, a graph representation of the LMS route network was constructed, standardising the spatial fidelity of route legs. The forecasting method accounts for uncertainty in aircraft arrival times by modelling the probability of each aircraft occupying route segments at future query times. When combined with historic distributions of relevant interactions and a live operational data stream, predictions of upcoming ATCO workload could be made up to 45 minutes in advance. The proposed method to forecast upcoming workload showed a significantly stronger correlation with actual relevant interactions (Spearman's $\rho = 0.68$) than a standard traffic volume prediction ($\rho = 0.55$). The resulting data-driven tool shows promise for use by group supervisors to inform sector configuration and ATCO rostering decisions.
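The filter evaluation quoted in the abstract can be illustrated with a short sketch of how an F1-score might be computed for one traffic scenario. It assumes that both the filter output and the ATCO labels are sets of unordered aircraft pairs. The callsigns and the helper name `pair_f1` are hypothetical, and the abstract does not say how scores were aggregated across the 50 scenarios.

```python
def pair_f1(predicted: set, labelled: set) -> float:
    """F1-score of predicted relevant pairs against labelled pairs."""
    true_pos = len(predicted & labelled)
    false_pos = len(predicted - labelled)
    false_neg = len(labelled - predicted)
    if true_pos == 0:
        # Precision and recall are both zero or undefined.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical scenario: pairs are frozensets so order does not matter.
labelled = {frozenset({"AAA1", "BBB2"}),
            frozenset({"AAA1", "CCC3"}),
            frozenset({"BBB2", "DDD4"})}
predicted = {frozenset({"BBB2", "AAA1"}),
             frozenset({"AAA1", "CCC3"}),
             frozenset({"CCC3", "DDD4"})}
print(pair_f1(predicted, labelled))
```

Here two of the three predicted pairs match the labels, so precision and recall are both 2/3.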
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a graph-based probabilistic forecasting method for ATCO workload in London Middle Sector airspace. It adapts an existing filter to identify 'relevant' aircraft pairs (those requiring monitoring or deconfliction) as a workload proxy, achieving F1=0.84 on 50 labelled scenarios after ATCO feedback. A route-network graph with standardized legs is combined with live data and historic distributions to model future pair counts up to 45 min ahead via probabilistic occupancy of segments. The method yields Spearman's ρ=0.68 against held-out relevant-pair counts, outperforming a traffic-volume baseline (ρ=0.55).
Significance. If the relevant-pair count is shown to be a valid workload proxy, the graph-plus-probability approach would offer a measurable improvement over volume-based forecasts for sector-configuration decisions. The use of ATCO-tuned filtering and explicit uncertainty modelling in arrival times are concrete strengths; however, the absence of any quantitative link between pair counts and independent workload indicators (self-report, physiological, or sector-complexity scores) limits the operational claim.
major comments (3)
- [Abstract / proxy section] Abstract and proxy-definition section: the central operational claim—that the forecast improves workload prediction—rests on treating the count of relevant pairs as a workload proxy, yet the only quantitative evidence is the F1=0.84 filter performance on 50 scenarios; no correlation or regression against any independent workload measure (ATCO ratings, sector complexity index, or physiological data) is reported. This is load-bearing for the utility conclusion.
- [Results / correlation analysis] Results paragraph reporting ρ=0.68 vs 0.55: the headline correlation is computed between the graph forecast and the same relevant-pair count that the model is trained to predict; because this is an internal consistency check rather than an external workload validation, the improvement does not yet demonstrate better workload forecasting. No error bars, bootstrap intervals, or test of difference between the two ρ values are supplied.
- [Forecasting method] Forecasting-method section: the probability model for aircraft occupancy of route segments at future query times is described at a high level but the explicit equations, parameter estimation procedure, and handling of arrival-time uncertainty are not derived or referenced, preventing verification of the claimed 45-minute horizon performance.
minor comments (2)
- [Methods / validation set] The selection criteria and labelling protocol for the 50 traffic scenarios are not described (random sample? stratified by traffic density? inter-rater agreement?), which affects reproducibility of the F1 result.
- [Abstract / results] The statement 'significantly stronger correlation' is used without a statistical test or p-value; this should be qualified or removed.
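The interval and difference test requested in the comments above could take the form of a paired bootstrap, sketched below. This is a hedged illustration rather than the analysis the authors ran: the function names are hypothetical, and resampling individual forecast windows independently is an assumption. Consecutive windows are likely autocorrelated, in which case a block bootstrap may be more appropriate.

```python
import math
import random


def average_ranks(values):
    """Ranks starting at 1, with tied values given their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # simplification: correlation is undefined here
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_rho_difference(observed, forecast_a, forecast_b,
                             n_resamples=2000, alpha=0.05, seed=0):
    """Percentile interval for rho(observed, a) - rho(observed, b).

    The same resampled indices are used for both forecasts, so the
    comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(observed)
    diffs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        obs = [observed[i] for i in idx]
        a = [forecast_a[i] for i in idx]
        b = [forecast_b[i] for i in idx]
        diffs.append(spearman(obs, a) - spearman(obs, b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper
```

An interval for the difference that excludes zero would support describing one correlation as stronger than the other; without such a test the word "significantly" is hard to justify.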
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive report. The comments highlight important distinctions between proxy validation and direct workload measurement, as well as the need for greater methodological transparency. We address each major comment below and will revise the manuscript to improve clarity and rigor while acknowledging the inherent limitations of the current study.
- Referee: [Abstract / proxy section] Major comment 1 above: the relevant-pair count is treated as a workload proxy, yet no correlation or regression against any independent workload measure is reported.
  Authors: We agree that the manuscript relies on relevant-pair count as a proxy without reporting direct correlations to independent workload indicators such as ATCO self-ratings or physiological measures. The proxy is supported by iterative ATCO feedback that refined the filter to F1=0.84 on 50 labelled scenarios. In revision we will explicitly qualify the abstract, introduction, and discussion to state that the proxy is expert-validated rather than directly validated against workload metrics, and we will note the absence of such correlations as a limitation and avenue for future work.
  Revision: yes
- Referee: [Results / correlation analysis] Major comment 2 above: the ρ=0.68 vs 0.55 comparison is an internal consistency check against the proxy itself, and no error bars, bootstrap intervals, or test of the difference between the two ρ values are supplied.
  Authors: We accept that the reported Spearman correlations evaluate predictive accuracy against the proxy variable itself and therefore constitute an internal consistency check. We will revise the results section to clarify this point, add bootstrap confidence intervals for both ρ values, and include a statistical test of the difference between the two correlations.
  Revision: yes
- Referee: [Forecasting method] Major comment 3 above: the occupancy probability model is described only at a high level, without explicit equations, a parameter estimation procedure, or a stated treatment of arrival-time uncertainty.
  Authors: We will expand the forecasting-method section to derive the explicit equations for the probabilistic occupancy model, detail the parameter estimation from historic distributions, and describe the treatment of arrival-time uncertainty. These additions will allow verification of the 45-minute forecast capability.
  Revision: yes
- The current dataset does not contain independent workload measures (ATCO ratings, physiological recordings, or sector-complexity scores), so a quantitative correlation between relevant-pair counts and such measures cannot be supplied without new data collection.
Circularity Check
No circularity: forecasting derivation is independent of inputs
The paper builds a graph model of the LMS route network, models probabilistic occupancy of route segments from arrival-time uncertainty, and combines this with historic distributions of relevant interactions plus live data to produce forecasts of pair counts up to 45 min ahead. The reported Spearman correlation (ρ=0.68) is computed between these model forecasts and observed pair counts obtained by applying the (separately tuned) filter to held-out operational traffic; this evaluation is not forced by the forecast parameters themselves. The filter tuning (F1=0.84 on 50 scenarios) occurs upstream and does not redefine the forecast target. No self-definitional equations, fitted inputs renamed as predictions, or load-bearing self-citations appear in the chain.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The count of relevant aircraft pairs is a valid proxy measure for ATCO workload.