arxiv: 2605.23696 · v1 · pith:EWHV7SUWnew · submitted 2026-05-22 · 💻 cs.LG

Graph-based Complexity Forecasts in UK En Route Airspace Using Relevant Aircraft Interactions

Pith reviewed 2026-05-25 05:05 UTC · model grok-4.3

classification 💻 cs.LG
keywords air traffic control · workload forecasting · graph-based modeling · airspace complexity · probabilistic prediction · relevant aircraft interactions · sector management

The pith

A graph of standardized route legs plus probabilistic occupancy forecasts predicts the number of relevant aircraft pairs, used as a proxy for controller workload, up to 45 minutes ahead.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that counting pairs of aircraft that actually require monitoring or deconfliction gives a better signal of upcoming air traffic control workload than simply counting total aircraft. It adapts a filter algorithm for a busy UK sector through controller feedback, then builds a graph model of the route network to handle uncertainty in arrival times and produce forecasts. These forecasts correlate more strongly with observed relevant interactions than a standard volume-based prediction. If the approach works, group supervisors could make earlier and more accurate decisions about splitting sectors or adjusting staffing to maintain safety.

Core claim

The authors construct a graph representation of the London Middle Sector (LMS) route network and model the probability that each aircraft occupies particular route segments at future times, accounting for arrival time uncertainty. Combined with an adapted filter that identifies relevant aircraft pairs, i.e. those requiring monitoring or deconfliction by a controller, and with historic distributions of relevant interactions, this produces forecasts of the number of such pairs up to 45 minutes ahead. On test data the forecasts correlate with observed relevant interactions at Spearman's ρ = 0.68, exceeding the 0.55 correlation of a baseline traffic-volume predictor.
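The occupancy idea can be illustrated with a minimal sketch. The paper's actual equations are not reproduced on this page, so everything below is an assumption made for illustration: a Gaussian entry-time error, fixed leg transit times, independence between aircraft, and a hypothetical lookup table `relevance` giving the historic probability that aircraft on two given legs form a relevant pair. The function names are likewise hypothetical.

```python
from itertools import combinations
from statistics import NormalDist


def leg_occupancy_probs(eta_entry_s, leg_durations_s, sigma_s, query_time_s):
    """Probability that one aircraft is on each leg of its route at query_time_s.

    Assumption (not from the paper): the entry time is Gaussian with mean
    eta_entry_s and standard deviation sigma_s > 0, and leg transit times are
    fixed, so the uncertainty is a common time shift along the whole route.
    """
    entry = NormalDist(mu=eta_entry_s, sigma=sigma_s)
    probs = []
    offset = 0.0  # time from entry to the start of the current leg
    for duration in leg_durations_s:
        # On this leg iff  t - offset - duration < entry_time <= t - offset.
        upper = entry.cdf(query_time_s - offset)
        lower = entry.cdf(query_time_s - offset - duration)
        probs.append(upper - lower)
        offset += duration
    return probs


def expected_relevant_pairs(occupancy, relevance):
    """Expected number of relevant pairs at one query time.

    occupancy: {aircraft_id: {leg_id: probability}}
    relevance: {(leg_a, leg_b): probability that such a pair is relevant}
               (hypothetical stand-in for the historic distributions).
    Aircraft are assumed independent of one another.
    """
    total = 0.0
    for a, b in combinations(sorted(occupancy), 2):
        for leg_a, p_a in occupancy[a].items():
            for leg_b, p_b in occupancy[b].items():
                # The relevance table may store a leg pair in either order.
                key = (leg_a, leg_b) if (leg_a, leg_b) in relevance else (leg_b, leg_a)
                total += p_a * p_b * relevance.get(key, 0.0)
    return total
```

Summing probabilities rather than thresholding them keeps the arrival-time uncertainty in the forecast: an aircraft that may or may not have reached a crossing leg contributes a fractional pair instead of being counted as fully present or absent.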

What carries the argument

Graph representation of standardized route legs combined with probabilistic occupancy predictions that feed an adapted relevant-pair filter algorithm.
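One plausible reading of "standardized route legs" is that long legs between waypoints are subdivided so that every graph edge has a comparable length. The sketch below shows that reading only; the planar kilometre coordinates, the `target_km` parameter, and the node-naming scheme are all illustrative assumptions rather than details taken from the paper.

```python
import math


def split_leg(start, end, target_km):
    """Split a straight leg into equal sub-legs no longer than target_km.

    start, end: (x_km, y_km) points in an assumed planar projection.
    Returns the list of points along the leg, including both endpoints.
    """
    length = math.dist(start, end)
    n = max(1, math.ceil(length / target_km))
    return [
        (start[0] + (end[0] - start[0]) * i / n,
         start[1] + (end[1] - start[1]) * i / n)
        for i in range(n + 1)
    ]


def standardise_graph(waypoints, legs, target_km):
    """Build a directed adjacency dict with intermediate nodes inserted.

    waypoints: {name: (x_km, y_km)}; legs: [(from_name, to_name)].
    Returns (graph, coords), where graph maps node -> list of successors.
    """
    graph = {name: [] for name in waypoints}
    coords = dict(waypoints)
    for src, dst in legs:
        points = split_leg(waypoints[src], waypoints[dst], target_km)
        # Hypothetical naming scheme for the inserted intermediate nodes.
        inner = [f"{src}-{dst}#{i}" for i in range(1, len(points) - 1)]
        names = [src] + inner + [dst]
        for name, point in zip(names, points):
            coords.setdefault(name, point)
            graph.setdefault(name, [])
        for a, b in zip(names, names[1:]):
            graph[a].append(b)
    return graph, coords
```

With edges of similar length, an occupancy probability attached to an edge carries roughly the same spatial resolution everywhere in the sector, which is what makes per-edge probabilities comparable between aircraft.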

If this is right

  • Forecasts available 45 minutes ahead allow earlier sector configuration and rostering adjustments.
  • The live data stream integration supports operational deployment for group supervisors.
  • Higher correlation with actual interactions reduces mismatch between predicted and experienced workload.
  • The method extends to other multi-flow airspace sectors after similar filter adaptation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the pair count reliably tracks workload, it could reduce reliance on purely manual supervisor assessments.
  • Adding weather or aircraft performance data to the occupancy model might improve forecast accuracy.
  • Testing whether the improved pair detection leads to fewer sector overload events would measure operational value.

Load-bearing premise

The number of relevant aircraft pairs identified by the adapted filter serves as a reliable proxy for actual controller workload.

What would settle it

Running the forecasts against a larger collection of real traffic scenarios with direct workload ratings or intervention logs beyond the original 50 labelled cases.

Figures

Figures reproduced from arXiv: 2605.23696 by Edward Henderson, George De Ath, Nick Pepper.

Figure 1: Example showing the use of probabilistic TP in the …
Figure 2: Flowchart illustrating the outline of our updated relevant …
Figure 3: a) The original graph created from the routes between waypoints (blue nodes) aircraft can take through LMS, the outline of which …
Figure 4: An example of the procedure used to estimate the likelihood of an aircraft to occupy each leg along its route at a particular query …
Figure 5: Confusion matrices showing the performance of the filter of …
Figure 6: Example traffic sample from the set of 50 ATCO-labelled scenarios of LMS. The subject aircraft is highlighted in blue and the …
Figure 7: Plot showing the number of relevant pairs forecast 45 minutes in advance by our proposed method (orange) for a day in May 2025. …

Original abstract

Effectively managing Air Traffic Control Officer (ATCO) workload is crucial in maintaining operational safety. Group supervisors use tools that estimate upcoming traffic load to aid decision-making. However, industry-standard models can fail to capture the nuances of upcoming air traffic complexity. This study presents a probabilistic approach to forecast the complexity of an airspace sector using the number of relevant aircraft pairs, i.e., those that require monitoring or deconfliction by a controller, as a proxy measure for ATCO workload. We adapted an existing filter algorithm to make it suitable for use in London Middle Sector (LMS), a complex airspace sector with multiple flows of traffic above some of the busiest airports in Europe. Through iterative feedback with ATCOs, the algorithm was refined and extended to handle specific geometric and operational considerations. The updated algorithm outperformed the original, with an F1-score of 0.84 compared to 0.69 on a labelled set of 50 traffic scenarios. To produce forecasts of future numbers of relevant aircraft pairs in the sector, a graph representation of the LMS route network was constructed, standardising the spatial fidelity of route legs. The forecasting method accounts for uncertainty in aircraft arrival times by modelling the probability of each aircraft occupying route segments at future query times. When combined with historic distributions of relevant interactions and a live operational data stream, predictions of upcoming ATCO workload could be made up to 45 minutes in advance. The proposed method to forecast upcoming workload showed a significantly stronger correlation with actual relevant interactions (Spearman's $\rho = 0.68$) than a standard traffic volume prediction ($\rho = 0.55$). The resulting data-driven tool shows promise for use by group supervisors to inform sector configuration and ATCO rostering decisions.
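The F1-scores quoted in the abstract summarise how well the filter's relevant / not-relevant decisions agree with ATCO labels. As a reminder of what the metric measures, here is a minimal sketch; the function name is hypothetical and any labels passed to it in an example would be toy values, not data from the paper's 50 scenarios.

```python
def f1_score(true_labels, predicted_labels):
    """F1 for binary labels, where a truthy value marks a relevant pair."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)        # relevant, flagged
    fp = sum(1 for t, p in pairs if not t and p)    # not relevant, flagged
    fn = sum(1 for t, p in pairs if t and not p)    # relevant, missed
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because F1 ignores true negatives, it suits this task, where most aircraft pairs in a sector are presumably not relevant and plain accuracy would be dominated by them.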

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a graph-based probabilistic forecasting method for ATCO workload in London Middle Sector airspace. It adapts an existing filter to identify 'relevant' aircraft pairs (those requiring monitoring or deconfliction) as a workload proxy, achieving F1=0.84 on 50 labelled scenarios after ATCO feedback. A route-network graph with standardized legs is combined with live data and historic distributions to model future pair counts up to 45 min ahead via probabilistic occupancy of segments. The method yields Spearman's ρ=0.68 against held-out relevant-pair counts, outperforming a traffic-volume baseline (ρ=0.55).

Significance. If the relevant-pair count is shown to be a valid workload proxy, the graph-plus-probability approach would offer a measurable improvement over volume-based forecasts for sector-configuration decisions. The use of ATCO-tuned filtering and explicit uncertainty modelling in arrival times are concrete strengths; however, the absence of any quantitative link between pair counts and independent workload indicators (self-report, physiological, or sector-complexity scores) limits the operational claim.

major comments (3)
  1. [Abstract / proxy section] Abstract and proxy-definition section: the central operational claim—that the forecast improves workload prediction—rests on treating the count of relevant pairs as a workload proxy, yet the only quantitative evidence is the F1=0.84 filter performance on 50 scenarios; no correlation or regression against any independent workload measure (ATCO ratings, sector complexity index, or physiological data) is reported. This is load-bearing for the utility conclusion.
  2. [Results / correlation analysis] Results paragraph reporting ρ=0.68 vs 0.55: the headline correlation is computed between the graph forecast and the same relevant-pair count that the model is trained to predict; because this is an internal consistency check rather than an external workload validation, the improvement does not yet demonstrate better workload forecasting. No error bars, bootstrap intervals, or test of difference between the two ρ values are supplied.
  3. [Forecasting method] Forecasting-method section: the probability model for aircraft occupancy of route segments at future query times is described at a high level but the explicit equations, parameter estimation procedure, and handling of arrival-time uncertainty are not derived or referenced, preventing verification of the claimed 45-minute horizon performance.
minor comments (2)
  1. [Methods / validation set] The selection criteria and labelling protocol for the 50 traffic scenarios are not described (random sample? stratified by traffic density? inter-rater agreement?), which affects reproducibility of the F1 result.
  2. [Abstract / results] The statement 'significantly stronger correlation' is used without a statistical test or p-value; this should be qualified or removed.
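Major comment 2 and minor comment 2 both ask for uncertainty on the two correlations and a test of their difference. One way this could be done is a moving-block bootstrap of ρ(proposed) − ρ(baseline), which resamples contiguous blocks so that autocorrelation in the traffic time series is roughly preserved. The sketch below is an illustration under stated assumptions, not the authors' procedure: the function names, the block length, and the percentile interval are all choices made here.

```python
import random


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate (constant) sample: treated as 0 in this sketch
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def block_bootstrap_rho_diff(observed, forecast_a, forecast_b,
                             block_len, n_boot=2000, seed=0):
    """95% percentile interval for rho(observed, a) - rho(observed, b)."""
    rng = random.Random(seed)
    n = len(observed)
    starts = range(n - block_len + 1)
    diffs = []
    for _ in range(n_boot):
        idx = []
        while len(idx) < n:
            s = rng.choice(starts)
            idx.extend(range(s, s + block_len))
        idx = idx[:n]
        # The same resampled indices are used for all three series (paired).
        obs = [observed[i] for i in idx]
        a = [forecast_a[i] for i in idx]
        b = [forecast_b[i] for i in idx]
        diffs.append(spearman(obs, a) - spearman(obs, b))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]
```

If the resulting interval excludes zero, the claim of a stronger correlation has some statistical support; if it straddles zero, "significantly" would need to be dropped or qualified.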

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the thoughtful and constructive report. The comments highlight important distinctions between proxy validation and direct workload measurement, as well as the need for greater methodological transparency. We address each major comment below and will revise the manuscript to improve clarity and rigor while acknowledging the inherent limitations of the current study.

Point-by-point responses
  1. Referee: [Abstract / proxy section] Abstract and proxy-definition section: the central operational claim—that the forecast improves workload prediction—rests on treating the count of relevant pairs as a workload proxy, yet the only quantitative evidence is the F1=0.84 filter performance on 50 scenarios; no correlation or regression against any independent workload measure (ATCO ratings, sector complexity index, or physiological data) is reported. This is load-bearing for the utility conclusion.

    Authors: We agree that the manuscript relies on relevant-pair count as a proxy without reporting direct correlations to independent workload indicators such as ATCO self-ratings or physiological measures. The proxy is supported by iterative ATCO feedback that refined the filter to F1=0.84 on 50 labelled scenarios. In revision we will explicitly qualify the abstract, introduction, and discussion to state that the proxy is expert-validated rather than directly validated against workload metrics, and we will note the absence of such correlations as a limitation and avenue for future work. revision: yes

  2. Referee: [Results / correlation analysis] Results paragraph reporting ρ=0.68 vs 0.55: the headline correlation is computed between the graph forecast and the same relevant-pair count that the model is trained to predict; because this is an internal consistency check rather than an external workload validation, the improvement does not yet demonstrate better workload forecasting. No error bars, bootstrap intervals, or test of difference between the two ρ values are supplied.

    Authors: We accept that the reported Spearman correlations evaluate predictive accuracy against the proxy variable itself and therefore constitute an internal consistency check. We will revise the results section to clarify this point, add bootstrap confidence intervals for both ρ values, and include a statistical test of the difference between the two correlations. revision: yes

  3. Referee: [Forecasting method] Forecasting-method section: the probability model for aircraft occupancy of route segments at future query times is described at a high level but the explicit equations, parameter estimation procedure, and handling of arrival-time uncertainty are not derived or referenced, preventing verification of the claimed 45-minute horizon performance.

    Authors: We will expand the forecasting-method section to derive the explicit equations for the probabilistic occupancy model, detail the parameter estimation from historic distributions, and describe the treatment of arrival-time uncertainty. These additions will allow verification of the 45-minute forecast capability. revision: yes

standing simulated objections not resolved
  • The current dataset does not contain independent workload measures (ATCO ratings, physiological recordings, or sector-complexity scores), so a quantitative correlation between relevant-pair counts and such measures cannot be supplied without new data collection.

Circularity Check

0 steps flagged

No circularity: forecasting derivation is independent of inputs

Full rationale

The paper builds a graph model of the LMS route network, models probabilistic occupancy of route segments from arrival-time uncertainty, and combines this with historic distributions of relevant interactions plus live data to produce forecasts of pair counts up to 45 min ahead. The reported Spearman correlation (ρ=0.68) is computed between these model forecasts and observed pair counts obtained by applying the (separately tuned) filter to held-out operational traffic; this evaluation is not forced by the forecast parameters themselves. The filter tuning (F1=0.84 on 50 scenarios) occurs upstream and does not redefine the forecast target. No self-definitional equations, fitted inputs renamed as predictions, or load-bearing self-citations appear in the chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that relevant-pair count proxies workload and on the accuracy of the adapted identification algorithm; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • domain assumption: The count of relevant aircraft pairs is a valid proxy measure for ATCO workload.
    Explicitly stated in the abstract as the basis for the forecasting target.

pith-pipeline@v0.9.0 · 5848 in / 1211 out tokens · 22359 ms · 2026-05-25T05:05:01.612466+00:00 · methodology
