arxiv: 2606.11017 · v1 · pith:KVMWC5OSnew · submitted 2026-06-09 · 💻 cs.LG · eess.AS

Data-Driven Runway and Taxiway Exits Prediction of Landing Aircraft: A Case Study at Hartsfield-Jackson Atlanta International Airport

Pith reviewed 2026-06-27 13:34 UTC · model grok-4.3

classification 💻 cs.LG eess.AS
keywords runway exit prediction · taxiway routing · airport surface operations · machine learning · ASDE-X trajectories · gradient boosting · air traffic control decision support

The pith

A two-stage machine learning system predicts the runway exit and subsequent taxi routing chosen by landing aircraft at KATL.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a two-stage data-driven aid that first forecasts which runway exit an arriving aircraft will select and then, given the exit, whether the aircraft will cross the departure runway or take the end-around taxiway. Models are trained on ASDE-X surface trajectories together with aircraft characteristics, ramp destinations, short-horizon traffic rates, and weather. Gradient boosting classifiers reach 0.86-0.89 accuracy on exit prediction and 0.70-0.74 on routing prediction, with approach speed emerging as the dominant exit driver and traffic rates plus ramp destination as the main routing drivers. The framework is intended to supply calibrated, explainable information that supports controller situational awareness while leaving the final decision to the human.

Core claim

The authors claim that a two-stage classifier pipeline mirrors controller workflow at KATL: Stage I predicts runway exit selection and Stage II predicts crossing-versus-end-around routing. Trained on ASDE-X trajectories, aircraft type, ramp destination, traffic rates, and weather across multiple look-back windows, the pipeline reaches the reported accuracies (0.86-0.89 for Stage I, 0.70-0.74 for Stage II). It identifies approach speed as the leading exit predictor and departure rate, crossing rate, and ramp destination as the leading routing predictors.

What carries the argument

Two-stage gradient-boosting pipeline (XGBoost and LightGBM) that first classifies runway exit and then classifies crossing versus end-around routing given the exit.
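
One way to picture how the two stages fit together is the law of total probability: the chance of a runway crossing is the Stage II probability for each exit, weighted by the Stage I probability of that exit. The sketch below is an illustration under that assumption, not the authors' implementation; the paper does not say whether Stage II consumes a hard exit label or a probability vector. The function name, exit names, and numbers are all hypothetical.

```python
from typing import Dict


def marginal_crossing_probability(
    exit_probs: Dict[str, float],
    crossing_given_exit: Dict[str, float],
) -> float:
    """Combine Stage I and Stage II outputs by the law of total probability.

    exit_probs: hypothetical Stage I output, P(exit = e | features).
    crossing_given_exit: hypothetical Stage II output,
        P(crossing | exit = e, features).
    """
    total = sum(exit_probs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("Stage I probabilities must sum to 1")
    return sum(p * crossing_given_exit[e] for e, p in exit_probs.items())


# Illustrative numbers only; exit names and probabilities are made up.
stage1 = {"exit_A": 0.7, "exit_B": 0.2, "exit_C": 0.1}
stage2 = {"exit_A": 0.9, "exit_B": 0.4, "exit_C": 0.1}
p_cross = marginal_crossing_probability(stage1, stage2)
print(round(p_cross, 3))
```

Weighting by the full Stage I distribution, rather than committing to the single most likely exit, keeps Stage I uncertainty visible in the routing estimate.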

If this is right

  • Approach speed is the strongest single predictor of which runway exit an arriving aircraft selects.
  • Departure rate, crossing rate, ramp destination, and (in west flow) the selected exit are the strongest predictors of crossing versus end-around routing.
  • Minority routing classes remain harder to separate because of feature-space overlap visible in t-SNE and UMAP projections.
  • The resulting predictions are calibrated and can be presented to controllers as decision support while preserving human responsibility for the final routing.
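
The calibration claim rests on the Brier score and Expected Calibration Error named in the abstract. The following is a minimal sketch of both for a single binary event (for example, "crossing" versus "end-around"), assuming equal-width probability bins. The bin count and the toy data are assumptions, and the paper's multiclass Stage I may use a different variant.

```python
from typing import List, Sequence


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE: weighted mean of |observed rate - mean probability|."""
    bins: List[List[int]] = [[] for _ in range(n_bins)]
    for i, p in enumerate(probs):
        # A probability of exactly 1.0 is placed in the last bin.
        bins[min(int(p * n_bins), n_bins - 1)].append(i)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(probs[i] for i in members) / len(members)
        observed = sum(labels[i] for i in members) / len(members)
        ece += (len(members) / len(probs)) * abs(observed - mean_prob)
    return ece


# Toy data (made up): four confident "crossing" calls, one of them wrong.
probs = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1]
labels = [1, 1, 1, 0, 0, 0]
bs = brier_score(probs, labels)
ece = expected_calibration_error(probs, labels)
print(round(bs, 4), round(ece, 4))
```

Both quantities are zero for a perfectly calibrated and perfectly sharp forecaster; ECE isolates the calibration gap, while the Brier score also penalizes lack of sharpness.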

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same feature set and two-stage structure could be retrained at other high-volume airports to test whether approach speed and traffic-rate importance generalize.
  • Real-time deployment would require streaming updates to the short-horizon traffic-rate features; performance under live data latency remains untested in the paper.
  • Adding pilot intent signals or voice-communication features might reduce the residual overlap that limits minority-class performance.
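
To make the streaming concern concrete, a short-horizon traffic rate can be maintained incrementally with a trailing window. This is a sketch under assumed details: the paper does not enumerate its look-back window lengths, so `window_s` and the `RollingRate` class are hypothetical, and events are assumed to arrive in time order with timestamps no later than the query time.

```python
from collections import deque
from typing import Deque


class RollingRate:
    """Events per minute over a trailing look-back window of window_s seconds."""

    def __init__(self, window_s: float) -> None:
        if window_s <= 0:
            raise ValueError("window_s must be positive")
        self.window_s = window_s
        self._times: Deque[float] = deque()

    def add(self, t: float) -> None:
        """Record an event (e.g. a departure) at time t in seconds."""
        self._times.append(t)

    def rate_per_min(self, now: float) -> float:
        """Drop events at or before now - window_s, then return the rate."""
        while self._times and self._times[0] <= now - self.window_s:
            self._times.popleft()
        return len(self._times) * 60.0 / self.window_s


# Hypothetical 5-minute window with four departures.
departures = RollingRate(window_s=300)
for t in (0, 60, 120, 290):
    departures.add(t)
rate_at_300 = departures.rate_per_min(now=300)
rate_at_400 = departures.rate_per_min(now=400)
print(rate_at_300, rate_at_400)
```

Any delay between an event and its arrival in the feed shifts these counts, which is why live-data latency would need to be measured before deployment.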

Load-bearing premise

Historical patterns in ASDE-X trajectories, aircraft characteristics, ramp destinations, traffic rates, and weather will continue to determine future exit and routing choices without major unmodeled influences such as pilot communications.

What would settle it

Apply the published models to a fresh hold-out set of KATL arrivals collected after the training period and measure whether Stage I accuracy falls materially below 0.86 or Stage II accuracy falls materially below 0.70.
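
"Materially below" needs an operational definition. One reasonable reading, offered here as an assumption rather than the paper's protocol, is that the upper end of a confidence interval on hold-out accuracy falls under the reported floor. The sketch uses the Wilson score interval; the hold-out counts are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(correct: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if n <= 0 or not 0 <= correct <= n:
        raise ValueError("need n > 0 and 0 <= correct <= n")
    p = correct / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def falls_below(correct: int, n: int, floor: float) -> bool:
    """True when even the upper end of the interval sits under the floor."""
    _, upper = wilson_interval(correct, n)
    return upper < floor


# Hypothetical hold-out outcomes against the reported Stage I floor of 0.86.
print(falls_below(800, 1000, 0.86))  # observed accuracy 0.80
print(falls_below(850, 1000, 0.86))  # observed accuracy 0.85
```

Because accuracy is dominated by the majority class, the same check applied to macro-F1 (with a bootstrap interval) would be a stricter test of the minority exits.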

Figures

Figures reproduced from arXiv: 2606.11017 by Alex Porcayo, John-Paul Clarke, Maria Thomas, Yutian Pang.

Figure 1: KATL’s north complex (arrivals on 8L/26R, departures on 8R/26L) concentrates
Figure 2: PMF histograms of the taxiways taken right before reaching the destination
Figure 3: Schematic of study area and decision points on KATL’s north complex.
Figure 4: Layout of runway 8L/26R (arrivals) and 8R/26L (departures) and stage-wise
Figure 5: Simple Confusion Matrix
Figure 6: Multiclass Confusion Matrix
    4.4.3. Precision–Recall. The precision–recall curve plots precision (Equation (8)) against recall (identical to TPR) as the classification threshold varies. Unlike the ROC curve, the PR curve is sensitive to class prevalence, making it the more informative diagnostic when classes are heavily imbalanced.
    \[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{8} \]
Figure 7: ROC AUC Plots Stage I
Figure 8: Precision Recall Plots Stage I
Figure 9: Confusion Matrices Stage I
Figure 10: ROC AUC Plots Stage II
Figure 11: Precision Recall Plots Stage II
Figure 12: Confusion Matrices Stage II
Figure 13: SHAP Feature Importance
Figure 14: Visualization of the reduced 2D feature-space class overlap for Stage 1 runway
Figure 15: Stage 1 reliability diagrams. Predicted probabilities align closely with observed
Original abstract

Airport surface operations increasingly constrain performance at high-throughput hubs. This study examines arrival taxi-in decisions at Hartsfield-Jackson Atlanta International Airport (KATL) and proposes a two-stage, data-driven decision aid that mirrors controller workflow. Stage I predicts the runway exit selected by an arriving aircraft. Stage II predicts whether, given that exit, the aircraft will cross the active departure runway at a designated point or use the end-around taxiway. Models are trained using ASDE-X surface trajectories, aircraft characteristics, ramp destinations, short-horizon traffic rates, and weather across multiple look-back windows. We benchmark nine classifiers, including Random Forest, XGBoost, LightGBM, and CatBoost, and evaluate accuracy, macro-F1, precision-recall behavior, confusion matrices, Brier score, and Expected Calibration Error. Across east and west flows, XGBoost and LightGBM outperform Random Forest. Stage I achieves 0.86-0.89 accuracy with macro-F1 scores of 0.40-0.50, while Stage II achieves 0.70-0.74 accuracy with macro-F1 scores of 0.28-0.55. Feature-importance analysis shows that approach speed is the main driver of exit choice. Departure rate, crossing rate, ramp destination, and, for west flow, the selected exit are the strongest predictors of crossing versus end-around routing. Minority classes remain harder to predict because of feature-space overlap, as shown by t-SNE and UMAP analyses. The proposed framework supports controller situational awareness through calibrated, explainable predictions while preserving human responsibility for final routing decisions.
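
The gap between Stage I accuracy (0.86-0.89) and macro-F1 (0.40-0.50) is what class imbalance typically produces: accuracy rewards getting the dominant exit right, while macro-F1 averages every class equally. The sketch below illustrates the effect with a made-up label distribution and a degenerate classifier that always predicts the majority exit; none of the numbers come from the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over classes seen in either sequence."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    classes = set(y_true) | set(y_pred)
    f1s = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        f1s.append(2 * tp[c] / denom if denom else 0.0)
    return sum(f1s) / len(f1s)


# Made-up imbalance: 90 arrivals use exit A, 6 use B, 4 use C.
y_true = ["A"] * 90 + ["B"] * 6 + ["C"] * 4
y_pred = ["A"] * 100  # always predict the majority exit
acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(round(acc, 3), round(mf1, 3))
```

In this toy case the classifier scores 0.9 accuracy while learning nothing about the minority exits, which is why the per-class tables requested in the referee report matter.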

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a two-stage supervised learning framework to predict runway exit selection (Stage I) and subsequent crossing versus end-around routing (Stage II) for arriving aircraft at KATL. It trains and benchmarks nine classifiers (including XGBoost and LightGBM) on ASDE-X trajectories augmented with aircraft type, ramp destination, short-horizon traffic rates, and weather using multiple look-back windows, reporting accuracies of 0.86-0.89 (Stage I) and 0.70-0.74 (Stage II) together with macro-F1, Brier scores, calibration error, confusion matrices, and feature-importance rankings that identify approach speed as dominant for exits and traffic rates/ramp destination for routing.

Significance. If the reported metrics prove robust, the work supplies a practical, calibrated, and explainable decision-support tool that aligns with controller workflow while keeping final authority with the human. The explicit use of macro-F1, calibration diagnostics, and dimensionality-reduction diagnostics for minority-class overlap constitutes a methodological strength relative to many applied aviation-ML studies.

major comments (2)
  1. [Methods / Results (data partitioning and preprocessing)] The central performance claims (accuracy 0.86-0.89 / macro-F1 0.40-0.50 for Stage I; accuracy 0.70-0.74 / macro-F1 0.28-0.55 for Stage II) rest on the integrity of the train/test splits, the handling of class imbalance, and whether feature selection or hyper-parameter tuning was performed on the held-out set. These details are not verifiable from the provided text and directly affect whether the quoted numbers can be treated as unbiased estimates of operational performance.
  2. [Discussion / Feature-importance and visualization sections] The claim that the framework supports controller situational awareness assumes that the chosen feature set (ASDE-X trajectories, aircraft characteristics, ramp destinations, short-horizon rates, weather) sufficiently captures the exit and routing decision process. The reported t-SNE/UMAP overlap for minority classes already indicates that unmodeled factors (e.g., pilot-ATC voice exchanges or untracked dynamics) could further degrade macro-F1 or alter the reported feature rankings; no sensitivity analysis to omitted variables is presented.
minor comments (2)
  1. [Methods] Clarify the exact look-back window lengths and the rationale for their selection; these appear among the free parameters but are not enumerated in the abstract.
  2. [Results] Ensure that all nine classifiers are evaluated under identical preprocessing and that the reported macro-F1 values are accompanied by per-class precision-recall tables for the minority exits.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. Below we respond point-by-point to the major comments, clarifying methodological aspects where possible and indicating revisions that will strengthen the manuscript.

  1. Referee: [Methods / Results (data partitioning and preprocessing)] The central performance claims (accuracy 0.86-0.89 / macro-F1 0.40-0.50 for Stage I; accuracy 0.70-0.74 / macro-F1 0.28-0.55 for Stage II) rest on the integrity of the train/test splits, the handling of class imbalance, and whether feature selection or hyper-parameter tuning was performed on the held-out set. These details are not verifiable from the provided text and directly affect whether the quoted numbers can be treated as unbiased estimates of operational performance.

    Authors: We agree that explicit documentation of the data partitioning, imbalance handling, and tuning protocol is necessary to substantiate the performance claims. The current text describes the overall experimental setup and classifier benchmarking but does not provide a dedicated subsection on these implementation choices. In the revised manuscript we will add a Methods subsection that specifies: (i) the temporal train/test split used to avoid leakage, (ii) confirmation that all hyper-parameter search and any feature engineering occurred exclusively within the training folds via cross-validation, and (iii) the class-weighting scheme applied to address imbalance. These additions will make the reported metrics verifiable as unbiased estimates.
    revision: yes

  2. Referee: [Discussion / Feature-importance and visualization sections] The claim that the framework supports controller situational awareness assumes that the chosen feature set (ASDE-X trajectories, aircraft characteristics, ramp destinations, short-horizon rates, weather) sufficiently captures the exit and routing decision process. The reported t-SNE/UMAP overlap for minority classes already indicates that unmodeled factors (e.g., pilot-ATC voice exchanges or untracked dynamics) could further degrade macro-F1 or alter the reported feature rankings; no sensitivity analysis to omitted variables is presented.

    Authors: We accept that the feature set is necessarily limited to observables present in the ASDE-X feed and supplementary sources, and that the t-SNE/UMAP visualizations already demonstrate substantial overlap for minority classes. The manuscript positions the models as decision-support aids that keep final authority with the controller precisely because of these limitations. A quantitative sensitivity analysis to omitted variables (e.g., voice communications) would require data sources outside the scope of this study. We will nevertheless revise the Discussion to include an explicit paragraph on potential omitted-variable effects, their likely impact on feature rankings, and the consequent bounds on operational utility.
    revision: partial
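
The first response promises a temporal split and a class-weighting scheme without specifying either. The sketch below shows one common way to do both, as an assumption and not the authors' protocol: hold out the most recent arrivals, then derive inverse-frequency weights from the training labels only. The field names `t` and `exit`, the 20% hold-out, and the weighting formula n / (k * count) are illustrative choices.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def chronological_split(
    rows: Sequence[dict], time_key: str, test_fraction: float = 0.2
) -> Tuple[List[dict], List[dict]]:
    """Sort by timestamp and hold out the most recent rows as the test set."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be in (0, 1)")
    ordered = sorted(rows, key=lambda r: r[time_key])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Inverse-frequency weights n / (k * count_c), from training labels only."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


# Made-up arrivals, deliberately out of time order; exit B is the rare class.
rows = [{"t": i, "exit": "A" if i % 4 else "B"} for i in reversed(range(10))]
train, test = chronological_split(rows, time_key="t", test_fraction=0.2)
weights = balanced_class_weights([r["exit"] for r in train])
print(len(train), len(test), weights)
```

Computing the weights after the split, and only from the training rows, keeps test-set class frequencies from leaking into model fitting.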

Circularity Check

0 steps flagged

No circularity: standard supervised learning on held-out data

The paper trains and evaluates classifiers (XGBoost, LightGBM, etc.) on features extracted from ASDE-X trajectories, aircraft data, ramp destinations, traffic rates, and weather. Predictions are scored against observed outcomes on held-out data; no equations, fitted parameters, or self-citations reduce outputs to inputs by construction. Feature importances are post-hoc analyses of the trained models, not definitional. The framework is self-contained empirical modeling with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that historical surface data patterns remain predictive and that the chosen features capture the dominant decision factors.

free parameters (2)
  • look-back window lengths
    Multiple look-back windows used for traffic and weather features; exact lengths chosen during model development.
  • classifier hyperparameters
    XGBoost, LightGBM, and other models require hyperparameter tuning on the training data.
axioms (1)
  • domain assumption: Historical ASDE-X trajectories and associated metadata are representative of future arrival operations at KATL.
    Models trained on past data are applied to predict future decisions.

pith-pipeline@v0.9.1-grok · 5845 in / 1422 out tokens · 23835 ms · 2026-06-27T13:34:19.476667+00:00 · methodology
