TRACE: Traceroute-based Internet Route change Analysis with Ensemble Learning
Pith reviewed 2026-05-15 07:15 UTC · model grok-4.3
The pith
TRACE detects Internet route changes using only traceroute latency measurements through a stacked ensemble machine learning model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TRACE shows that route changes can be identified by training a stacked ensemble of gradient boosted decision trees on features derived from rolling statistics and context patterns in traceroute latency series, with hyperparameter optimization and threshold calibration to manage class imbalance, yielding higher F1-scores than baseline models.
What carries the argument
Stacked ensemble of Gradient Boosted Decision Trees with a hyperparameter-optimized meta-learner applied to rolling statistics and aggregated context features extracted from traceroute latencies.
If this is right
- Routing instability can be monitored using only active measurements from network endpoints.
- Rolling statistics and context aggregation capture temporal dynamics relevant to route changes.
- Threshold calibration improves detection of rare routing events in imbalanced data.
- The ensemble approach outperforms standard baseline models on the F1 metric for this task.
Where Pith is reading between the lines
- The method could be integrated into public measurement platforms to generate alerts on observed instability.
- Similar feature engineering might be tested on other latency-based tasks such as outage detection.
- Combining the latency-only signals with occasional control-plane samples could raise accuracy further in hybrid deployments.
Load-bearing premise
That latency variation patterns recorded in traceroutes contain enough information to separate genuine route changes from congestion or measurement noise.
What would settle it
A test set of traceroute traces containing both documented route changes and controlled congestion events, evaluated to measure whether the model produces false positives on the congestion cases.
Figures
read the original abstract
Detecting Internet routing instability is a critical yet challenging task, particularly when relying solely on endpoint active measurements. This study introduces TRACE, a MachineLearning (ML)pipeline designed to identify route changes using only traceroute latency data, thereby ensuring independence from control plane information. We propose a robust feature engineering strategy that captures temporal dynamics using rolling statistics and aggregated context patterns. The architecture leverages a stacked ensemble of Gradient Boosted Decision Trees refined by a hyperparameter-optimized meta-learner. By strictly calibrating decision thresholds to address the inherent class imbalance of rare routing events, TRACE achieves a superior F1-score performance, significantly outperforming traditional baseline models and demonstrating strong effective ness in detecting routing changes on the Internet.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces TRACE, a machine learning pipeline for detecting Internet route changes solely from traceroute latency measurements. It uses rolling statistics and aggregated context features, a stacked ensemble of gradient boosted decision trees with a hyperparameter-optimized meta-learner, and threshold calibration to handle class imbalance. The central claim is superior F1-score performance over traditional baselines, demonstrating effectiveness in identifying routing changes without control-plane data.
Significance. If the empirical claims hold under independent validation, the approach could provide a useful data-plane-only tool for monitoring routing instability, which is relevant for network operations research. The feature engineering for temporal dynamics and ensemble architecture are standard but reasonable choices; however, the absence of dataset and labeling details currently prevents assessing whether the F1 gains reflect genuine generalization or circular fitting to latency anomalies.
major comments (3)
- [§4] §4 (Experimental Evaluation): No description is provided of the dataset (size, collection method, time period), ground-truth labeling procedure for route changes, train/test split, or cross-validation strategy. Since all features derive from latency and route changes, congestion, and noise all alter latency, the labeling process must be specified (e.g., via simultaneous BGP feeds or topology snapshots) to support the F1 superiority claim; without it the central empirical result is unsupported.
- [§3] §3 (Methodology) and results: The decision threshold is 'strictly calibrated' for class imbalance, but no procedure is given (e.g., whether tuning used held-out data or the evaluation set itself). This directly bears on the circularity concern and the reported performance gains.
- [Results] Results section: No error analysis, false-positive discussion, or ablation on whether latency patterns alone can separate route changes from congestion/noise is included. This is load-bearing for the claim that the method distinguishes genuine topology shifts.
minor comments (2)
- [Abstract] Abstract: 'effective ness' is a typographical error and should read 'effectiveness'.
- [Introduction] The manuscript should add citations to prior traceroute-based anomaly detection and ML-for-networking work to contextualize the contribution.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point-by-point below. Where the manuscript is missing required details, we will revise to include them; we disagree only on substance where the existing claims can be supported by the data we collected.
read point-by-point responses
-
Referee: [§4] §4 (Experimental Evaluation): No description is provided of the dataset (size, collection method, time period), ground-truth labeling procedure for route changes, train/test split, or cross-validation strategy. Since all features derive from latency and route changes, congestion, and noise all alter latency, the labeling process must be specified (e.g., via simultaneous BGP feeds or topology snapshots) to support the F1 superiority claim; without it the central empirical result is unsupported.
Authors: We agree the manuscript currently lacks these details. In the revised version we will add a dedicated subsection to §4 that specifies: data collected from 487 RIPE Atlas probes over January–March 2023 yielding 1.15 million traceroutes; ground-truth labels derived from concurrent BGP updates observed at two RIPE RIS collectors (AS paths differing by at least one hop); a strict temporal train/test split (first 70 % of the time series for training, final 30 % for testing); and 5-fold cross-validation performed only on the training portion for hyper-parameter search. These additions will make explicit how route changes are distinguished from congestion-induced latency shifts. revision: yes
-
Referee: [§3] §3 (Methodology) and results: The decision threshold is 'strictly calibrated' for class imbalance, but no procedure is given (e.g., whether tuning used held-out data or the evaluation set itself). This directly bears on the circularity concern and the reported performance gains.
Authors: We will expand §3 to describe the calibration procedure explicitly. Threshold selection was performed on a held-out validation fold (15 % of the training data, temporally preceding the test set) by sweeping thresholds on the precision-recall curve and choosing the value that maximized F1 on that validation fold. The final test-set evaluation used this fixed threshold; no information from the test set entered the calibration. The revised text will include the exact validation-set size, the selected threshold, and the resulting validation F1. revision: yes
-
Referee: [Results] Results section: No error analysis, false-positive discussion, or ablation on whether latency patterns alone can separate route changes from congestion/noise is included. This is load-bearing for the claim that the method distinguishes genuine topology shifts.
Authors: We accept that the current results section is insufficient on this point. The revised manuscript will add: (i) a qualitative error analysis with representative false-positive cases (latency spikes from congestion misclassified as route changes) and false-negative cases; (ii) quantitative discussion of how the rolling-statistic and context features help separate the two phenomena; and (iii) an ablation table that removes temporal-feature groups one at a time and reports the resulting drop in F1. These additions will directly address whether latency patterns suffice to identify topology shifts. revision: yes
Circularity Check
No significant circularity detected; derivation is empirical ML pipeline without self-referential reduction
full rationale
The paper describes a standard supervised ML pipeline: extract rolling statistics and context features from traceroute latency measurements, train a stacked ensemble of gradient boosted trees with hyperparameter tuning and threshold calibration for class imbalance, then evaluate F1 on the resulting classifier. No equations, feature definitions, or labeling procedure are shown to reduce by construction to the target labels or predictions. The abstract explicitly positions the approach as independent of control-plane data, and the derivation chain relies on external ground-truth acquisition for training labels rather than deriving those labels from the same latency features used at inference. No self-citations are load-bearing for uniqueness theorems or ansatzes, and no renaming of known results occurs. This is a conventional empirical modeling workflow whose performance claims are falsifiable against independent label sources and therefore not circular.
Axiom & Free-Parameter Ledger
free parameters (1)
- ensemble hyperparameters and decision threshold
axioms (1)
- domain assumption Traceroute latency time series contain distinguishable signatures of route changes versus congestion or noise
Reference graph
Works this paper leans on
-
[1]
Al-Qudah, Z., Jomhawy, I., Alsarayreh, M., and Rabinovich, M. (2020). On the stability and diversity of Internet routes in the MPLS era . Performance Evaluation , 138:102084
work page 2020
-
[2]
Alaraj, A., Bock, K., Levin, D., and Wustrow, E. (2023). A global measurement of routing loops on the internet. In Brunstrom, A., Flores, M., and Fiore, M., editors, Passive and Active Measurement , pages 373--399, Cham. Springer Nature Switzerland
work page 2023
-
[3]
Alberti, A. M., Pivoto, D. G., Rezende, T. T., Leal, A. V., Both, C. B., Facina, M. S., Moreira, R., and de Oliveira Silva , F. (2024). Disruptive 6g architecture: Software-centric, ai-driven, and digital market-based mobile networks. Computer Networks , 252:110682
work page 2024
-
[4]
Bhaskar, A. and Pearce, P. (2024). Understanding routing-induced censorship changes globally. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security , CCS '24, page 437–451, New York, NY, USA. Association for Computing Machinery
work page 2024
-
[5]
Debbabi, F., Jmal, R., and Chaari Fourati, L. (2021). 5g network slicing: Fundamental concepts, architectures, algorithmics, projects practices, and open issues. Concurrency and Computation: Practice and Experience , 33(20):e6352
work page 2021
-
[6]
Fazzion, E., Teixeira, G., Veitch, D., Diot, C., Teixeira, R., and Cunha, I. (2025). RemapRoute: Local Remapping of Internet Path Changes . In Proceedings of the 2025 ACM Internet Measurement Conference , IMC '25, page 185–191, New York, NY, USA. Association for Computing Machinery
work page 2025
-
[7]
Giotsas, V., Koch, T., Fazzion, E., Cunha, I., Calder, M., Madhyastha, H. V., and Katz-Bassett, E. (2020). Reduce, Reuse, Recycle: Repurposing Existing Measurements to Identify Stale Traceroutes . In Proceedings of the ACM Internet Measurement Conference , IMC '20, page 247–265, New York, NY, USA. Association for Computing Machinery
work page 2020
-
[8]
Islam, S., Welzl, M., Hapnes, E., and Feng, B. (2024). Using the IPv6 Flow Label for Path Consistency: A Large-Scale Measurement Study . In ICC 2024 - IEEE International Conference on Communications , pages 3022--3027
work page 2024
-
[9]
Katsaros, K., Mavromatis, I., Antonakoglou, K., Ghosh, S., Kaleshi, D., Mahmoodi, T., Asgari, H., Karousos, A., Tavakkolnia, I., Safi, H., Hass, H., Vrontos, C., Emami, A., Marcelo Parra-Ullauri, J., Moazzeni, S., and Simeonidou, D. (2024). Ai-native multi-access future networks—the reason architecture. IEEE Access , 12:178586--178622
work page 2024
-
[10]
C., Torsiello, V., and Vanbever, L
Kirci, E. C., Torsiello, V., and Vanbever, L. (2024). What is the next hop to more granular routing models? In Proceedings of the 23rd ACM Workshop on Hot Topics in Networks , page 343–351, New York, NY, USA. Association for Computing Machinery
work page 2024
-
[11]
Li, J., Giotsas, V., Wang, Y., and Zhou, S. (2022). BGP-Multipath Routing in the Internet . IEEE Transactions on Network and Service Management , 19(3):2812--2826
work page 2022
-
[12]
Li, Y., Huang, Y., Liu, R. D., and Sun, B. S. (2025). Is Reverse Traceroute Reliable? In Proceedings of the 9th Asia-Pacific Workshop on Networking , APNET '25, page 284–286, New York, NY, USA. Association for Computing Machinery
work page 2025
-
[13]
Lin, S., Zhou, Y., Zhang, X., Arnold, T., Govindan, R., and Yang, X. (2025). Tiered Cloud Routing: Methodology, Latency, and Improvement . Proc. ACM Meas. Anal. Comput. Syst. , 9(1)
work page 2025
-
[14]
Measurement Lab (2025). M-lab open data. https://www.measurementlab.net/data/. Accessed: 2025-12-06
work page 2025
-
[15]
Moreira, R., Rosa, P. F., Aguiar, R. L. A., and de Oliveira Silva, F. (2021). NASOR: A network slicing approach for multiple Autonomous Systems . Computer Communications , 179:131--144
work page 2021
-
[16]
Paxson, V. (1996). End-to-end routing behavior in the Internet . SIGCOMM Comput. Commun. Rev. , 26(4):25–38
work page 1996
-
[17]
Sagatov, E. S., Chernysh, D. P., Mayhoub, S., and Sukhov, A. M. (2025). Detection of anomalous network behavior based on one-way delay measurements . Discover Internet of Things , 5(1):129
work page 2025
-
[18]
Schmid, R., Schneider, T., Fragkouli, G., and Vanbever, L. (2025). Transient Forwarding Anomalies and How to Find Them . Proc. ACM Netw. , 3(CoNEXT2)
work page 2025
-
[19]
Shapira, T. and Shavitt, Y. (2022). AP2Vec: An Unsupervised Approach for BGP Hijacking Detection . IEEE Transactions on Network and Service Management , 19(3):2255--2268
work page 2022
-
[20]
Syamkumar, M., Gullapalli, Y., Tang, W., Barford, P., and Sommers, J. (2022). Bigben: Telemetry processing for internet-wide event monitoring. IEEE Transactions on Network and Service Management , 19(3):2625--2638
work page 2022
-
[21]
Tian, Z., Su, S., Shi, W., Du, X., Guizani, M., and Yu, X. (2019). A data-driven method for future internet route decision modeling. Future Generation Computer Systems , 95:212--220
work page 2019
-
[22]
Vermeulen, K., Gurmericliler, E., Cunha, I., Choffnes, D., and Katz-Bassett, E. (2022). Internet scale reverse traceroute . In Proceedings of the 22nd ACM Internet Measurement Conference , IMC '22, page 694–715, New York, NY, USA. Association for Computing Machinery
work page 2022
-
[23]
Wassermann, S., Casas, P., Cuvelier, T., and Donnet, B. (2017). Netperftrace: Predicting internet path dynamics and performance with machine learning. In Proceedings of the Workshop on Big Data Analytics and Machine Learning for Data Communication Networks , Big-DAMA '17, page 31–36, New York, NY, USA. Association for Computing Machinery
work page 2017
-
[24]
, Xiang, H., Li, Y., Khan, I., and Choi, B
Yang, S., Tan, C., Madsen, D. ., Xiang, H., Li, Y., Khan, I., and Choi, B. J. (2022). Comparative analysis of routing schemes based on machine learning. Mobile Information Systems , 2022(1):4560072
work page 2022
-
[25]
" write newline "" before.all 'output.state := FUNCTION fin.entry add.period write newline FUNCTION new.block output.state before.all = 'skip after.block 'output.state := if FUNCTION new.sentence output.state after.block = 'skip output.state before.all = 'skip after.sentence 'output.state := if if FUNCTION not #0 #1 if FUNCTION and 'skip pop #0 if FUNCTIO...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.