Forecasting Individual NetFlows using a Predictive Masked Graph Autoencoder
Pith reviewed 2026-05-09 23:40 UTC · model grok-4.3
The pith
A predictive masked graph autoencoder on sliding-window heterogeneous graphs forecasts individual NetFlow connections by accurately identifying their ports and IPs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that representing network traffic as a sequence of heterogeneous graphs and applying a predictive masked graph autoencoder lets the model predict which ports and IP addresses individual connections will attach to more accurately than the forecasting baselines, while feature prediction stays on par with those baselines.
What carries the argument
A predictive masked graph autoencoder applied to temporal heterogeneous graphs of IP, port, and connection nodes. It learns to reconstruct masked graph elements, which is what enables forecasting of both structure and features.
If this is right
- Superior identification of connection endpoints enables more precise network traffic analysis.
- Competitive feature reconstruction supports reliable prediction of flow characteristics.
- GNNs prove effective for modeling temporal evolution in network traffic data.
- The method provides a new way to handle per-flow predictions in dynamic environments.
Where Pith is reading between the lines
- Similar graph-based approaches could be tested in other domains involving temporal connections, such as communication networks or transaction graphs.
- Performance might improve further by incorporating additional contextual features into the node representations.
- If the model scales well, it could replace or complement existing time-series forecasting tools in network management systems.
Load-bearing premise
The sliding-window heterogeneous graphs with IP, port, and connection nodes are assumed to adequately capture the temporal evolution of network traffic for making accurate per-flow predictions.
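To make this premise concrete, the following is a minimal sketch of how such window graphs might be built. It is not the paper's construction: the `Flow` fields, the count-based windows, the choice to key port nodes by (IP, port), and the relation names are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical flow record; the field names are assumptions, not the paper's schema.
Flow = namedtuple("Flow", "ts src_ip src_port dst_ip dst_port n_bytes")


def build_window_graphs(flows, window_size, stride=1):
    """Build one heterogeneous graph per sliding window of `window_size` flows."""
    ordered = sorted(flows, key=lambda f: f.ts)
    graphs = []
    for start in range(0, len(ordered) - window_size + 1, stride):
        ips, ports, conns, edges = set(), set(), {}, set()
        for offset, f in enumerate(ordered[start:start + window_size]):
            conn = start + offset  # connection node id = position in the trace
            conns[conn] = {"n_bytes": f.n_bytes}
            for role, ip, port in (("src", f.src_ip, f.src_port),
                                   ("dst", f.dst_ip, f.dst_port)):
                ips.add(ip)
                ports.add((ip, port))  # assumption: a port node is scoped to its IP
                # Store both directions so the graph is bidirectional.
                edges.add((("connection", conn), role, ("port", (ip, port))))
                edges.add((("port", (ip, port)), "rev_" + role, ("connection", conn)))
                edges.add((("port", (ip, port)), "on", ("ip", ip)))
                edges.add((("ip", ip), "rev_on", ("port", (ip, port))))
        graphs.append({"ip": ips, "port": ports, "connection": conns, "edges": edges})
    return graphs


flows = [Flow(t, "10.0.0.1", 40000 + t, "10.0.0.2", 443, 100 * t) for t in range(5)]
graphs = build_window_graphs(flows, window_size=3, stride=1)
```

Fixing the number of flows per window is one way to obtain equal-sized graphs; a fixed time span per window would be another, and the two give different graphs whenever traffic volume varies.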
What would settle it
Running the model on a different network dataset and finding that it no longer outperforms the baselines at identifying the ports and IPs to which connections attach.
Original abstract
In this paper, we propose a proof-of-concept Graph Neural Network model that can successfully predict network flow-level traffic (NetFlow) by accurately modelling the graph structure and the connection features. We use sliding-windows to split the network traffic in equal-sized heterogeneous bidirectional graphs containing IP, Port, and Connection nodes. We then use the GNN to model the evolution of the graph structure and the connection features. Our approach shows superior results when identifying the Port and IP to which connections attach, while feature reconstruction remains competitive with strong forecasting baselines. Overall, our work showcases the use of GNNs for per-flow NetFlow prediction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a Predictive Masked Graph Autoencoder (a GNN-based model) for forecasting individual NetFlow traffic. It splits network traffic into sliding-window heterogeneous bidirectional graphs containing IP, Port, and Connection nodes, then uses the GNN to model the evolution of graph structure and connection features. The central claim is that the approach achieves superior performance in identifying the Port and IP to which connections attach, while feature reconstruction is competitive with strong forecasting baselines.
Significance. If the model performs genuine temporal extrapolation to future windows (rather than intra-window imputation) and the reported superiority holds under rigorous evaluation, the work could demonstrate a useful graph-structured approach to per-flow NetFlow prediction that captures relational dependencies among IPs, ports, and connections. This would be of interest to network traffic analysis and management, provided the experimental evidence is made explicit.
major comments (3)
- [Abstract and Methodology] The abstract and title emphasize 'forecasting' and 'predict[ing] network flow-level traffic' via modeling 'the evolution' of the graph, yet the reported results focus on 'feature reconstruction' that is 'competitive with strong forecasting baselines.' The methodology section must explicitly state whether the masked autoencoder performs temporal extrapolation (predicting features/structure in a held-out future window from prior windows) or only masked reconstruction within the same observed sliding-window graph. This distinction is load-bearing for the forecasting claim.
- [Abstract and Experimental Evaluation] No quantitative results, dataset details (e.g., traffic traces used, number of flows, time spans), baseline descriptions, or error metrics are provided in the abstract or visible experimental sections. The claim of 'superior results when identifying the Port and IP' cannot be evaluated without tables reporting precision/recall/F1, comparison methods, or statistical significance tests.
- [Graph Construction and Model Architecture] The sliding-window construction of heterogeneous graphs is described at a high level, but the paper does not specify how temporal edges or node features evolve across windows, nor how the predictive masking is scheduled to ensure out-of-sample forecasting rather than in-sample imputation. This affects whether the GNN truly models evolution or performs reconstruction.
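The extrapolation-versus-imputation distinction raised in these comments can be illustrated with a small sketch. It assumes flows are identified by integer ids in time order, and the function names are hypothetical. The point it shows is that when the stride is smaller than the window size, consecutive windows overlap, so a forecast target should exclude flows already visible in the context.

```python
def sliding_windows(items, size, stride):
    """Return every full window of `size` items, advancing by `stride`."""
    return [items[i:i + size] for i in range(0, len(items) - size + 1, stride)]


def forecasting_pairs(windows, history):
    """Pair `history` context windows with the unseen part of the next window."""
    pairs = []
    for t in range(history, len(windows)):
        context = windows[t - history:t]
        seen = {x for w in context for x in w}
        # Overlapping windows share flows; only unseen ones count as forecast targets.
        target = [x for x in windows[t] if x not in seen]
        pairs.append((context, target))
    return pairs


flow_ids = list(range(6))
windows = sliding_windows(flow_ids, size=3, stride=1)
pairs = forecasting_pairs(windows, history=2)
```

With six flows, a window of three, and a stride of one, each target holds a single flow that none of its context windows contains. Masking and reconstructing elements inside one window would instead be imputation, because the masked flows' neighbours in time remain visible.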
minor comments (2)
- [Abstract] The abstract states 'our approach shows superior results' without any supporting numbers or references to specific tables/figures; this should be replaced with concrete metrics or removed until the results section is complete.
- [Methodology] Notation for the heterogeneous graph (node types, edge directions, feature vectors) is introduced but not formalized with equations or a diagram; adding a clear definition would improve readability.
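One possible formalization of the notation requested in the last comment is sketched below. It is not the paper's own definition: the symbols $G_t$, $\phi$, $\mathcal{R}$ (a relation set assumed closed under inverses), $d_{\phi(v)}$, $k$, and $f_\theta$ are all assumptions introduced here for illustration.

```latex
\begin{align*}
G_t &= (V_t, E_t, \phi, X_t), \qquad
V_t = V_t^{\mathrm{IP}} \cup V_t^{\mathrm{Port}} \cup V_t^{\mathrm{Conn}}, \\
\phi &: V_t \to \{\mathrm{IP}, \mathrm{Port}, \mathrm{Conn}\}, \qquad
E_t \subseteq V_t \times \mathcal{R} \times V_t, \\
(u, r, v) \in E_t &\iff (v, r^{-1}, u) \in E_t
\quad \text{(bidirectional edges)}, \\
X_t &= \{\, x_v \in \mathbb{R}^{d_{\phi(v)}} : v \in V_t \,\}, \\
\hat{G}_{t+1} &= f_\theta(G_{t-k+1}, \dots, G_t)
\quad \text{(forecast from the $k$ most recent windows)}.
\end{align*}
```

Writing the forecast as a function of the $k$ preceding windows only would also make the out-of-sample requirement from the major comments explicit in the notation.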
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which highlight key areas for clarifying the temporal forecasting claims and improving the visibility of our experimental results. We address each major comment point by point below.
Point-by-point responses
- Referee: [Abstract and Methodology] The abstract and title emphasize 'forecasting' and 'predict[ing] network flow-level traffic' via modeling 'the evolution' of the graph, yet the reported results focus on 'feature reconstruction' that is 'competitive with strong forecasting baselines.' The methodology section must explicitly state whether the masked autoencoder performs temporal extrapolation (predicting features/structure in a held-out future window from prior windows) or only masked reconstruction within the same observed sliding-window graph. This distinction is load-bearing for the forecasting claim.
  Authors: We appreciate the referee's emphasis on this critical distinction. Our approach uses sequences of sliding windows to enable temporal modeling: the GNN is trained on prior windows to predict graph structure and connection features in subsequent held-out future windows, with predictive masking applied to simulate extrapolation rather than in-sample imputation. The phrase 'feature reconstruction' in the abstract refers to the autoencoder's core training objective, while evaluation metrics demonstrate performance on forecasting tasks against baselines. We will revise the Methodology section to explicitly describe the temporal extrapolation setup, including how prior windows condition predictions for future windows.
  Revision: yes
- Referee: [Abstract and Experimental Evaluation] No quantitative results, dataset details (e.g., traffic traces used, number of flows, time spans), baseline descriptions, or error metrics are provided in the abstract or visible experimental sections. The claim of 'superior results when identifying the Port and IP' cannot be evaluated without tables reporting precision/recall/F1, comparison methods, or statistical significance tests.
  Authors: We agree that the abstract would benefit from including key quantitative highlights. The full Experimental Evaluation section contains the requested details: specific traffic traces, flow counts, time spans, baseline descriptions (including strong forecasting methods), error metrics, tables reporting precision/recall/F1 for Port and IP identification, method comparisons, and statistical significance tests. We will revise the abstract to summarize these results and ensure the experimental section is clearly structured with all supporting tables and analyses.
  Revision: yes
- Referee: [Graph Construction and Model Architecture] The sliding-window construction of heterogeneous graphs is described at a high level, but the paper does not specify how temporal edges or node features evolve across windows, nor how the predictive masking is scheduled to ensure out-of-sample forecasting rather than in-sample imputation. This affects whether the GNN truly models evolution or performs reconstruction.
  Authors: We will expand the Graph Construction and Model Architecture sections with these specifics. The revised text will detail the evolution of temporal edges (e.g., persisting connections and new formations across consecutive windows), updates to node features (for IP, Port, and Connection nodes) over time, and the predictive masking schedule, where masks target elements in future windows based on observations from prior windows. This design ensures out-of-sample evaluation of graph evolution rather than intra-window imputation.
  Revision: yes
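As a rough illustration of the precision/recall/F1 evaluation discussed in the second response, the sketch below scores predicted attachments as a set of (connection, endpoint) pairs. Treating endpoint identification as edge-set prediction is an assumption made here, as are the function name and the example endpoints; the paper may define its metrics differently.

```python
def attachment_scores(predicted, actual):
    """Precision, recall, and F1 over sets of (connection_id, endpoint) pairs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    # F1 is the harmonic mean; define it as 0 when nothing was predicted correctly.
    f1 = 2 * precision * recall / (precision + recall) if true_positives else 0.0
    return precision, recall, f1


# Hypothetical ground truth and predictions for one future window.
actual = {(0, "10.0.0.2:443"), (1, "10.0.0.2:443"), (2, "10.0.0.3:53")}
predicted = {(0, "10.0.0.2:443"), (1, "10.0.0.2:80"),
             (2, "10.0.0.3:53"), (3, "10.0.0.9:22")}
precision, recall, f1 = attachment_scores(predicted, actual)
```

Here two of the four predicted attachments are correct and two of the three true attachments are recovered, so precision is 1/2, recall is 2/3, and F1 is 4/7.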
Circularity Check
No significant circularity; empirical GNN model trained on data with no definitional or self-referential reduction.
Full rationale
The paper presents a masked graph autoencoder trained on sliding-window heterogeneous graphs of IP, Port, and Connection nodes to model network traffic evolution. All claims rest on empirical training and evaluation against baselines rather than any parameter fitted to the target quantity and then renamed as a prediction. No equations, self-citations, or uniqueness theorems are invoked that would make the reported forecasting or reconstruction results equivalent to the inputs by construction. The distinction between intra-window reconstruction and true temporal extrapolation is an experimental-design question, not a circularity in the derivation chain.