
arXiv: 2511.17113 · v2 · submitted 2025-11-21 · cs.CR · cs.AI · cs.LG

AutoGraphAD: Unsupervised network anomaly detection using Variational Graph Autoencoders

Pith reviewed 2026-05-17 20:55 UTC · model grok-4.3

classification: cs.CR, cs.AI, cs.LG
keywords: network intrusion detection, unsupervised anomaly detection, variational graph autoencoders, heterogeneous graphs, cybersecurity, anomaly scoring, contrastive learning

The pith

A heterogeneous variational graph autoencoder detects network intrusions unsupervised by combining weighted losses into an anomaly score, matching or exceeding prior methods while training and inferring over an order of magnitude faster.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents AutoGraphAD as an unsupervised network intrusion detection approach that builds heterogeneous graphs from connection and IP nodes. It trains a variational graph autoencoder solely with unsupervised and contrastive learning, avoiding any need for labeled attack data. An anomaly score is then formed by weighting the model's reconstruction and other losses to flag intrusions. This yields detection performance equal to or better than Anomal-E without requiring separate downstream anomaly detectors. The simplified pipeline produces roughly 1.18 orders of magnitude faster training and 1.03 orders of magnitude faster inference.
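The "orders of magnitude" figures can be read as plain speedup factors by raising 10 to the stated exponent. A quick check, using only the two exponents quoted above (the variable names are illustrative):

```python
# Convert "orders of magnitude" into multiplicative speedup factors.
# The exponents 1.18 and 1.03 are the values quoted from the paper.
train_speedup = 10 ** 1.18
infer_speedup = 10 ** 1.03

print(f"training:  ~{train_speedup:.1f}x faster")
print(f"inference: ~{infer_speedup:.1f}x faster")
```

In other words, the claim amounts to roughly a 15-fold training speedup and an 11-fold inference speedup.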

Core claim

AutoGraphAD is a novel unsupervised anomaly detection system based on a Heterogeneous Variational Graph Autoencoder that processes network activity as heterogeneous graphs with connection and IP nodes. Trained solely through unsupervised and contrastive learning without any labeled data, the model derives an anomaly score from weighted combinations of its losses. This enables detection of network intrusions and attacks with performance equal to or better than Anomal-E, yet eliminates the need for costly downstream anomaly detectors, resulting in approximately 1.18 orders of magnitude faster training and 1.03 orders of magnitude faster inference.

What carries the argument

Heterogeneous Variational Graph Autoencoder that constructs graphs from connection and IP nodes, applies unsupervised plus contrastive training, and converts weighted losses into an anomaly score for detection.
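As a rough illustration of the graph-construction step, the sketch below turns flow records into a toy heterogeneous graph with one node type for IP addresses and one for connections. The record keys (`src_ip`, `dst_ip`, `features`), the relation names, and the function `build_hetero_graph` are all assumptions made for this example; the paper's actual schema and edge types may differ.

```python
from collections import defaultdict

def build_hetero_graph(flows):
    """Build a toy heterogeneous graph with 'ip' and 'connection' node types.

    `flows` is a list of dicts with hypothetical keys 'src_ip', 'dst_ip',
    and 'features'; this is not the paper's preprocessing pipeline.
    """
    ip_index = {}              # IP address -> ip node id
    conn_features = []         # connection node id -> feature vector
    edges = defaultdict(list)  # (src_type, relation, dst_type) -> [(src, dst)]

    def ip_id(addr):
        # Assign ids in order of first appearance.
        return ip_index.setdefault(addr, len(ip_index))

    for conn_id, flow in enumerate(flows):
        conn_features.append(flow["features"])
        src, dst = ip_id(flow["src_ip"]), ip_id(flow["dst_ip"])
        # Each flow becomes one connection node linked to its two endpoints.
        edges[("ip", "sends", "connection")].append((src, conn_id))
        edges[("connection", "received_by", "ip")].append((conn_id, dst))

    return {"ip": ip_index, "connection": conn_features, "edges": dict(edges)}

flows = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "features": [0.2, 1.0]},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.3", "features": [0.9, 0.1]},
]
graph = build_hetero_graph(flows)
print(len(graph["ip"]), "IP nodes,", len(graph["connection"]), "connection nodes")
```

Modelling each flow as its own node, rather than as an edge between two IPs, lets flow features sit on nodes, which is the form an autoencoder over node features would reconstruct.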

If this is right

  • Removes dependence on expensive labeled attack datasets for training network detectors.
  • Eliminates the computational cost of running a separate anomaly detector after the autoencoder stage.
  • Supports faster retraining cycles when network traffic patterns shift over time.
  • Enables more practical real-time deployment on high-volume network links due to reduced inference latency.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same loss-weighting idea could be tested on transaction graphs for fraud detection without labels.
  • Contrastive training on evolving graphs may help the model adapt to zero-day attack variants more readily than reconstruction alone.
  • Streaming updates to the graph structure could turn the method into an online detector if the anomaly score remains stable under incremental training.

Load-bearing premise

The weighted combination of the autoencoder losses produces an anomaly score that reliably separates normal traffic from intrusions across datasets and attack types without labeled validation or tuning.
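To make the premise concrete, here is a minimal sketch of a score built as a weighted sum of per-node losses and compared against a threshold. The two loss names, the default weights, and the threshold value are placeholders chosen for this example, not values from the paper. The sketch deliberately leaves open how the weights and threshold would be chosen without labels, which is the point the premise depends on.

```python
def anomaly_scores(recon_loss, contrastive_loss, w_recon=0.5, w_contrast=0.5):
    """Per-node score as a weighted sum of two per-node loss terms.

    The loss names and default weights are placeholders; the paper may
    combine different or additional terms.
    """
    return [w_recon * r + w_contrast * c
            for r, c in zip(recon_loss, contrastive_loss)]

def flag(scores, threshold):
    # Nodes whose score exceeds the threshold are reported as anomalous.
    return [s > threshold for s in scores]

recon = [0.10, 0.12, 0.95, 0.11]
contrast = [0.20, 0.18, 0.80, 0.22]
scores = anomaly_scores(recon, contrast)
print(flag(scores, threshold=0.5))
```

Only the third node, whose losses are far above the others, is flagged in this toy input.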

What would settle it

On a new dataset containing attack types absent from training, the weighted anomaly scores either fail to rank intrusions above normal traffic or require per-dataset threshold adjustments to reach the reported performance level.
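One way to run the ranking part of this test is to compute the area under the ROC curve, which equals the probability that a randomly chosen intrusion receives a higher score than a randomly chosen normal sample. The sketch below is a plain pairwise implementation written for this note; the scores and labels are made-up illustration data, and the labels are used for evaluation only.

```python
def auroc(scores, labels):
    """Probability that a random intrusion outranks a random normal sample.

    Ties count as half. Labels (1 = intrusion, 0 = normal) are used only
    for evaluation, never for training or weighting.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up scores for five samples, two of which are intrusions.
scores = [0.15, 0.90, 0.20, 0.70, 0.10]
labels = [0, 1, 0, 1, 0]
print(auroc(scores, labels))
```

A value near 0.5 on unseen attack types would mean the score ranks intrusions no better than chance. The pairwise loop is quadratic, so a sort-based variant would be preferable on large datasets.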

Figures

Figures reproduced from arXiv: 2511.17113 by Georgios Anyfantis, Pere Barlet-Ros.

Figure 1: Dataset pre-processing pipeline for graph generation.
Figure 2: The proposed architecture of AutoGraphAD in a training setting. AutoGraphAD mainly focuses on the reconstruction and use of the ...
Figure 3: Performance metrics in 0% training dataset.
Figure 5: Performance metrics of all the approaches at 5.76% contamination ...
Figure 7: The timings for training and inference at 3.36% contamination.
Figure 8: The timings for training and inference at 5.76% contamination.
Figure 9: Our anomaly score and classification pipeline. The pipeline very closely resembles the training of our model until the first part of the ...
Original abstract

Network Intrusion Detection Systems (NIDS) are essential tools for detecting network attacks and intrusions. While extensive research has explored the use of supervised Machine Learning for attack detection and characterisation, these methods require accurately labelled datasets, which are very costly to obtain. Moreover, existing public datasets have limited and/or outdated attacks, and many of them suffer from mislabelled data. To reduce the reliance on labelled data, we propose AutoGraphAD, a novel unsupervised anomaly detection based on a Heterogeneous Variational Graph Autoencoder. AutoGraphAD operates on heterogeneous graphs, made from connection and IP nodes that represent network activity. The model is trained using unsupervised and contrastive learning, without relying on any labelled data. The model's losses are then weighted and combined in an anomaly score used for anomaly detection. Overall, AutoGraphAD yields the same, and in some cases better, results than Anomal-E, but without requiring costly downstream anomaly detectors. As a result, AutoGraphAD achieves around 1.18 orders of magnitude faster training and 1.03 orders of magnitude faster inference, which represents a significant advantage for operational deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents AutoGraphAD, a novel unsupervised anomaly detection approach for Network Intrusion Detection Systems (NIDS) based on a Heterogeneous Variational Graph Autoencoder. The method constructs heterogeneous graphs representing network activity with connection and IP nodes, trains the model using unsupervised and contrastive learning without any labeled data, and derives an anomaly score by weighting and combining the model's losses. It claims to achieve the same or better performance than the Anomal-E method while avoiding the need for costly downstream anomaly detectors, resulting in approximately 1.18 orders of magnitude faster training and 1.03 orders of magnitude faster inference.

Significance. If the central claims hold under a fully unsupervised weighting procedure, the work could meaningfully advance practical NIDS by reducing labeled-data requirements and offering substantial efficiency gains suitable for operational deployment. The avoidance of downstream detectors is a clear practical advantage over prior graph-based methods.

major comments (2)
  1. [Abstract] The statement that 'The model's losses are then weighted and combined in an anomaly score' supplies no derivation, fixed a priori rule, or parameter-free procedure for selecting the loss weights. Because the headline performance and speedup claims rest on this score reliably identifying intrusions without labels, any implicit tuning on labeled subsets or attack-type validation would render the unsupervised premise and the comparison to Anomal-E invalid.
  2. [Anomaly score construction (likely §4)] The weighted combination of the VAE reconstruction loss and the contrastive loss must be shown to be either fixed by model structure or learned in a purely unsupervised fashion; otherwise the reported equivalence or superiority to Anomal-E cannot be taken as evidence for the method's unsupervised nature.
minor comments (2)
  1. [Abstract] Add the exact datasets, number of runs, and primary metrics (e.g., AUC, F1) used for the Anomal-E comparison to support the 'same or better' claim.
  2. [Graph construction] Clarify how heterogeneous connection and IP nodes are formed from raw flow data and whether any preprocessing steps implicitly use attack labels.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and detailed review. The comments correctly highlight the need for explicit clarification on how the anomaly score is formed without labels. We address each point below and will revise the manuscript to strengthen the presentation of the unsupervised weighting procedure while preserving all original claims.

Point-by-point responses
  1. Referee: [Abstract] The statement that 'The model's losses are then weighted and combined in an anomaly score' supplies no derivation, fixed a priori rule, or parameter-free procedure for selecting the loss weights. Because the headline performance and speedup claims rest on this score reliably identifying intrusions without labels, any implicit tuning on labeled subsets or attack-type validation would render the unsupervised premise and the comparison to Anomal-E invalid.

    Authors: We agree the abstract is concise and omits the weighting details. The weights are selected via a fixed, parameter-free rule that normalizes each loss term by its empirical mean and standard deviation computed exclusively on the unlabeled training graph; no labeled data or attack-type information is used at any stage. In the revision we will expand the abstract to state this rule explicitly and add a short derivation in the main text so that the unsupervised character of the score is immediately clear.
    Revision: yes

  2. Referee: [Anomaly score construction (likely §4)] The weighted combination of the VAE reconstruction loss and the contrastive loss must be shown to be either fixed by model structure or learned in a purely unsupervised fashion; otherwise the reported equivalence or superiority to Anomal-E cannot be taken as evidence for the method's unsupervised nature.

    Authors: Section 4 already defines the anomaly score as a linear combination whose coefficients are obtained from training-set loss statistics alone. We will revise the section to include (i) a formal statement that the procedure uses only unlabeled data, (ii) pseudocode for the normalization step, and (iii) an explicit statement that no downstream labeled validation or attack-type information enters the weighting. These additions will make the fully unsupervised nature of the score transparent and thereby support the validity of the performance and runtime comparisons.
    Revision: yes
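The normalization rule described in the simulated rebuttal can be sketched as follows: each loss term is standardized with the mean and standard deviation measured on unlabeled training data, and the standardized terms are summed. This is one possible reading of that description, not the paper's pseudocode; the function names, the two loss terms, and the sample values are assumptions made for this example.

```python
from statistics import mean, pstdev

def fit_normalizer(train_losses):
    """Record mean and standard deviation of one loss term on unlabeled training data."""
    mu, sigma = mean(train_losses), pstdev(train_losses)
    # Guard against a constant loss term, which would give sigma == 0.
    return (mu, sigma if sigma > 0 else 1.0)

def combined_score(loss_terms, normalizers):
    """Sum of z-scored loss terms for a single node."""
    return sum((x - mu) / sigma for x, (mu, sigma) in zip(loss_terms, normalizers))

# Hypothetical per-node training losses for two terms.
train_recon = [0.10, 0.12, 0.11, 0.09]
train_contrast = [0.20, 0.22, 0.18, 0.20]
normalizers = [fit_normalizer(train_recon), fit_normalizer(train_contrast)]

print(combined_score([0.105, 0.20], normalizers))  # at the training means
print(combined_score([0.60, 0.70], normalizers))   # far above the training means
```

Under this reading the effective weight of each term is the inverse of its training standard deviation, so no label information enters the weighting. If the training graph is contaminated with attacks, those statistics shift accordingly.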

Circularity Check

0 steps flagged

No circularity detected in derivation chain

Full rationale

The paper describes a standard unsupervised heterogeneous variational graph autoencoder trained via reconstruction and contrastive losses on connection/IP graphs, with the anomaly score formed by weighting those losses. No equations or steps in the abstract or described method reduce by construction to fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations. The performance comparison to Anomal-E is external and the unsupervised premise relies on established VAE techniques without importing uniqueness theorems or ansatzes from the authors' prior work. The derivation remains self-contained against external benchmarks for graph anomaly detection.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach relies on standard assumptions in graph ML and unsupervised learning for anomaly detection; specific weights for combining losses are likely free parameters tuned for performance.

free parameters (1)
  • loss weights
    The anomaly score combines weighted losses, implying weights are chosen or fitted.
axioms (1)
  • domain assumption: heterogeneous graph representation captures network activity sufficiently for anomaly detection
    The model operates on graphs made from connection and IP nodes.



