Few-Shot Network Intrusion Detection Using Online Triplet Mining
Pith reviewed 2026-05-19 23:03 UTC · model grok-4.3
The pith
A triplet network with online mining and KNN classification detects intrusions competitively after training on only ten malicious samples per class.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors demonstrate that a triplet network employing online triplet mining learns distance-preserving embeddings from network flow features; a KNN classifier operating on those embeddings then achieves competitive accuracy in few-shot binary and multiclass intrusion detection tasks when trained on as few as ten malicious samples of each class drawn from standard benchmark datasets.
What carries the argument
Online triplet mining inside a triplet network that produces embeddings for a subsequent KNN classifier.
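Online mining picks triplets from within each batch using the current embeddings, rather than from a fixed precomputed list. The sketch below shows one common strategy, batch-hard mining, in NumPy. The paper explores several mining algorithms and this summary does not say which one it settled on, so the strategy, the function names, the Euclidean distance, and the margin values are all assumptions made for illustration.

```python
import numpy as np

def pairwise_distances(z):
    """Euclidean distance between every pair of rows in z (shape [n, d])."""
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def batch_hard_triplet_loss(z, labels, margin=1.0):
    """Mean batch-hard triplet loss (hypothetical helper, not the paper's code)."""
    labels = np.asarray(labels)
    d = pairwise_distances(z)
    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(len(labels), dtype=bool)  # same class, not the anchor itself
    neg_mask = ~same                                    # different class
    # Hardest positive: the farthest same-class sample for each anchor.
    hardest_pos = np.where(pos_mask, d, -np.inf).max(axis=1)
    # Hardest negative: the closest other-class sample for each anchor.
    hardest_neg = np.where(neg_mask, d, np.inf).min(axis=1)
    # Only anchors with at least one positive and one negative form a triplet.
    valid = pos_mask.any(axis=1) & neg_mask.any(axis=1)
    if not valid.any():
        return 0.0
    losses = np.maximum(0.0, hardest_pos[valid] - hardest_neg[valid] + margin)
    return float(losses.mean())

# Toy batch: two classes laid out on a line (illustrative values only).
z = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
labels = [0, 0, 1, 1]
print(batch_hard_triplet_loss(z, labels, margin=1.0))  # 0.0, classes already separated by the margin
print(batch_hard_triplet_loss(z, labels, margin=2.5))  # 1.0, a larger margin still penalises this layout
```

Semi-hard or batch-all variants would change only how the positive and negative are chosen per anchor; the loss term itself stays the same.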
If this is right
- The method can be deployed in new or small networks that lack extensive labeled attack data.
- It provides a practical middle path between data-hungry supervised classifiers and high-false-positive anomaly detectors.
- Ablation studies on mining strategies, distance metrics, and inference choices allow systematic tuning for particular traffic distributions.
- The same architecture supports both binary detection of any malicious traffic and multiclass identification of specific attack types.
Where Pith is reading between the lines
- The embedding approach could be applied to other data-scarce security tasks such as malware or phishing detection with minimal modification.
- Evaluating the model on live traffic streams rather than static benchmark files would test whether embedding stability holds under concept drift.
- Pre-training the triplet network on large public traffic corpora before fine-tuning on ten local samples might further lower the data threshold.
Load-bearing premise
Embeddings learned from the triplet network remain sufficiently clustered for nearest-neighbor separation even when only ten malicious samples per class are supplied during training.
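To make the premise concrete, the sketch below classifies query embeddings by majority vote among their nearest training embeddings. It assumes Euclidean distance and k = 3; the paper compares inference algorithms and distance metrics, and this summary does not state the final choices. The function name `knn_predict` and the toy coordinates are hypothetical.

```python
import numpy as np
from collections import Counter

def knn_predict(train_z, train_labels, query_z, k=3):
    """Label each query embedding by majority vote among its k nearest training embeddings."""
    train_z = np.asarray(train_z, dtype=float)
    query_z = np.asarray(query_z, dtype=float)
    preds = []
    for q in query_z:
        dists = np.linalg.norm(train_z - q, axis=1)      # Euclidean distance to every training point
        nearest = np.argsort(dists, kind="stable")[:k]   # indices of the k closest points
        votes = Counter(train_labels[i] for i in nearest)
        preds.append(votes.most_common(1)[0][0])
    return preds

# Two tight clusters standing in for well-separated embeddings (illustrative values).
train_z = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_labels = ["benign"] * 3 + ["dos"] * 3
print(knn_predict(train_z, train_labels, [[0.2, 0.1], [5.5, 5.2]]))  # ['benign', 'dos']
```

If the clusters overlapped, the neighbour sets would mix labels and the vote would become unreliable, which is the failure mode the premise rules out.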
What would settle it
The central claim would be falsified if, under the same evaluation protocol and on the same datasets, KNN accuracy on the learned embeddings fell well below the levels reported for competing few-shot methods when training is restricted to ten samples per class.
Original abstract
Network intrusion detection systems play a vital role in protecting networks by detecting malicious network traffic which can then be investigated by a cybersecurity operations centre. State-of-the-art approaches utilise supervised machine learning methods to train a classification model to recognise known cyberattacks; however, these models require a large labelled dataset to train and show poor performance when trained on smaller datasets. In an attempt to address this shortcoming, anomaly detection models learn the distribution of benign traffic and flag non-conforming traffic as malicious. While these methods do not require malicious examples to train, they suffer from high false-positive rates rendering them impractical. As a result, networks may be particularly vulnerable when there are insufficient labelled instances of a specific attack class to train an effective classifier. This often occurs in newly established networks or when previously unseen types of attacks emerge. To address this challenge, this work proposes the use of a triplet network, utilising online triplet mining and a KNN classifier, which is able to perform few-shot classification, enabling effective intrusion detection after being trained on a limited number of malicious examples. Various online triplet mining algorithms were explored and model design choices, such as the inference algorithm and optimised distance metrics, were compared and evaluated through a series of ablation studies. The final model was compared against other state-of-the-art approaches in few-shot binary and multiclass classification, where the proposed approach was found to be competitive with existing methods when trained on as little as 10 malicious samples of each class.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a triplet network architecture that employs online triplet mining to learn embeddings from network traffic features, followed by a KNN classifier for few-shot intrusion detection. It explores multiple online mining strategies, performs ablation studies on design choices including inference algorithms and distance metrics, and reports that the final model achieves competitive performance against state-of-the-art few-shot methods on both binary and multiclass tasks when trained with as few as 10 malicious samples per class.
Significance. If the empirical results are reproducible and free of data leakage, the work would offer a practical advance for network intrusion detection in low-data regimes, such as newly deployed networks or zero-day attacks, where conventional supervised classifiers fail and anomaly detectors produce excessive false positives. The combination of metric learning via online triplet mining with KNN provides a concrete, trainable alternative to purely unsupervised approaches.
major comments (2)
- [§4 and §5] §4 (Experimental Setup) and §5 (Results): The central claim of competitiveness with only 10 malicious samples per class rests on reported ablation studies and final comparisons, yet the manuscript provides no description of the exact datasets (e.g., CICIDS2017, UNSW-NB15), feature preprocessing steps, train/test split methodology, or any checks for temporal or feature leakage between the few-shot malicious samples and the evaluation set. Without these details the reported gains cannot be verified or attributed to the triplet mining rather than dataset artifacts.
- [§3.2 and §5.3] §3.2 (Online Triplet Mining) and §5.3 (Few-shot Ablations): With only 10 samples per attack class, the number of distinct valid (anchor, positive, negative) triplets per batch is combinatorially small. The paper does not report the effective number of unique triplets seen during training, the batch construction strategy, or any analysis of embedding stability across different random selections of the 10 samples. This leaves the key assumption, that the learned metric produces KNN-separable clusters, untested against the skeptical concern that repeated reuse of the same limited pairs may yield brittle embeddings.
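A quick count puts numbers on the triplet concern. The sketch below counts ordered (anchor, positive, negative) triplets for given per-class sample counts; the class sizes are hypothetical and not taken from the paper.

```python
def anchor_positive_pairs(n):
    """Ordered (anchor, positive) pairs available within one class of n samples."""
    return n * (n - 1)

def count_valid_triplets(class_sizes):
    """Ordered (anchor, positive, negative) triplets across all classes.

    For each class of size n there are n * (n - 1) anchor-positive pairs,
    and each pair can be combined with any sample from another class.
    """
    total = sum(class_sizes)
    return sum(anchor_positive_pairs(n) * (total - n) for n in class_sizes)

print(anchor_positive_pairs(10))         # 90
print(count_valid_triplets([10, 10]))    # 1800
print(count_valid_triplets([10, 1000]))  # 10080000
```

Adding benign samples inflates the total triplet count, but the number of malicious anchor-positive pairs stays fixed at 90 per class, so the same pairs recur in every malicious-anchored triplet.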
minor comments (2)
- [Table 2, Figure 3] Table 2 and Figure 3: axis labels and legend entries use inconsistent abbreviations for the mining strategies (e.g., “semi-hard” vs. “SH”); standardize notation for readability.
- [§2] §2 (Related Work): the discussion of prior few-shot NIDS papers omits recent metric-learning baselines that also combine triplet loss with KNN; adding 2–3 citations would strengthen the positioning.
Simulated Author's Rebuttal
We thank the referee for their thorough review and valuable suggestions. We have carefully considered the comments and revised the manuscript to provide the requested details and analyses. Our responses to the major comments are as follows.
-
Referee: [§4 and §5] §4 (Experimental Setup) and §5 (Results): The central claim of competitiveness with only 10 malicious samples per class rests on reported ablation studies and final comparisons, yet the manuscript provides no description of the exact datasets (e.g., CICIDS2017, UNSW-NB15), feature preprocessing steps, train/test split methodology, or any checks for temporal or feature leakage between the few-shot malicious samples and the evaluation set. Without these details the reported gains cannot be verified or attributed to the triplet mining rather than dataset artifacts.
Authors: We agree with the referee that additional details are necessary for reproducibility and to rule out potential data artifacts. In the revised manuscript, we have substantially expanded Section 4 (Experimental Setup) to include: (1) complete descriptions of the CICIDS2017 and UNSW-NB15 datasets, including the specific attack classes used and sample counts; (2) detailed feature preprocessing steps, such as one-hot encoding for categorical features, normalization using training set statistics, and removal of irrelevant features; (3) the train/test split methodology, which employs a temporal split based on timestamps to prevent leakage; and (4) explicit verification steps confirming no temporal or feature overlap between the selected few-shot malicious samples and the evaluation set. These revisions allow the results to be properly verified and attributed to the online triplet mining approach.
revision: yes
-
Referee: [§3.2 and §5.3] §3.2 (Online Triplet Mining) and §5.3 (Few-shot Ablations): With only 10 samples per attack class, the number of distinct valid (anchor, positive, negative) triplets per batch is combinatorially small. The paper does not report the effective number of unique triplets seen during training, the batch construction strategy, or any analysis of embedding stability across different random selections of the 10 samples. This leaves the key assumption, that the learned metric produces KNN-separable clusters, untested against the skeptical concern that repeated reuse of the same limited pairs may yield brittle embeddings.
Authors: We acknowledge this valid concern regarding the limited sample regime. In the revised version, we have updated Section 3.2 to describe the batch construction strategy in detail: all 10 malicious samples per class are included in each training batch along with a larger number of benign samples, and online mining is performed within these batches. We now report the effective number of unique triplets generated per epoch for each mining strategy (e.g., semi-hard, hard). Furthermore, in Section 5.3, we have added an analysis of embedding stability: the model was retrained five times using different random selections of the 10 samples per class, and we report the mean and standard deviation of the few-shot classification accuracy. The low variance observed supports that the learned metric produces stable, KNN-separable clusters rather than brittle embeddings dependent on specific sample choices.
revision: yes
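The stability analysis described in the last response can be sketched as a small harness: draw a fresh set of 10 samples per malicious class, retrain, score, and summarise over five repeats. Everything here is hypothetical scaffolding; `train_and_score` stands in for the full train-then-evaluate pipeline, which this summary does not specify, and the seeds and class names are placeholders.

```python
import numpy as np

def sample_few_shot(labels, few_shot_classes, shots, rng):
    """Pick `shots` random indices, without replacement, for each few-shot class."""
    labels = np.asarray(labels)
    picked = []
    for cls in few_shot_classes:
        candidates = np.flatnonzero(labels == cls)
        picked.extend(rng.choice(candidates, size=shots, replace=False).tolist())
    return sorted(picked)

def stability_report(labels, few_shot_classes, train_and_score,
                     shots=10, repeats=5, seed=0):
    """Retrain on `repeats` different few-shot draws; return (mean, std) of the scores."""
    scores = []
    for r in range(repeats):
        rng = np.random.default_rng(seed + r)  # a different draw for each repeat
        subset = sample_few_shot(labels, few_shot_classes, shots, rng)
        scores.append(train_and_score(subset))
    return float(np.mean(scores)), float(np.std(scores))

# Placeholder pool: 100 benign flows and 40 flows for each of two attack classes.
labels = ["benign"] * 100 + ["dos"] * 40 + ["probe"] * 40
subset = sample_few_shot(labels, ["dos", "probe"], 10, np.random.default_rng(0))
print(len(subset))  # 20
```

A small standard deviation across draws is what the authors cite as evidence against brittle embeddings; a large one would indicate dependence on which 10 samples were picked.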
Circularity Check
No circularity: empirical method with experimental validation
The paper describes a triplet network architecture with online mining strategies, followed by KNN classification for few-shot intrusion detection. All claims rest on ablation studies and benchmark comparisons using standard datasets and metrics. No equations, derivations, or predictions are presented that reduce by construction to fitted inputs or self-referential definitions. The approach is self-contained against external benchmarks, with performance evaluated directly on held-out test data rather than through tautological reuse of training statistics.
Axiom & Free-Parameter Ledger
free parameters (2)
- triplet margin
- embedding dimension
axioms (1)
- domain assumption: Network traffic samples can be represented in a metric space where Euclidean or similar distances reflect semantic similarity for benign vs malicious classes.
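Which distance counts as "Euclidean or similar" matters because different metrics can disagree about how close two embeddings are. The sketch below compares three common choices on toy vectors. The paper evaluates distance metrics, but this summary does not list them, so the three shown here are assumptions.

```python
import numpy as np

def euclidean(u, v):
    """Straight-line (L2) distance."""
    return float(np.linalg.norm(u - v))

def manhattan(u, v):
    """Sum of absolute coordinate differences (L1)."""
    return float(np.abs(u - v).sum())

def cosine_distance(u, v):
    """One minus cosine similarity; ignores vector magnitude."""
    return float(1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v)))

u = np.array([1.0, 0.0])
v = np.array([2.0, 0.0])  # same direction as u, twice the length
print(euclidean(u, v), manhattan(u, v), cosine_distance(u, v))  # 1.0 1.0 0.0
```

Two flows whose embeddings differ only in scale are identical under cosine distance but separated under Euclidean distance, so the metric choice changes which neighbours KNN sees.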
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation washburn_uniqueness_aczel
  unclear: relation between the paper passage and the cited Recognition theorem.
  Paper passage: triplet loss function... $L_{TL}(x_a, x_p, x_n) = \max\{0,\ d(z_a, z_p) - d(z_a, z_n) + m\}$
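The quoted triplet loss, max{0, d(z_a, z_p) − d(z_a, z_n) + m}, can be checked by hand for a single triplet. The sketch below assumes d is Euclidean distance and uses made-up embeddings and margins; the function name `triplet_loss` is hypothetical.

```python
import numpy as np

def triplet_loss(z_a, z_p, z_n, margin):
    """Loss for one (anchor, positive, negative) triplet of embeddings."""
    d_ap = np.linalg.norm(np.asarray(z_a, dtype=float) - np.asarray(z_p, dtype=float))
    d_an = np.linalg.norm(np.asarray(z_a, dtype=float) - np.asarray(z_n, dtype=float))
    return max(0.0, float(d_ap - d_an + margin))

# d(anchor, positive) = 5 and d(anchor, negative) = 10 for these points.
print(triplet_loss([0, 0], [3, 4], [6, 8], margin=2.0))  # 0.0, negative is already farther by more than m
print(triplet_loss([0, 0], [3, 4], [6, 8], margin=7.0))  # 2.0, gap of 5 falls short of m = 7
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin m, so such triplets contribute no gradient.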
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.