
arxiv: 2605.17960 · v1 · pith:PGQRBXDTnew · submitted 2026-05-18 · 💻 cs.CR

From Detection to Response: A Deep Learning and Retrieval-Augmented Generation Framework for Network Intrusion Mitigation

Pith reviewed 2026-05-20 10:14 UTC · model grok-4.3

classification 💻 cs.CR
keywords intrusion detection · deep neural networks · retrieval-augmented generation · network security · mitigation reports · DoS · DDoS · CICIDS2018

The pith

The framework uses an ensemble of deep neural networks to classify intrusions at up to 99.84 percent accuracy and then applies retrieval-augmented generation to produce structured mitigation reports from the top anomalous features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to close the gap between merely detecting network intrusions and telling a security analyst what concrete steps to take next. It builds a two-stage system in which three separate deep neural networks first label traffic flows as benign, DoS, or DDoS and surface the top five anomalous features. A retrieval-augmented generation stage then pulls relevant guidance from a curated knowledge base of authorized sources and directs a local language model to assemble citation-backed response reports. If the approach holds, analysts would move from raw alerts to ready-to-use, sourced instructions without leaving the protected environment. The work therefore targets the practical shortfall that high detection scores alone have not solved.

Core claim

The authors present a unified end-to-end framework in two stages. The first stage is an ensemble of three independently trained binary deep neural networks that classify network flows and isolate the top-five anomalous features. The second stage constructs explanation-aware prompts, retrieves the most relevant mitigation guidance from a knowledge base of authorized sources, and directs a locally deployed language model to synthesize structured, citation-grounded reports. The resulting reports outperform vanilla large-language-model outputs on automated metrics, while the classifiers reach 99.84 percent accuracy on CICIDS2018 and 95.30 percent on UNSW-NB15.

What carries the argument

The two-stage pipeline in which the detection ensemble surfaces the top-five anomalous features that are then used by the retrieval-augmented generation component to query a curated knowledge base and synthesize citation-grounded mitigation reports.
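The hand-off from detection to retrieval can be made concrete with a minimal sketch. It is not the authors' implementation. The summary does not say how the three binary outputs are fused or how features are ranked, so arg-max fusion and absolute z-scores against benign traffic are assumptions here, and every feature name and number is a made-up placeholder.

```python
from statistics import mean, pstdev

LABELS = ("Benign", "DoS", "DDoS")


def combine_binary_scores(scores):
    """Return the label whose one-vs-rest detector is most confident.

    `scores` maps each label to the output of its binary network.
    Arg-max fusion is an assumption; the source does not state the rule.
    """
    return max(LABELS, key=lambda label: scores[label])


def top_k_anomalous(flow, benign_rows, k=5):
    """Rank features by absolute z-score against benign reference flows.

    Z-scores are a stand-in for whatever attribution method the paper uses.
    """
    deviations = {}
    for name, value in flow.items():
        column = [row[name] for row in benign_rows]
        mu, sigma = mean(column), pstdev(column)
        # A constant benign column carries no scale, so score it as 0.
        deviations[name] = abs(value - mu) / sigma if sigma > 0 else 0.0
    ranked = sorted(deviations.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical benign reference flows (illustrative values only).
benign_rows = [
    {"flow_duration": 100.0, "fwd_packets": 10.0, "bytes_per_s": 500.0,
     "syn_count": 1.0, "pkt_len_mean": 60.0, "idle_mean": 5.0},
    {"flow_duration": 120.0, "fwd_packets": 12.0, "bytes_per_s": 520.0,
     "syn_count": 2.0, "pkt_len_mean": 64.0, "idle_mean": 6.0},
    {"flow_duration": 110.0, "fwd_packets": 11.0, "bytes_per_s": 510.0,
     "syn_count": 3.0, "pkt_len_mean": 62.0, "idle_mean": 4.0},
]

# Hypothetical suspicious flow and hypothetical detector outputs.
flow = {"flow_duration": 5.0, "fwd_packets": 900.0, "bytes_per_s": 90000.0,
        "syn_count": 400.0, "pkt_len_mean": 61.0, "idle_mean": 5.0}
detector_scores = {"Benign": 0.02, "DoS": 0.31, "DDoS": 0.97}

label = combine_binary_scores(detector_scores)
top_features = top_k_anomalous(flow, benign_rows, k=5)
print(label, [name for name, _ in top_features])
```

The label and the five ranked feature names are the only things the second stage needs from the first, which is why the load-bearing premise below concerns whether five features carry enough signal for retrieval.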

If this is right

  • Security analysts receive structured, explanation-aware reports with citations rather than isolated alerts.
  • The system can run locally, keeping sensitive network data inside the protected perimeter.
  • Classification performance remains high across two independent benchmark datasets.
  • Automated quality metrics for the generated reports improve over standard large-language-model outputs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same pipeline could be wired into existing security information and event management platforms to shorten the time from alert to first response action.
  • Expanding the knowledge base to cover additional attack categories would allow the framework to handle a wider range of threats without retraining the detection stage.
  • Live deployment on operational networks would expose whether the top-five-feature retrieval still suffices when traffic patterns drift from the training distributions.

Load-bearing premise

The knowledge base assembled from authorized sources contains sufficiently complete, current, and semantically relevant mitigation guidance that can be reliably retrieved and synthesized from only the top-five anomalous features for any detected intrusion.

What would settle it

If independent evaluation on the same or new traffic traces shows detection accuracy dropping below 90 percent or if side-by-side expert review finds the RAG-generated reports no more useful, accurate, or complete than direct outputs from an unaugmented language model, the claimed advantage of the combined system would be refuted.

Figures

Figures reproduced from arXiv: 2605.17960 by Md Navid Bin Islam, Sajal Saha, Senior Member (IEEE).

Figure 1: Proposed end-to-end framework for intrusion detection and RAG-enhanced mitigation generation. The architecture integrates DNN-based attack …
Figure 2: ROC Curves for CSE-CIC-IDS2018 Dataset
Figure 3: ROC Curves for UNSW-NB15 Dataset
  Text captured with this caption (truncated in the source): Section D, Model Training and Validation Analysis. Figs. 4 and 5 show that all three classifiers converge smoothly with minimal train/validation divergence. Table VIII, Retrieval Performance and Knowledge Grounding: Precision@5 0.91 · Recall@5 0.84 · Mean Reciprocal Rank (MRR) 0.87 · Retrieval Success Rate 0.97 · Average Retrieval Latency 1.8 s …
Figure 4: Validation Accuracy Progression for Benign, DoS, and DDoS Classifiers
Figure 5: Training and Validation Loss for Benign, DoS, and DDoS Classifiers
Figure 6: Performance Quality
  Diagram text captured with this caption: Core LLM Reasoning (synthesizes classification output, feature indicators, and retrieved knowledge chunks) · Analytical Summary (key indicators, anomaly patterns, and supporting evidence) · Feature Vector Summary (feature values and interpretation of security relevance) · Conclusion (final determination, Benign / DoS / DDoS, with confidence-aligned assessment)
Figure 7: Structure of the explanation generated by the LLM.
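The text captured with the Figure 3 caption lists Precision@5, Recall@5, and Mean Reciprocal Rank for the retrieval stage. As a reading aid, the sketch below shows the standard definitions of those three metrics. Whether the paper uses these exact definitions is an assumption, and the chunk identifiers and relevance sets are invented for illustration.

```python
def precision_at_k(retrieved, relevant, k=5):
    """Fraction of the top-k retrieved chunks that are relevant."""
    hits = sum(1 for chunk in retrieved[:k] if chunk in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k=5):
    """Fraction of all relevant chunks that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for chunk in retrieved[:k] if chunk in relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant chunk, or 0 if none is retrieved."""
    for rank, chunk in enumerate(retrieved, start=1):
        if chunk in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) query pairs."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


# Two hypothetical queries: ranked chunk ids and the ids judged relevant.
runs = [
    (["c3", "c1", "c9", "c4", "c7"], {"c1", "c4", "c8"}),
    (["c2", "c5", "c6", "c0", "c1"], {"c2"}),
]

p_first = precision_at_k(*runs[0])   # 2 relevant among 5 retrieved
r_first = recall_at_k(*runs[0])      # 2 of 3 relevant chunks found
mrr = mean_reciprocal_rank(runs)     # mean of 1/2 and 1/1
print(p_first, r_first, mrr)
```

All three metrics depend on who labels a chunk as relevant, which is one reason the referee report asks for details on knowledge-base construction and coverage.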

Machine-learning-based Intrusion Detection Systems (IDS) have achieved impressive accuracy in classifying network attacks, yet they consistently fall short on the question that matters most to a security analyst: what should I do next? This paper presents a unified, end-to-end framework that closes the gap between threat detection and actionable response. The system operates in two tightly coupled stages. First, an ensemble of three independently trained binary Deep Neural Networks (DNNs) classifies network traffic flows as Benign, Denial of Service (DoS), or Distributed Denial of Service (DDoS), achieving 99.84% accuracy on the CICIDS2018 dataset and 95.30% on the UNSW-NB15 dataset. Second, a Retrieval-Augmented Generation (RAG) pipeline constructs explanation-aware prompts from the top-5 anomalous features, retrieves the most semantically and lexically relevant guidance from a knowledge base derived from authorized sources and di- rects a locally deployed language model to synthesise structured, citation-grounded mitigation reports. The RAG-enhanced reports outperform vanilla LLM outputs across all automated evaluation metrics.
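The abstract describes retrieval that is both semantically and lexically relevant. One common way to get that behaviour is to fuse a lexical score such as BM25 with an embedding similarity, sketched below. The fusion weight `alpha`, the min-max normalisation, and the bag-of-words `toy_embed` function (a stand-in for a real sentence-embedding model) are all assumptions, and the knowledge-base snippets are invented placeholders rather than text from any real source.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 lexical score of each document for the query."""
    tokenized = [tokenize(doc) for doc in docs]
    n = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for other in tokenized if term in other)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def toy_embed(text):
    """Stand-in for a dense embedding model: plain term counts."""
    return Counter(tokenize(text))


def cosine(u, v):
    dot = sum(u[term] * v[term] for term in u)  # missing terms count as 0
    norm = math.sqrt(sum(x * x for x in u.values())) * \
        math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def min_max(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def hybrid_rank(query, docs, alpha=0.5, top_n=2):
    """Fuse normalised semantic and lexical scores; return the top_n docs."""
    lexical = min_max(bm25_scores(query, docs))
    q_vec = toy_embed(query)
    semantic = min_max([cosine(q_vec, toy_embed(doc)) for doc in docs])
    fused = [alpha * s + (1 - alpha) * l for s, l in zip(semantic, lexical)]
    order = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return [(docs[i], fused[i]) for i in order[:top_n]]


# Invented knowledge-base snippets and a query built from a label plus features.
kb_chunks = [
    "rate limit inbound syn packets and enable syn cookies on edge devices",
    "rotate administrator passwords and review account lockout policy",
    "divert volumetric ddos traffic to a scrubbing service and rate limit by source",
]
query = "ddos flow with high syn count and high packet rate"

results = hybrid_rank(query, kb_chunks)
for chunk, score in results:
    print(round(score, 3), chunk)
```

In a real deployment the retrieved chunks would then be placed into the prompt for the locally deployed language model, together with their source identifiers so the report can cite them.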

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper presents a two-stage end-to-end framework for network intrusion mitigation. An ensemble of three independently trained binary DNNs first classifies traffic flows as Benign, DoS, or DDoS, reporting 99.84% accuracy on CICIDS2018 and 95.30% on UNSW-NB15. A RAG pipeline then constructs prompts from the top-5 anomalous features, retrieves guidance from a knowledge base derived from authorized sources, and directs a locally deployed LLM to synthesize structured, citation-grounded mitigation reports that outperform vanilla LLM baselines on automated metrics.

Significance. If the empirical results hold under scrutiny, the work addresses a practical gap between high-accuracy detection and actionable response guidance for security analysts. Credit is due for the direct comparison against vanilla LLM baselines and the use of public benchmark datasets. The absence of training protocols, KB construction details, and coverage validation, however, leaves the central claim only moderately supported and limits immediate reproducibility or deployment value.

major comments (2)
  1. [Abstract] Abstract: The reported detection accuracies (99.84% on CICIDS2018, 95.30% on UNSW-NB15) are presented without model architectures, training hyperparameters, dataset splits, loss functions, or statistical significance tests. These omissions are load-bearing for the central detection-stage claim and prevent assessment of whether the ensemble genuinely advances the state of the art.
  2. [Abstract] Abstract (RAG pipeline description): The claim that RAG-enhanced reports outperform vanilla LLM outputs rests on retrieval from a knowledge base using only the top-5 anomalous features. No details are supplied on KB construction, size, update process, coverage testing, or validation that the top-5 features suffice for complete mitigation guidance across attack variants; if coverage is incomplete, the reported metric gains cannot be attributed to reliable grounding rather than LLM synthesis.
minor comments (1)
  1. [Abstract] Abstract contains a typographical hyphenation artifact: 'di- rects' should be 'directs'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the insightful comments on our paper. We believe the suggested additions will enhance the manuscript's clarity and reproducibility. We address each major comment below.

  1. Referee: [Abstract] Abstract: The reported detection accuracies (99.84% on CICIDS2018, 95.30% on UNSW-NB15) are presented without model architectures, training hyperparameters, dataset splits, loss functions, or statistical significance tests. These omissions are load-bearing for the central detection-stage claim and prevent assessment of whether the ensemble genuinely advances the state of the art.

    Authors: We concur that these details are essential for evaluating the detection stage. While the abstract is constrained in length, the full manuscript details the three binary DNN architectures in Section 3, the training hyperparameters, loss functions, and dataset splits in Section 4, along with the ensemble method. We will revise the abstract to incorporate a brief description of the model setup and add statistical significance testing to the results if not already included. This revision will be made to better support the central claims. revision: yes

  2. Referee: [Abstract] Abstract (RAG pipeline description): The claim that RAG-enhanced reports outperform vanilla LLM outputs rests on retrieval from a knowledge base using only the top-5 anomalous features. No details are supplied on KB construction, size, update process, coverage testing, or validation that the top-5 features suffice for complete mitigation guidance across attack variants; if coverage is incomplete, the reported metric gains cannot be attributed to reliable grounding rather than LLM synthesis.

    Authors: We agree that expanding on the RAG components is important. The paper indicates that the knowledge base is derived from authorized sources, but we will provide additional details in the revised manuscript regarding its construction, size, update process, and coverage validation. We will also include discussion or experiments validating that the top-5 anomalous features provide adequate information for generating comprehensive mitigation reports. These changes will help attribute the performance gains more clearly to the RAG approach. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical results on public benchmarks with no self-referential reductions


The paper presents a two-stage empirical framework: an ensemble of DNNs for binary classification of network flows (Benign/DoS/DDoS) evaluated directly on the public CICIDS2018 and UNSW-NB15 datasets, plus a RAG pipeline that retrieves from an external knowledge base derived from authorized sources and compares outputs to vanilla LLM baselines via automated metrics. No equations, derivations, or fitted parameters are described that reduce the reported accuracies or RAG performance to quantities defined by the authors' own inputs or prior self-citations. The central claims rest on external benchmark performance and baseline comparisons, rendering the evaluation chain self-contained against independent data sources.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework depends on the representativeness of the two chosen public datasets for real traffic and on the completeness of an externally curated knowledge base; the top-5 feature selection is a design choice rather than a fitted parameter.

axioms (1)
  • Domain assumption: the CICIDS2018 and UNSW-NB15 datasets are representative of real-world network traffic for training and evaluating intrusion classifiers.
    Invoked when reporting the 99.84% and 95.30% accuracy figures as evidence of system performance.

pith-pipeline@v0.9.0 · 5729 in / 1404 out tokens · 61230 ms · 2026-05-20T10:14:15.075815+00:00 · methodology

