From CVE to CWE: Syscall-Based HIDS Generalisation

Alexander V. Kozachok; Shamil G. Magomedov; Stanislav G. Vyugov

arxiv: 2606.22581 · v1 · pith:AZM3OUEPnew · submitted 2026-06-21 · 💻 cs.CR · cs.AI


Pith reviewed 2026-06-26 10:02 UTC · model grok-4.3


keywords syscall traces · host intrusion detection · CWE generalization · one-class anomaly detection · CVE to CWE transfer · false positive rate calibration · Isolation Forest · One-Class SVM


The pith

Syscall anomaly detectors trained on multiple CVEs sharing a CWE class can detect unseen CVEs in that class for some weakness types but not others.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates whether training one-class anomaly detectors on normal syscall behavior from several CVEs in the same CWE class allows detection of a new CVE in that class. Experiments use six scenarios grouped into CWE-307, CWE-89, and CWE-434, with 66-dimensional feature vectors and detectors calibrated to fixed false positive rates. The combined profile for CWE-307 achieves an F1 score of 0.6976 at 5% FPR, while the other classes show F1 scores of 0.21 or lower. Transfer performance depends primarily on the breadth of the normal training data rather than the shared CWE label. This indicates that CWE-level generalization in HIDS is feasible selectively with existing syscall features.

Core claim

A combined CWE-level normal profile supports detection of an unseen CVE within the same class for CWE-307 broken authentication, reaching F1 = 0.6976 at target FPR = 0.05, but the same approach collapses to F1 <= 0.21 for CWE-89 SQL injection and CWE-434 unrestricted file upload. Cross-CVE transfer is asymmetric and governed by the breadth of the source normal profile rather than the CWE label.

What carries the argument

The combined CWE-level normal profile: normal traces pooled from multiple CVEs in the same class and used to train one-class anomaly detectors on sliding-window syscall feature vectors.

If this is right

Self-detection on the same CVE works reliably across the tested families.
Combining CVEs into a single normal profile improves results only for certain classes like CWE-307.
Transfer success between CVEs is direction-dependent and tied to normal profile breadth.
Feature filtering does not substantially change transferability.
Reporting at calibrated false positive rates is essential for valid comparisons.
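The calibration step behind the last point can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's Algorithm 1: the function names `calibrate_threshold` and `realised_fpr` are hypothetical, higher scores are assumed to mean "more anomalous", and the calibration scores are assumed to come from held-out normal windows only.

```python
import math

def calibrate_threshold(normal_scores, target_fpr):
    """Pick a threshold so that at most `target_fpr` of the normal
    calibration scores lie strictly above it (hypothetical helper)."""
    if not 0.0 < target_fpr < 1.0:
        raise ValueError("target_fpr must be in (0, 1)")
    ordered = sorted(normal_scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one calibration score")
    # Number of normal windows we are willing to flag by mistake.
    allowed = math.floor(target_fpr * n)
    # Everything strictly above this order statistic is flagged.
    return ordered[n - allowed - 1]

def realised_fpr(normal_scores, threshold):
    """Fraction of normal windows flagged at the given threshold."""
    flagged = sum(1 for s in normal_scores if s > threshold)
    return flagged / len(normal_scores)

# Toy calibration set: 100 normal-window scores, target FPR of 5%.
calibration_scores = list(range(100))
threshold = calibrate_threshold(calibration_scores, 0.05)
print(threshold, realised_fpr(calibration_scores, threshold))
```

Fixing the threshold on normal data alone, before any attack window is seen, is what makes F1 values comparable across detectors. A threshold tuned on the test set would report a best case instead.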

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Collecting diverse normal traces could benefit detection more than strict CWE grouping.
Syscall features appear insufficient for reliable generalization in SQL injection and file upload classes.
Operational systems might benefit from prioritizing weakness classes with consistent normal behavior across exploits.
Further tests with additional CVEs per class would clarify the conditions for successful generalization.

Load-bearing premise

The normal syscall profiles collected from the chosen training CVEs are representative of normal behavior for other unseen CVEs in the same CWE class.

What would settle it

A new experiment showing F1 scores below 0.3 for the combined CWE-307 detector when applied to an additional unseen CVE from the same class under identical calibration would falsify the generalization claim for that family.

Figures

Figures reproduced from arXiv: 2606.22581 by Alexander V. Kozachok, Shamil G. Magomedov, Stanislav G. Vyugov.

**Figure 1.** Calibrated one-class CWE detection pipeline. Normal-only training and calibration on the left; target FPR sets the threshold; the decision on the right is binary at window level and supports self, cross-CVE and combined evaluations.

**Figure 2.** Per-CVE detection (left) versus the CWE-level detector studied in this paper (right). The right panel pools the normal profiles of several CVEs that share a CWE class into a single combined detector.

**Figure 3.** Cross-CVE transfer F1 inside each CWE family. The diagonals show self-detection (uncalibrated). The off-diagonal cells show the F1 of an anomaly model fitted on the source CVE and applied to the target CVE.

**Figure 4.** F1 across protocols and CWE families. “Best transfer” is taken over the seven feature sets of …

**Figure 5.** Combined CWE-307 detector under three target FPRs. The realised FPR tracks the target FPR closely, which is the desired behaviour of Algorithm 1. Recall and F1 grow with α, while precision declines smoothly.

**Figure 6.** Two-sample KS distance between the normal-window distributions of the two CWE-307 scenarios. Resource-size and PID-switch features dominate the shifted group; the lseek and pread byte counters are essentially identical in both normals.

**Figure 7.** Effect of feature filters on cross-CVE transfer F1 inside CWE-307. The most aggressive normal-domain stability filter (stable) destroys the strong direction; importance-only top-20 (score_l0p0) preserves most of it.
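The two-sample KS distance used in Figure 6 compares the empirical distributions of one feature across two sets of normal windows. The sketch below is a plain-Python illustration rather than the authors' code; the function name `ks_distance` and the toy values are assumptions made for this example.

```python
from bisect import bisect_right

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the two empirical CDFs (hypothetical helper)."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    distance = 0.0
    # Empirical CDFs are step functions, so the largest gap is attained
    # at one of the observed values.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance

# Toy example: one feature measured on normal windows of two scenarios.
normal_scenario_1 = [1, 2, 3, 4]
normal_scenario_2 = [3, 4, 5, 6]
print(ks_distance(normal_scenario_1, normal_scenario_2))
```

A value near 0 means the feature looks the same in both normal profiles, while a value near 1 means the two profiles barely overlap on that feature.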

Original abstract

Host intrusion detection systems (HIDS) based on system-call traces are typically trained and evaluated against individual Common Vulnerabilities and Exposures (CVE) instances. In operational settings, however, defenders need to recognise new exploits of an already known type of weakness. We empirically examine whether a one-class anomaly detector trained on the normal behaviour of a set of CVEs that share a Common Weakness Enumeration (CWE) class generalises to a different, unseen CVE inside the same class. Using six scenarios drawn from LID-DS-2021 and grouped into three CWE families (CWE-307 broken authentication, CWE-89 SQL injection, CWE-434 unrestricted file upload), we extract a 66-dimensional Peng-Guo-style feature vector per sliding window and train Isolation Forest and SGD One-Class SVM detectors with normal-only thresholds calibrated to fixed target false positive rates. We define and answer four research questions covering self-detection, asymmetric cross-CVE transfer, the value of a combined CWE-level normal profile, and the effect of feature filtering on transferability. The combined CWE-307 detector reaches F1 = 0.6976 at calibration target FPR = 0.05 (precision = 0.8994, recall = 0.5698), whereas CWE-89 and CWE-434 collapse to F1 <= 0.21 under the same protocol. Cross-CVE transfer turns out to be strongly direction-dependent and dominated by the breadth of the source normal profile rather than by the CWE label. We conclude that CWE-level generalisation in HIDS is empirically attainable for some but not all weakness families with current syscall features, and we argue that calibrated FPR is a methodological prerequisite for honest reporting in this setting.
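The headline CWE-307 numbers in the abstract are internally consistent: F1 is the harmonic mean of precision and recall, and the reported precision and recall reproduce the reported F1. A short check follows; the helper name `f1_score` is chosen for this example only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported for the combined CWE-307 detector at target FPR = 0.05.
f1 = f1_score(0.8994, 0.5698)
print(round(f1, 4))  # prints 0.6976
```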

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CWE-level syscall HIDS generalization holds for one family but fails for others and tracks source profile breadth more than the CWE label itself.


The main thing to know is that pooling normal traces from several CVEs in the same CWE produces usable detection on a held-out CVE only for CWE-307 (F1 0.70 at FPR 0.05), while the same protocol collapses for CWE-89 and CWE-434. The authors themselves state that transfer success is driven by how broad the training normal set is rather than by the weakness category.

The work runs a direct empirical check on six LID-DS-2021 scenarios grouped into those three CWE families. It trains Isolation Forest and SGD One-Class SVM on 66-dimensional Peng-Guo features from sliding windows, calibrates thresholds to fixed target FPRs, and answers four questions on self-detection, asymmetric transfer, pooled CWE profiles, and feature filtering. Reporting performance at calibrated FPR rather than best-case thresholds is a clear methodological improvement over much of the prior HIDS literature.
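To make the sliding-window step concrete, the sketch below turns a syscall trace into fixed-length vectors. It is a deliberately simplified stand-in: the paper's 66-dimensional Peng-Guo-style features are not specified here, so this example uses plain per-window syscall counts, and the names `sliding_windows`, `count_features`, the window size and the vocabulary are all assumptions for illustration.

```python
from collections import Counter

def sliding_windows(trace, size, step):
    """Yield consecutive windows of `size` syscalls, advancing by `step`."""
    for start in range(0, len(trace) - size + 1, step):
        yield trace[start:start + size]

def count_features(window, vocabulary):
    """Bag-of-syscalls vector: one count per name in `vocabulary`.
    Syscalls outside the vocabulary are ignored (assumed behaviour)."""
    counts = Counter(window)
    return [counts[name] for name in vocabulary]

# Toy trace and vocabulary, both hypothetical.
trace = ["open", "read", "read", "write", "close", "open", "read"]
vocabulary = ["open", "read", "write", "close"]
features = [count_features(w, vocabulary)
            for w in sliding_windows(trace, size=4, step=2)]
print(features)
```

Each resulting vector is what a one-class detector would score. A richer extractor would add argument-level and timing features, but the windowing logic stays the same.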

The soft spot is that the paper gives no data on intra-CWE variance in normal profiles or on the criteria used to pick which CVEs go into the training set versus the test CVE. The direction-dependence they report already shows that CWE membership alone does not guarantee a shared normal distribution, so the representativeness assumption stays untested. Without those details the claim that CWE grouping enables generalization rests on a small and possibly non-representative sample.

The paper is useful for researchers who build or evaluate syscall HIDS and want concrete numbers on the limits of cross-CVE transfer. It deserves referee time because the experimental design is straightforward, the metrics are reported honestly at fixed FPR, and the negative results for two families are stated plainly. A revision that adds profile variance statistics and explicit CVE selection rationale would make the contribution stronger, but the current version is already worth reviewing.

Referee Report

2 major / 2 minor

Summary. The manuscript empirically examines whether one-class anomaly detectors (Isolation Forest, SGD One-Class SVM) trained on normal syscall profiles from multiple CVEs sharing a CWE class can generalize to an unseen CVE in the same class. Using six scenarios from LID-DS-2021 grouped into CWE-307, CWE-89, and CWE-434, 66-dimensional Peng-Guo-style features are extracted per sliding window; detectors are trained on normal-only data with thresholds calibrated to fixed target FPRs. Four research questions address self-detection, asymmetric cross-CVE transfer, the value of combined CWE-level profiles, and feature filtering. Key result: the combined CWE-307 detector reaches F1=0.6976 (precision=0.8994, recall=0.5698) at target FPR=0.05, while CWE-89 and CWE-434 yield F1<=0.21 under the same protocol; transfer is direction-dependent and dominated by source-profile breadth rather than CWE label. The authors conclude CWE-level generalization is attainable for some but not all families under current features.

Significance. If the empirical findings hold after verification of representativeness, the work demonstrates that CWE grouping can support generalization in syscall HIDS for certain weakness families when source normal profiles are sufficiently broad, while providing a cautionary example for others. It contributes concrete, FPR-calibrated performance numbers on held-out CVEs and stresses calibrated FPR as a methodological requirement for honest reporting. This could guide practical detector design in operational settings where new exploits of known weakness types must be caught.

major comments (2)

[Experimental setup and research questions] The central claim of CWE-class generalization rests on the assumption that normal profiles from the selected training CVEs within each CWE are representative of the class. However, no details are provided on CVE selection criteria within families, intra-CWE profile variance, or any test confirming the held-out CVE lies within the support of the training normal distribution. This assumption is load-bearing, especially given the abstract's own observation that transfer is dominated by source-profile breadth rather than CWE label.
[Results (performance tables and RQ answers)] The reported F1, precision, and recall values (e.g., CWE-307 combined detector at target FPR=0.05) are presented without statistical significance tests, confidence intervals, or explicit description of data splits and cross-validation. This makes it impossible to assess whether observed differences across CWE families are reliable, directly affecting verification of the claim that generalization succeeds for CWE-307 but collapses for the others.

minor comments (2)

[Methods] The 66-dimensional feature vector is described as 'Peng-Guo-style' but would benefit from an explicit reference or brief definition in the methods to aid reproducibility.
[Experimental design] Clarify whether the same six scenarios are used across all four research questions or if subsets are employed for specific transfer experiments.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that additional transparency on CVE selection and statistical reporting will strengthen the manuscript and will revise accordingly. Point-by-point responses to the major comments follow.


Referee: [Experimental setup and research questions] The central claim of CWE-class generalization rests on the assumption that normal profiles from the selected training CVEs within each CWE are representative of the class. However, no details are provided on CVE selection criteria within families, intra-CWE profile variance, or any test confirming the held-out CVE lies within the support of the training normal distribution. This assumption is load-bearing, especially given the abstract's own observation that transfer is dominated by source-profile breadth rather than CWE label.

Authors: We acknowledge that the paper does not explicitly detail CVE selection criteria or provide intra-CWE variance statistics. The six scenarios were the complete set available in LID-DS-2021 that map to the three studied CWE classes, with assignment following the dataset's official CWE labels. No formal test of distributional support was performed. The empirical results, particularly the strong dependence on source-profile breadth, already illustrate the limits of the CWE label as a predictor. We will add a new subsection describing the selection process, basic variance metrics across normal profiles within each CWE, and an explicit discussion of the representativeness assumption as a limitation. revision: yes
Referee: [Results (performance tables and RQ answers)] The reported F1, precision, and recall values (e.g., CWE-307 combined detector at target FPR=0.05) are presented without statistical significance tests, confidence intervals, or explicit description of data splits and cross-validation. This makes it impossible to assess whether observed differences across CWE families are reliable, directly affecting verification of the claim that generalization succeeds for CWE-307 but collapses for the others.

Authors: We agree that the lack of confidence intervals and formal tests reduces interpretability. The experiments follow the fixed scenario splits provided by LID-DS-2021; cross-validation is not applicable given the small number of CVEs per CWE. We will augment all performance tables with bootstrap 95% confidence intervals computed over the test windows and add a limitations paragraph noting that formal significance testing between CWE families is under-powered with only three groups. These additions will allow readers to better gauge the reliability of the reported differences. revision: yes
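The bootstrap confidence intervals promised in this response could be computed along the following lines. This is a sketch of a percentile bootstrap, not the authors' procedure: the function names, the default of 1000 resamples, and the label convention (1 = attack window, 0 = normal window) are assumptions. It also resamples windows independently, which ignores correlation between windows from the same trace and may understate the interval width.

```python
import random

def f1_from_labels(y_true, y_pred):
    """F1 from binary labels and predictions (1 = attack window)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def bootstrap_f1_ci(y_true, y_pred, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for F1 over test windows."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample window indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(f1_from_labels([y_true[i] for i in idx],
                                    [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((1 - level) / 2 * (n_resamples - 1))]
    upper = stats[int((1 + level) / 2 * (n_resamples - 1))]
    return lower, upper

# Toy labels and predictions, purely illustrative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
print(f1_from_labels(y_true, y_pred), bootstrap_f1_ci(y_true, y_pred))
```

Resampling whole traces instead of individual windows would be a more conservative variant when windows within a trace overlap.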

Circularity Check

0 steps flagged

No significant circularity; purely empirical evaluation with direct measurements


The paper conducts an empirical study training one-class anomaly detectors (Isolation Forest, SGD One-Class SVM) on syscall feature vectors from selected CVEs within CWE families and measuring performance (F1, precision, recall) on held-out CVEs. No derivations, equations, fitted parameters renamed as predictions, or self-citations are load-bearing for the central claims. Reported metrics are direct experimental outcomes on the test splits; the direction-dependent transfer results are likewise measured outcomes rather than constructed by definition. The representativeness assumption is an empirical premise open to falsification by the experiments themselves and does not reduce the reported results to tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that Peng-Guo-style syscall features plus sliding windows are sufficient to separate normal from anomalous behavior within a CWE class; no free parameters are explicitly fitted beyond the choice of target FPR, and no new entities are postulated.

axioms (1)

domain assumption: Syscall traces from different CVEs within the same CWE share enough normal-behavior structure for a single one-class model to generalize.
This premise is required for the cross-CVE transfer experiments to be meaningful and is invoked by grouping the six scenarios into three CWE families.

pith-pipeline@v0.9.1-grok · 5856 in / 1445 out tokens · 33066 ms · 2026-06-26T10:02:03.932758+00:00 · methodology


Reference graph

Works this paper leans on

34 extracted references · 27 canonical work pages

[1] Forrest, S., Hofmeyr, S.A., Somayaji, A., Longstaff, T.A.: A sense of self for Unix processes. In: Proc. IEEE Symp. Security and Privacy, pp. 120–128 (1996). doi:10.1109/SECPRI.1996.502675
[2] Hofmeyr, S.A., Forrest, S., Somayaji, A.: Intrusion detection using sequences of system calls. J. Comput. Secur. 6(3), 151–180 (1998). doi:10.3233/JCS-980109
[3] Liao, Y., Vemuri, V.R.: Use of k-nearest neighbor classifier for intrusion detection. Comput. Secur. 21(5), 439–448 (2002). doi:10.1016/S0167-4048(02)00514-X
[4] Kang, D.K., Fuller, D., Honavar, V.: Learning classifiers for misuse and anomaly detection using a bag of system calls representation. In: Proc. IEEE SMC Information Assurance Workshop, pp. 118–125 (2005). doi:10.1109/IAW.2005.1495944
[5] Maggi, F., Matteucci, M., Zanero, S.: Detecting intrusions through system call sequence and argument analysis. IEEE Trans. Depend. Secur. Comput. 7(4), 381–395 (2010). doi:10.1109/TDSC.2008.69
[6] Creech, G., Hu, J.: A semantic approach to host-based intrusion detection systems using contiguous and discontiguous system call patterns. IEEE Trans. Comput. 63(4), 807–819 (2014). doi:10.1109/TC.2013.13
[7] Grimmer, M., Roehling, M.M., Kreusel, D., Rechert, K.: A modern and sophisticated host based intrusion detection data set. In: D-A-CH Security 2019, pp. 135–
[8] Grimmer, M., Kaelble, T., Rucks, F., Pirl, J.: LID-DS 2021 – A modern host-based intrusion detection data set. Mendeley Data, v3 (2021). doi:10.17632/4xj3p3z5kj.3
[9] Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation Forest. In: Proc. ICDM 2008, pp. 413–
[10] IEEE (2008). doi:10.1109/ICDM.2008.17
[11] Schölkopf, B., Platt, J.C., Shawe-Taylor, J., Smola, A.J., Williamson, R.C.: Estimating the support of a high-dimensional distribution. Neural Comput. 13(7), 1443–1471 (2001). doi:10.1162/089976601750264965
[12] Guo, P.: Intrusion detection based on complete system call information. In: Proc. DSAI 2024, pp. 1–5. ACM (2024). doi:10.1145/3677892.3677893
[13] El Khairi, A., Caselli, M., Knierim, C., Peter, A., Continella, A.: Contextualizing system calls in containers for anomaly-based intrusion detection. In: Proc. ACM Cloud Computing Security Workshop (CCSW), pp. 9–21 (2022). doi:10.1145/3560810.3564266
[14] Sommer, R., Paxson, V.: Outside the closed world: on using machine learning for network intrusion detection. In: Proc. IEEE Symp. Security and Privacy, pp. 305–316 (2010). doi:10.1109/SP.2010.25
[15] Tunde-Onadele, O., He, J., Dai, T., Gu, X.: A study on container vulnerability exploit detection. In: Proc. IEEE IC2E, pp. 121–127 (2019). doi:10.1109/IC2E.2019.00026
[16] Lin, Y., Tunde-Onadele, O., Gu, X.: CDL: Classified distributed learning for detecting security attacks in containerized applications. In: Proc. ACSAC, pp. 179–188 (2020). doi:10.1145/3427228.3427236
[17] Lin, Y., Tunde-Onadele, O., Gu, X., He, J., Latapie, H.: SHIL: Self-supervised hybrid learning for security attack detection in containerized applications. In: Proc. IEEE ACSOS, pp. 41–50 (2022). doi:10.1109/ACSOS55765.2022.00022
[18] Tunde-Onadele, O., Lin, Y., Gu, X., He, J., Latapie, H.: A self-supervised machine learning framework for online container security attack detection. ACM Trans. Auton. Adapt. Syst. 19(3), 17 (2024). doi:10.1145/3665795
[19] Suneja, S., Kanso, A., Le, M., Isci, C.: SecQuant: quantifying container security exposure. In: Proc. ESORICS 2022, LNCS 13554, pp. 525–546. Springer (2022). doi:10.1007/978-3-031-17143-7_26
[20] Aghaei, E., Shadid, W., Al-Shaer, E.: ThreatZoom: CVE2CWE using hierarchical neural network. In: Proc. SecureComm 2020, LNICST 335, pp. 23–41. Springer (2020). doi:10.1007/978-3-030-63086-7_2
[21] Das, S.S., Serra, E., Halappanavar, M., Pothen, A., Al-Shaer, E.: V2W-BERT: A framework for effective hierarchical multiclass classification of software vulnerabilities. In: Proc. IEEE DSAA 2021, pp. 1–12 (2021). doi:10.1109/DSAA53316.2021.9564227
[22] Pan, S., Bao, L., Xia, X., Lo, D., Li, S.: Fine-grained commit-level vulnerability type prediction by CWE tree structure. In: Proc. ICSE 2023, pp. 957–969 (2023). doi:10.1109/ICSE48619.2023.00088
[23] Li, L., Ding, S.H.H., Tian, Y., Fung, B.C.M., Charland, P., Ou, W., Song, L., Chen, C.: VulANalyzeR: Explainable binary vulnerability detection with multi-task learning and attentional graph convolution. ACM Trans. Priv. Secur. 26(3), 1–25 (2023). doi:10.1145/3585386
[24] Atiiq, S.A., Gehrmann, C., Dahlen, K., Khalil, K.: From generalist to specialist: exploring CWE-specific vulnerability detection. In: Proc. ARES 2024, pp. 1–12 (2024). doi:10.1145/3664476.3670872
[25] Uddin, M.A., Aryal, S., Bouadjenek, M.R., Al-Hawawreh, M., Talukder, M.A.: Hierarchical classification for intrusion detection system: effective design and empirical analysis. arXiv:2403.13013 (2024). https://arxiv.org/abs/2403.13013
[26] Lopez-Martin, M., Sanchez-Esguevillas, A., Arribas, J.I., Carro, B.: Supervised contrastive learning over prototype-label embeddings for network intrusion detection. Inf. Fusion 79, 200–228 (2022). doi:10.1016/j.inffus.2021.09.014
[27] Bhuyan, M.H., Bhattacharyya, D.K., Kalita, J.K.: Network anomaly detection: methods, systems and tools. IEEE Commun. Surv. Tutor. 16(1), 303–336 (2014). doi:10.1109/SURV.2013.052213.00046
[28] Garcia-Teodoro, P., Díaz-Verdejo, J., Maciá-Fernández, G., Vázquez, E.: Anomaly-based network intrusion detection: techniques, systems and challenges. Comput. Secur. 28(1–2), 18–28 (2009). doi:10.1016/j.cose.2008.08.003
[29] Aslan, Ö., Samet, R.: A comprehensive review on malware detection approaches. IEEE Access 8, 6249–6271 (2020). doi:10.1109/ACCESS.2019.2963724
[30] MITRE Corporation: Common Weakness Enumeration, version 4.15. https://cwe.mitre.org/ (Accessed: 1 May 2026)
[31] MITRE Corporation: CVE Program. https://www.cve.org/ (Accessed: 1 May 2026)
[32] Zhang, J., Wei, F., Hu, X., Yang, B., Xie, F., Liu, S.: MCLDM: multi-channel contrastive learning network for intrusion detection. Comput. Netw. 237, 110083 (2023). doi:10.1016/j.comnet.2023.110083
[33] Canbek, G., Temizel, T.T., Sagiroglu, S.: PToPI: A comprehensive review, analysis, and knowledge representation of binary classification performance measures/metrics. SN Comput. Sci. 4, 13 (2022). doi:10.1007/s42979-022-01409-1
[34] Rahimi, A., Recht, B.: Random features for large-scale kernel machines. In: Adv. Neural Inf. Process. Syst. 20 (NIPS 2007), pp. 1177–1184. MIT Press (2008)
