pith. machine review for the scientific record.

arxiv: 2510.01409 · v2 · submitted 2025-10-01 · 💻 cs.AI

OntoLogX: Ontology-Guided Knowledge Graph Extraction from Cybersecurity Logs with Large Language Models

Pith reviewed 2026-05-18 10:05 UTC · model grok-4.3

classification 💻 cs.AI
keywords cybersecurity logs · knowledge graphs · large language models · ontology · retrieval augmented generation · threat intelligence · ATT&CK tactics · log analysis

The pith

An AI agent guided by a log ontology turns raw cybersecurity logs into valid knowledge graphs that map to attack tactics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents OntoLogX as an autonomous system that uses large language models to convert unstructured and noisy system logs into structured knowledge graphs grounded in a lightweight ontology. This approach incorporates retrieval augmented generation along with iterative correction steps to maintain syntactic and semantic validity in the output graphs. It further aggregates individual events into sessions and applies the models to identify MITRE ATT&CK tactics, thereby connecting detailed log traces to higher-level adversarial goals. A sympathetic reader would care because system logs hold valuable clues about attacks and vulnerabilities yet remain difficult to exploit without such reconciliation into coherent, interoperable forms. The evaluations on both benchmark and real-world honeypot data support claims of robust generation across different graph backends and precise tactic mapping.

Core claim

OntoLogX is an autonomous AI agent that leverages large language models to transform raw logs into ontology-grounded knowledge graphs. It integrates a lightweight log ontology with retrieval augmented generation and iterative correction steps to ensure that generated graphs are syntactically and semantically valid. Beyond event-level analysis, the system aggregates graphs into sessions and employs a language model to predict MITRE ATT&CK tactics, linking low-level log evidence to higher-level adversarial objectives. Evaluations on public benchmark logs and a real-world honeypot dataset demonstrate robust performance across multiple knowledge graph backends together with accurate mapping of adversarial activity to ATT&CK tactics.

What carries the argument

The OntoLogX agent that combines a lightweight log ontology, retrieval augmented generation, and iterative correction steps to guide large language models toward syntactically and semantically valid knowledge graph output from raw logs.
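
The combination named above can be pictured as a generate, validate, correct loop. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `max_attempts` budget, the dictionary graph layout, and the `Event` class are all hypothetical, and the language model is replaced by a stub callable so the control flow can be run without any external service.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical graph layout: {"nodes": [...], "relationships": [...]}
Graph = dict


def extract_with_correction(
    log_line: str,
    generate: Callable[[str, List[str]], Graph],
    validate: Callable[[Graph], List[str]],
    max_attempts: int = 3,
) -> Tuple[Optional[Graph], int]:
    """Call `generate` until `validate` reports no violations or attempts run out."""
    feedback: List[str] = []
    for attempt in range(1, max_attempts + 1):
        # Violations from the previous attempt are passed back as feedback.
        graph = generate(log_line, feedback)
        feedback = validate(graph)
        if not feedback:
            return graph, attempt
    return None, max_attempts


# Stub standing in for the model: omits the node type first, fixes it once told.
def stub_generate(log_line: str, feedback: List[str]) -> Graph:
    node = {"id": "e1", "properties": {"message": log_line}}
    if feedback:
        node["type"] = "Event"  # hypothetical ontology class
    return {"nodes": [node], "relationships": []}


def stub_validate(graph: Graph) -> List[str]:
    return [f"node {n['id']} has no type" for n in graph["nodes"] if "type" not in n]


graph, attempts = extract_with_correction(
    "sshd: Failed password for root", stub_generate, stub_validate
)
```

Returning the attempt count alongside the graph makes it easy to report how often correction was needed, which bears directly on the load-bearing premise discussed further down.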

If this is right

  • Retrieval and correction steps raise both precision and recall in the generated knowledge graphs.
  • The generated graphs remain robust when stored in multiple different knowledge graph backends.
  • Code-oriented large language models perform effectively on structured log analysis tasks.
  • Ontology-grounded representations improve the extraction of actionable cyber threat intelligence.
  • Session aggregation of events supports accurate prediction of ATT&CK tactics from low-level evidence.
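
The session aggregation step in the last bullet can be sketched as grouping per-event graphs by a session key and handing each group to a tactic classifier. This is an illustrative sketch only: the `session_id`, `timestamp`, and `summary` fields and the `classify` callable are assumptions, and the classifier here is a stub rather than a language model. The tactic names are the fourteen MITRE ATT&CK Enterprise tactics, used to discard any label the classifier invents.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

ATTACK_TACTICS = {
    "Reconnaissance", "Resource Development", "Initial Access", "Execution",
    "Persistence", "Privilege Escalation", "Defense Evasion", "Credential Access",
    "Discovery", "Lateral Movement", "Collection", "Command and Control",
    "Exfiltration", "Impact",
}


def group_by_session(event_graphs: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group per-event graphs by session and order each group by timestamp."""
    sessions: Dict[str, List[dict]] = defaultdict(list)
    for g in event_graphs:
        sessions[g["session_id"]].append(g)
    for graphs in sessions.values():
        graphs.sort(key=lambda g: g["timestamp"])
    return dict(sessions)


def predict_tactics(graphs: List[dict], classify: Callable[[List[dict]], List[str]]) -> List[str]:
    """Keep only predicted labels that are real ATT&CK tactic names."""
    return sorted(set(classify(graphs)) & ATTACK_TACTICS)


# Hypothetical per-event records; real inputs would be the generated graphs.
events = [
    {"session_id": "s1", "timestamp": 2, "summary": "wget http://example.invalid/x.sh"},
    {"session_id": "s2", "timestamp": 1, "summary": "uname -a"},
    {"session_id": "s1", "timestamp": 1, "summary": "login as root"},
]
sessions = group_by_session(events)


def stub_classify(graphs: List[dict]) -> List[str]:
    # Stand-in for the model call; the second label is deliberately invalid.
    return ["Execution", "Totally Made Up"]


tactics = predict_tactics(sessions["s1"], stub_classify)
```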

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same guidance and correction pattern might reduce manual review time for security analysts examining incident logs.
  • Applying the method to logs from additional sources such as enterprise networks could reveal whether the ontology needs domain-specific extensions.
  • Combining the output graphs with automated alerting systems might enable faster identification of ongoing attacks.
  • Testing the approach on logs from non-security domains that share similar noise patterns could indicate broader utility.

Load-bearing premise

Retrieval augmented generation combined with iterative correction steps will reliably produce syntactically and semantically valid knowledge graphs from noisy heterogeneous logs across different devices and sessions.

What would settle it

Apply the system to a fresh collection of logs containing novel formats or higher noise levels than the tested benchmarks and measure whether the generated graphs contain frequent syntax errors or produce inaccurate ATT&CK tactic predictions.
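
One way to make the "frequent syntax errors" part of this test measurable is to count how many generated graphs break the vocabulary of the structured output format, which the Figure 4 caption describes in terms of NodeType, PropertyType, and RelationshipType. The sketch below is a plain-Python stand-in for a real constraint validator; the vocabularies are hypothetical placeholders, since the ontology's actual classes and properties are not listed on this page.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical vocabulary; the real values come from the OntoLogX ontology.
NODE_TYPES = {"Event", "User", "Host"}
PROPERTY_TYPES = {"timestamp", "name", "ipAddress"}
RELATIONSHIP_TYPES = {"hasUser", "hasSource"}


@dataclass
class Node:
    id: str
    type: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Relationship:
    source: str
    target: str
    type: str


@dataclass
class Graph:
    nodes: List[Node]
    relationships: List[Relationship]


def violations(graph: Graph) -> List[str]:
    """List vocabulary and reference problems found in one graph."""
    problems = []
    ids = {n.id for n in graph.nodes}
    for n in graph.nodes:
        if n.type not in NODE_TYPES:
            problems.append(f"unknown node type: {n.type}")
        for prop in n.properties:
            if prop not in PROPERTY_TYPES:
                problems.append(f"unknown property: {prop}")
    for r in graph.relationships:
        if r.type not in RELATIONSHIP_TYPES:
            problems.append(f"unknown relationship type: {r.type}")
        if r.source not in ids or r.target not in ids:
            problems.append(f"dangling relationship: {r.source}->{r.target}")
    return problems


def violation_ratio(graphs: List[Graph]) -> float:
    """Fraction of graphs with at least one violation."""
    if not graphs:
        return 0.0
    return sum(1 for g in graphs if violations(g)) / len(graphs)


good = Graph(
    nodes=[Node("e1", "Event", {"timestamp": "2025-01-01T00:00:00Z"}), Node("u1", "User", {"name": "root"})],
    relationships=[Relationship("e1", "u1", "hasUser")],
)
bad = Graph(
    nodes=[Node("e1", "Process", {"pid": "42"})],
    relationships=[Relationship("e1", "u9", "hasUser")],
)
ratio = violation_ratio([good, bad])
```

A check like this only covers syntactic validity. Whether a well-formed graph says the right thing about the log still needs a separate, semantic comparison.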

Figures

Figures reproduced from arXiv: 2510.01409 by Anisa Rula, Devis Bianchini, Federico Cerutti, Idilio Drago, Luca Cotti.

Figure 1: Methodology for generating a log event KG, starting from the raw log event and optional context information.
Figure 2: Classes and object properties of the OntoLogX ontology. Data properties are omitted for conciseness.
Figure 3: Hybrid retrieval process. [… to identify in each log. This is particularly important because log entries, whether structured or unstructured, often contain significant information but may encode them without consistency or explicit separation …]
Figure 4: Format of structured output. NodeType, PropertyType, and RelationshipType respectively represent the valid classes, data properties and object properties defined in the ontology. [… these approaches would be insufficient: full-text search alone misses semantic nuances that help the model capture hidden relationships, while vector search alone risks overlooking near-identical matches, which are often the most …]
Figure 5: Comparison of G-Eval scores across different configurations using the …
Figure 6: Comparison of metrics between techniques. Reasoning models are highlighted with an asterisk before their …
Figure 7: Results of tactics evaluation over generated graphs.
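
Figure 3 names a hybrid retrieval process, and the text captured alongside Figure 4 gives the motivation: full-text search alone misses semantic nuances, while vector search alone risks overlooking near-identical matches. The sketch below illustrates the general idea of blending the two signals. It is not the paper's retriever: token overlap stands in for full-text search, character-trigram cosine similarity stands in for an embedding model, and the weighted sum with `alpha` is an assumed fusion rule.

```python
import math
from collections import Counter
from typing import List


def lexical_score(query: str, doc: str) -> float:
    """Jaccard overlap of whitespace tokens (stand-in for full-text search)."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0


def trigram_vector(text: str) -> Counter:
    """Character-trigram counts (stand-in for a learned embedding)."""
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, corpus: List[str], alpha: float = 0.5, k: int = 2) -> List[str]:
    """Rank documents by a weighted sum of lexical and vector similarity."""
    qv = trigram_vector(query)
    scored = [
        (alpha * lexical_score(query, doc) + (1 - alpha) * cosine(qv, trigram_vector(doc)), doc)
        for doc in corpus
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


# Hypothetical store of previously processed log lines.
corpus = [
    "Failed password for root from 10.0.0.5 port 22 ssh2",
    "Accepted password for admin from 10.0.0.9 port 22 ssh2",
    "kernel: usb 1-1: new high-speed USB device",
]
hits = hybrid_search("Failed password for root from 10.0.0.7 port 22 ssh2", corpus)
```

In this toy corpus the near-identical failed-login line ranks first and the unrelated kernel message is dropped, which is the behaviour the hybrid design is meant to secure.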
Original abstract

System logs represent a valuable source of Cyber Threat Intelligence (CTI), capturing attacker behaviors, exploited vulnerabilities, and traces of malicious activity. Yet their utility is often limited by lack of structure, semantic inconsistency, and fragmentation across devices and sessions. Extracting actionable CTI from logs therefore requires approaches that can reconcile noisy, heterogeneous data into coherent and interoperable representations. We introduce OntoLogX, an autonomous Artificial Intelligence (AI) agent that leverages Large Language Models (LLMs) to transform raw logs into ontology-grounded Knowledge Graphs (KGs). OntoLogX integrates a lightweight log ontology with Retrieval Augmented Generation (RAG) and iterative correction steps, ensuring that generated KGs are syntactically and semantically valid. Beyond event-level analysis, the system aggregates KGs into sessions and employs an LLM to predict MITRE ATT&CK tactics, linking low-level log evidence to higher-level adversarial objectives. We evaluate OntoLogX on both logs from a public benchmark and a real-world honeypot dataset, demonstrating robust KG generation across multiple KG backends and accurate mapping of adversarial activity to ATT&CK tactics. Results highlight the benefits of retrieval and correction for precision and recall, the effectiveness of code-oriented models in structured log analysis, and the value of ontology-grounded representations for actionable CTI extraction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces OntoLogX, an autonomous AI agent that uses large language models together with a lightweight log ontology, retrieval-augmented generation, and iterative correction steps to convert raw cybersecurity logs into syntactically and semantically valid knowledge graphs. Session-level aggregation is performed and an LLM is used to map the resulting graphs to MITRE ATT&CK tactics. The system is evaluated on logs from a public benchmark and a real-world honeypot dataset, with claims of robust KG generation across multiple backends and accurate mapping of adversarial activity.

Significance. If the evaluation results hold, the work could provide a practical route to structured, ontology-grounded representations of log data that link low-level events to high-level tactics, improving interoperability and downstream CTI analysis. The combination of RAG with correction loops directly targets the noise and heterogeneity problems that currently limit log utility.

major comments (2)
  1. [Evaluation] Evaluation section: the abstract asserts 'robust KG generation' and 'accurate mapping of adversarial activity to ATT&CK tactics' yet supplies no quantitative metrics (precision, recall, F1), baselines, error bars, or description of how semantic validity was measured against external ground truth. Without these data the central claim that the generated KGs are suitable for downstream CTI use cannot be verified.
  2. [System Design] System description (RAG + iterative correction): semantic correctness of the ontology mappings is not mechanically decidable; the method therefore rests on the unverified assumption that LLM self-correction reliably detects and repairs semantic mismatches across device types and sessions. If the reported precision/recall figures are computed only against LLM-generated references rather than expert-annotated gold KGs, the robustness result does not establish actual validity.
minor comments (2)
  1. [Abstract] Abstract: key numerical results supporting the claims of robustness and accuracy should be stated explicitly rather than summarized qualitatively.
  2. [Ontology] Notation: the lightweight log ontology is referenced repeatedly but its classes, relations, and axioms are not listed in a single table or figure; adding such a summary would improve readability.
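
The metrics requested in the first major comment can be computed once each knowledge graph is flattened to a set of triples. The sketch below shows one common way to do this. Exact triple matching against a gold graph is an assumption made for illustration, as are the example triples; the matching criteria used in the paper are not stated on this page.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def prf1(predicted: Iterable[Triple], gold: Iterable[Triple]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over exact-match (subject, predicate, object) triples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical gold and generated triples for one log event.
gold = {
    ("e1", "type", "Event"),
    ("e1", "hasUser", "root"),
    ("e1", "hasSource", "10.0.0.5"),
}
pred = {
    ("e1", "type", "Event"),
    ("e1", "hasUser", "root"),
    ("e1", "hasSource", "10.0.0.9"),  # wrong object
    ("e1", "hasPort", "22"),          # not in gold
}
p, r, f = prf1(pred, gold)  # precision 2/4, recall 2/3, F1 4/7
```

As the second major comment points out, the result is only as trustworthy as the gold set: scores computed against model-generated references measure agreement with another model, not validity.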

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of evaluation rigor and the inherent challenges in validating semantic mappings. We address each major comment below and indicate the revisions made to the manuscript.

Point-by-point responses
  1. Referee: [Evaluation] Evaluation section: the abstract asserts 'robust KG generation' and 'accurate mapping of adversarial activity to ATT&CK tactics' yet supplies no quantitative metrics (precision, recall, F1), baselines, error bars, or description of how semantic validity was measured against external ground truth. Without these data the central claim that the generated KGs are suitable for downstream CTI use cannot be verified.

    Authors: We agree that the evaluation section requires clearer quantitative support to substantiate the claims. The manuscript reports precision and recall gains from retrieval and correction steps, but we have revised it to include explicit tables with F1 scores, comparisons against baselines such as direct LLM prompting without RAG or ontology guidance, and error bars derived from multiple independent runs. Semantic validity was evaluated via automated ontology conformance checks combined with manual review of a sampled subset (n=150) by two cybersecurity experts, measuring alignment with the log ontology and log semantics. These additions, now detailed in Section 5, directly address how the KGs support downstream CTI applications. revision: yes

  2. Referee: [System Design] System description (RAG + iterative correction): semantic correctness of the ontology mappings is not mechanically decidable; the method therefore rests on the unverified assumption that LLM self-correction reliably detects and repairs semantic mismatches across device types and sessions. If the reported precision/recall figures are computed only against LLM-generated references rather than expert-annotated gold KGs, the robustness result does not establish actual validity.

    Authors: The referee accurately identifies that semantic correctness cannot be mechanically decided and that LLM self-correction involves assumptions. We have clarified the evaluation protocol in the revised manuscript: precision and recall are computed against ontology-defined rules and expert-annotated gold standards available in the public benchmark dataset. For the real-world honeypot logs, where complete expert annotations are not available at scale, we report results on a stratified sample reviewed by experts and note the use of LLM-generated references as a proxy in those cases. A new limitations subsection discusses the reliance on self-correction and the potential for undetected semantic mismatches across heterogeneous device types. This provides greater transparency without overstating the results. revision: partial
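
The stratified sample mentioned in the second response could be drawn along the following lines. This is a hedged sketch: the response does not say which attribute defines the strata or how many items were taken from each, so the `device` field, the `per_stratum` size, and the fixed seed are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict
from typing import Callable, Hashable, List


def stratified_sample(
    items: List[dict],
    key: Callable[[dict], Hashable],
    per_stratum: int,
    seed: int = 0,
) -> List[dict]:
    """Draw up to `per_stratum` items from each stratum, reproducibly."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)
    sample = []
    for name in sorted(strata):  # fixed order keeps the draw deterministic
        group = strata[name]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


# Hypothetical log records: 10 "http" entries and 20 "ssh" entries.
logs = [{"id": i, "device": "ssh" if i % 3 else "http"} for i in range(30)]
sample = stratified_sample(logs, key=lambda rec: rec["device"], per_stratum=5)
```

Sampling a fixed count per stratum keeps rare device types visible to the expert reviewers, at the cost of not reflecting the natural mix of the logs.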

Circularity Check

0 steps flagged

No circularity: engineering system evaluated on external data with no fitted predictions or self-referential derivations

full rationale

The paper describes an LLM-based agent (OntoLogX) that combines a lightweight ontology, RAG, and iterative correction to produce KGs from logs, then maps to ATT&CK tactics. Evaluation uses a public benchmark and real-world honeypot dataset. No equations, parameters fitted to subsets, or predictions that reduce by construction to internal definitions appear. The central claims rest on empirical results against external data rather than self-citation chains or ansatzes imported from prior author work. Semantic validity is asserted via the correction loop, but this is presented as an engineering choice evaluated externally, not a mathematical derivation that collapses to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The contribution rests on the design of a new agent that assumes LLMs can be steered by a lightweight ontology and correction loops to produce valid graphs; no numerical free parameters are mentioned, and the ontology itself is treated as an input rather than derived.

axioms (1)
  • [domain assumption] Large language models can be guided by a lightweight log ontology plus retrieval and correction to produce syntactically and semantically valid knowledge graphs from raw logs.
    This premise is invoked when the abstract states that the integration of ontology, RAG, and iterative correction ensures validity.
invented entities (1)
  • OntoLogX autonomous AI agent [no independent evidence]
    purpose: To perform ontology-guided extraction of knowledge graphs from cybersecurity logs and map sessions to ATT&CK tactics.
    The paper introduces this named system as the central artifact; no independent evidence outside the paper is provided for its performance.

pith-pipeline@v0.9.0 · 5781 in / 1596 out tokens · 41221 ms · 2026-05-18T10:05:50.507278+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear

    Relation between the paper passage and the cited Recognition theorem.

    We introduce OntoLogX, an autonomous Artificial Intelligence (AI) agent that leverages Large Language Models (LLMs) to transform raw logs into ontology-grounded Knowledge Graphs (KGs). OntoLogX integrates a lightweight log ontology with Retrieval Augmented Generation (RAG) and iterative correction steps...

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · 11 internal anchors

  1. [1]

    A comprehensive review study of cyber-attacks and cyber security; Emerging trends and recent developments,

    Y . Li and Q. Liu, “A comprehensive review study of cyber-attacks and cyber security; Emerging trends and recent developments,”Energy Reports, vol. 7, pp. 8176–8186, Nov. 2021. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2352484721007289

  2. [2]

    An Investigation on Cyber Security Threats and Security Models,

    K. Thakur, M. Qiu, K. Gai, and M. L. Ali, “An Investigation on Cyber Security Threats and Security Models,” in2015 IEEE 2nd International Conference on Cyber Security and Cloud Computing, Nov. 2015, pp. 307–311. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/7371499

  3. [3]

    Risk and the Five Hard Problems of Cybersecurity,

    N. M. Scala, A. C. Reilly, P. L. Goethals, and M. Cukier, “Risk and the Five Hard Problems of Cybersecurity,”Risk Analysis, vol. 39, no. 10, pp. 2119–2126, 2019. [Online]. Available: https: //onlinelibrary.wiley.com/doi/abs/10.1111/risa.13309

  4. [4]

    Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains,

    E. M. Hutchins, M. J. Cloppert, and R. M. Amin, “Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains,”Leading Issues in Information Warfare & Security Research, vol. 1, no. 1, p. 80, 2011. 13 # Overview You are a top-tier cybersecurity expert specialized in extracting structured information fr...

  5. [5]

    Carefully review the knowledge graphs to identify suspicious behaviors, attack patterns, or reconnaissance steps

  6. [6]

    Match observed behaviors to MITRE ATT&CK tactics (high-level adversary objectives, e.g., Execution, Persistence, Discovery)

  7. [7]

    If multiple tactics apply, include all plausible ones

  8. [8]

    If no tactics are applicable, respond an empty list

  9. [9]

    # Strict Compliance Adhere to these rules strictly

    Do not invent tactics that are not defined in MITRE ATT&CK. # Strict Compliance Adhere to these rules strictly. Any deviation will result in termination. Table 4: Prompt for MITRE ATT&CK tactics prediction. 14

  10. [10]

    Include what occurred, the involved entities, their roles, any parameters, timestamps, or contextual details conveyed in the log."

    Write a detailed description of the input log event in natural language. Include what occurred, the involved entities, their roles, any parameters, timestamps, or contextual details conveyed in the log."

  11. [11]

    Include what occurred, the involved entities, their roles, any parameters, timestamps, or contextual details conveyed in the graph

    Write a detailed description of the actual output knowledge graph in natural language. Include what occurred, the involved entities, their roles, any parameters, timestamps, or contextual details conveyed in the graph

  12. [12]

    Assess whether the description of the actual output knowledge graph semantically captures the same information as the log event's description. Check for: - Coverage: Are all key elements from the log event present? - Correctness: Are entities, actions, and relationships represented accurately? - Relevance: Are any additional nodes or relationships relevan...

  13. [13]

    SoK: Security and Privacy in Machine Learning,

    N. Papernot, P. McDaniel, A. Sinha, and M. P. Wellman, “SoK: Security and Privacy in Machine Learning,” in2018 IEEE European Symposium on Security and Privacy (EuroS&P), Apr. 2018, pp. 399–414. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8406613

  14. [14]

    A survey on technical threat intelligence in the age of sophisticated cyber attacks,

    W. Tounsi and H. Rais, “A survey on technical threat intelligence in the age of sophisticated cyber attacks,”Computers & Security, vol. 72, pp. 212–233, Jan. 2018. [Online]. Available: https: //www.sciencedirect.com/science/article/pii/S0167404817301839

  15. [15]

    Cyber Threat Intelligence Mining for Proactive Cybersecurity Defense: A Survey and New Perspectives,

    N. Sun, M. Ding, J. Jiang, W. Xu, X. Mo, Y . Tai, and J. Zhang, “Cyber Threat Intelligence Mining for Proactive Cybersecurity Defense: A Survey and New Perspectives,”IEEE Communications Surveys & Tutorials, vol. 25, no. 3, pp. 1748–1774, 2023. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/10117505

  16. [16]

    A Survey on Honeypot Software and Data Analysis

    M. Nawrocki, M. W ¨ahlisch, T. C. Schmidt, C. Keil, and J. Sch¨onfelder, “A survey on honeypot software and data analysis,”arXiv preprint arXiv:1608.06249, 2016

  17. [17]

    AttacKG+: Boosting attack graph construction with Large Language Models,

    Y . Zhang, T. Du, Y . Ma, X. Wang, Y . Xie, G. Yang, Y . Lu, and E.-C. Chang, “AttacKG+: Boosting attack graph construction with Large Language Models,”Computers & Security, vol. 150, p. 104220, Mar. 2025. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S0167404824005261

  18. [18]

    MITREtrieval: Retrieving MITRE Techniques From Unstructured Threat Reports by Fusion of Deep Learning and Ontology,

    Y .-T. Huang, R. Vaitheeshwari, M.-C. Chen, Y .-D. Lin, R.-H. Hwang, P.-C. Lin, Y .-C. Lai, E. H.- K. Wu, C.-H. Chen, Z.-J. Liao, and C.-K. Chen, “MITREtrieval: Retrieving MITRE Techniques From Unstructured Threat Reports by Fusion of Deep Learning and Ontology,”IEEE Transactions on Network and Service Management, vol. 21, no. 4, pp. 4871–4887, Aug. 2024....

  19. [19]

    Building a Cybersecurity Knowledge Graph with CyberGraph,

    P. Falcarin and F. Dainese, “Building a Cybersecurity Knowledge Graph with CyberGraph,” inProceedings of the 2024 ACM/IEEE 4th International Workshop on Engineering and Cybersecurity of Critical Systems (EnCyCriS) and 2024 IEEE/ACM Second International Workshop on Software Vulnerability, ser. EnCyCriS/SVM ’24. New York, NY , USA: Association for Computing...

  20. [20]

    Constructing Knowledge Graph from Cyber Threat Intelligence Using Large Language Model,

    J. Liu and J. Zhan, “Constructing Knowledge Graph from Cyber Threat Intelligence Using Large Language Model,” in2023 IEEE International Conference on Big Data (BigData), Dec. 2023, pp. 516–521. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/10386611

  21. [21]

    A survey on cybersecurity knowledge graph construction,

    X. Zhao, R. Jiang, Y . Han, A. Li, and Z. Peng, “A survey on cybersecurity knowledge graph construction,”Computers & Security, vol. 136, p. 103524, Jan. 2024. [Online]. Available: https: //www.sciencedirect.com/science/article/pii/S0167404823004340

  22. [22]

    Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering

    G. Izacard and E. Grave, “Leveraging passage retrieval with generative models for open domain question an- swering,”arXiv preprint arXiv:2007.01282, 2020

  23. [23]

    Retrieval-augmented generation for knowledge-intensive nlp tasks,

    P. Lewis, E. Perez, A. Piktus, F. Petroni, V . Karpukhin, N. Goyal, H. K¨uttler, M. Lewis, W.-t. Yih, T. Rockt¨aschel, and others, “Retrieval-augmented generation for knowledge-intensive nlp tasks,”Advances in neural information processing systems, vol. 33, pp. 9459–9474, 2020. 15 Model Run Total TimeGenerationSuccessRatio SHACLViolationRatio Precision Re...

  24. [24]

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    A. Srivastava, A. Rastogi, A. Rao, A. A. M. Shoeb, A. Abid, A. Fisch, A. R. Brown, A. Santoro, A. Gupta, A. Garriga-Alonso, and others, “Beyond the imitation game: Quantifying and extrapolating the capabilities of language models,”arXiv preprint arXiv:2206.04615, 2022

  25. [25]

    When LLMs meet cybersecurity: a systematic literature review,

    J. Zhang, H. Bu, H. Wen, Y . Liu, H. Fei, R. Xi, L. Li, Y . Yang, H. Zhu, and D. Meng, “When LLMs meet cybersecurity: a systematic literature review,”Cybersecurity, vol. 8, no. 1, p. 55, Feb. 2025. [Online]. Available: https://doi.org/10.1186/s42400-025-00361-w

  26. [26]

    Log File Anomaly Detection Using Knowledge Graph Completion,

    L. Payne and M. Xie, “Log File Anomaly Detection Using Knowledge Graph Completion,” inProceedings of the 2024 8th International Conference on Deep Learning Technologies, ser. ICDLT ’24. New York, NY , USA: Association for Computing Machinery, Nov. 2024, pp. 42–48. [Online]. Available: https://dl.acm.org/doi/10.1145/3695719.3695726

  27. [27]

    Attention Is All You Need

    A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention Is All You Need,” Aug. 2023. [Online]. Available: http://arxiv.org/abs/1706.03762

  28. [28]

    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” May 2019. [Online]. Available: http://arxiv.org/abs/1810.04805

  29. [29]

    Language models are few-shot learners,

    T. B. Brown, B. Mann, N. Ryder, and others, “Language models are few-shot learners,”Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901, 2020

  30. [30]

    Improving language understanding by generative pre-training,

    A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, “Improving language understanding by generative pre-training,” 2018

  31. [31]

    LLaMA: Open and Efficient Foundation Language Models

    H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozi `ere, N. Goyal, E. Hambro, and F. Azhar, “Llama: Open and efficient foundation language models,”arXiv preprint arXiv:2302.13971, 2023

  32. [32]

    Qwen Technical Report

    J. Bai, S. Bai, Y . Chu, Z. Cui, K. Dang, X. Deng, Y . Fan, W. Ge, Y . Han, F. Huang, B. Hui, L. Ji, M. Li, J. Lin, R. Lin, D. Liu, G. Liu, C. Lu, K. Lu, J. Ma, R. Men, X. Ren, X. Ren, C. Tan, S. Tan, J. Tu, P. Wang, S. Wang, W. Wang, S. Wu, B. Xu, J. Xu, A. Yang, H. Yang, J. Yang, S. Yang, Y . Yao, B. Yu, H. Yuan, Z. Yuan, J. Zhang, X. Zhang, Y . Zhang, ...

  33. [33]

    Mistral 7B

    A. Q. Jiang, A. Sablayrolles, A. Mensch, C. Bamford, D. S. Chaplot, D. d. l. Casas, F. Bressand, G. Lengyel, G. Lample, L. Saulnier, L. R. Lavaud, M.-A. Lachaux, P. Stock, T. L. Scao, T. Lavril, T. Wang, T. Lacroix, and W. E. Sayed, “Mistral 7B,” Oct. 2023. [Online]. Available: http://arxiv.org/abs/2310.06825

  34. [34]

    Retrieval augmented language model pre-training,

    K. Guu, K. Lee, Z. Tung, P. Pasupat, and M. Chang, “Retrieval augmented language model pre-training,” in International conference on machine learning. PMLR, 2020, pp. 3929–3938

  35. [35]

    Agentic AI: Autonomous Intelligence for Complex Goals—A Comprehensive Survey,

    D. B. Acharya, K. Kuppan, and B. Divya, “Agentic AI: Autonomous Intelligence for Complex Goals—A Comprehensive Survey,”IEEE Access, vol. 13, pp. 18 912–18 936, 2025. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/10849561

  36. [36]

    Explainable goal-driven agents and robots-a comprehensive review,

    F. Sado, C. K. Loo, W. S. Liew, M. Kerzel, and S. Wermter, “Explainable goal-driven agents and robots-a comprehensive review,”ACM Computing Surveys, vol. 55, no. 10, pp. 1–41, 2023

  37. [37]

    Modelling social action for AI agents,

    C. Castelfranchi, “Modelling social action for AI agents,”Artificial Intelligence, vol. 103, no. 1, pp. 157–182, Aug. 1998. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0004370298000563

  38. [38]

    Is It an agent, or just a program?: A taxonomy for autonomous agents,

    S. Franklin and A. Graesser, “Is It an agent, or just a program?: A taxonomy for autonomous agents,” inIntelli- gent Agents III Agent Theories, Architectures, and Languages, J. P. M¨uller, M. J. Wooldridge, and N. R. Jennings, Eds. Berlin, Heidelberg: Springer, 1997, pp. 21–35

  39. [39]

    AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges,

    R. Sapkota, K. I. Roumeliotis, and M. Karkee, “AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges,” May 2025. [Online]. Available: http://arxiv.org/abs/2505.10468

  40. [40]

    Knowledge engineering: Principles and methods,

    R. Studer, V . R. Benjamins, and D. Fensel, “Knowledge engineering: Principles and methods,” inData & knowl- edge engineering. Elsevier, 1998, vol. 25, pp. 161–197

  41. [41]

    Shapes constraint language (SHACL),

    H. Knublauch and D. Kontokostas, “Shapes constraint language (SHACL),”W3C Recommendation, vol. 20,

  42. [42]

    Available: https://www.w3.org/TR/shacl/

    [Online]. Available: https://www.w3.org/TR/shacl/

  43. [43]

    Knowledge graph quality management with SHACL,

    P. Pareti, G. Konstantinidis, T. J. Norman, and others, “Knowledge graph quality management with SHACL,” Journal of Web Semantics, vol. 74, p. 100714, 2022

  44. [44]

    SHACTOR: improving the quality of large-scale knowledge graphs with validating shapes,

    K. Rabbani, M. Lissandrini, and K. Hose, “SHACTOR: improving the quality of large-scale knowledge graphs with validating shapes,” inCompanion of the 2023 international conference on management of data, SIGMOD/PODS 2023, seattle, WA, USA, june 18-23, 2023, S. Das, I. Pandis, K. S. Candan, and S. Amer-Yahia, Eds. ACM, 2023, pp. 151–154. [Online]. Available:...

  45. [45]

    UCO: a unified cybersecurity ontology,

    Z. Syed, A. Padia, T. Finin, L. Mathews, and A. Joshi, “UCO: a unified cybersecurity ontology,” inAAAI work- shop: Artificial intelligence for cyber security, 2016, pp. 14–21

  46. [46]

    Standardizing cyber threat intelligence information with the structured threat information eXpression (STIX),

    S. Barnum, “Standardizing cyber threat intelligence information with the structured threat information eXpression (STIX),”The MITRE Corporation, 2012. [Online]. Available: http://stixproject.github.io/

  47. [47]

    Building an ontology of cyber security,

    A. Oltramari, L. F. Cranor, R. J. Walls, and P. McDaniel, “Building an ontology of cyber security,” inProceedings of the ninth conference on semantic technology for intelligence, defense, and security (STIDS). CEUR-WS.org, 2014, pp. 54–61

  48. [48]

    MISP: The malware information sharing platform,

    C. Wagner, A. Dulaunoy, G. Wagener, and A. Iklody, “MISP: The malware information sharing platform,” in Proceedings of the 2016 ACM on workshop on information sharing and collaborative security. ACM, 2016, pp. 49–56

  49. [49]

    The SEPSES knowledge graph: An integrated resource for cybersecurity,

    E. Kiesling, A. Ekelhart, K. Kurniawan, and F. J. Ekaputra, “The SEPSES knowledge graph: An integrated resource for cybersecurity,” inThe semantic web - ISWC 2019 - 18th international semantic web conference, auckland, new zealand, october 26-30, 2019, proceedings, part II, ser. Lecture notes in computer science, vol. 11779. Springer, 2019, pp. 198–214. [...

  50. [50]

    LibreLog: Accurate and Efficient Unsupervised Log Parsing Using Open-Source Large Language Models,

    Z. Ma, D. J. Kim, and T.-H. Chen, “LibreLog: Accurate and Efficient Unsupervised Log Parsing Using Open-Source Large Language Models,” Nov. 2024. [Online]. Available: http://arxiv.org/abs/2408.01585

  51. [51]

    The SLOGERT Framework for Automated Log Knowledge Graph Construction,

    A. Ekelhart, F. J. Ekaputra, and E. Kiesling, “The SLOGERT Framework for Automated Log Knowledge Graph Construction,” in The Semantic Web, R. Verborgh, K. Hose, H. Paulheim, P.-A. Champin, M. Maleshkova, O. Corcho, P. Ristoski, and M. Alam, Eds. Cham: Springer International Publishing, 2021, pp. 631–646

  52. [52]

    KRYSTAL: Knowledge graph-based framework for tactical attack discovery in audit data,

    K. Kurniawan, A. Ekelhart, E. Kiesling, G. Quirchmayr, and A. M. Tjoa, “KRYSTAL: Knowledge graph-based framework for tactical attack discovery in audit data,” Computers & Security, vol. 121, p. 102828, Oct. 2022. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S016740482200222X

  53. [53]

    LogPrécis: Unleashing language models for automated malicious log analysis: Précis: A concise summary of essential points, statements, or facts,

    M. Boffa, I. Drago, M. Mellia, L. Vassio, D. Giordano, R. Valentim, and Z. B. Houidi, “LogPrécis: Unleashing language models for automated malicious log analysis: Précis: A concise summary of essential points, statements, or facts,” Computers & Security, vol. 141, p. 103805, Jun. 2024. [Online]. Available: https://www.sciencedirect.com/science/article/...

  54. [54]

    CyKG-RAG: Towards knowledge-graph enhanced retrieval augmented generation for cybersecurity,

    K. Kurniawan, E. Kiesling, and A. Ekelhart, “CyKG-RAG: Towards knowledge-graph enhanced retrieval augmented generation for cybersecurity,” 2024

  55. [55]

    The use of MMR, diversity-based reranking for reordering documents and producing summaries,

    J. Carbonell and J. Goldstein, “The use of MMR, diversity-based reranking for reordering documents and producing summaries,” in Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, 1998, pp. 335–336

  56. [56]

    mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval,

    X. Zhang, Y. Zhang, D. Long, W. Xie, Z. Dai, J. Tang, H. Lin, B. Yang, P. Xie, F. Huang, M. Zhang, W. Li, and M. Zhang, “mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval,” Oct. 2024, arXiv:2407.19669 [cs]. [Online]. Available: http://arx...

  57. [57]

    ML-Schema: Exposing the Semantics of Machine Learning with Schemas and Ontologies

    G. C. Publio, D. Esteves, A. Ławrynowicz, P. Panov, L. Soldatova, T. Soru, J. Vanschoren, and H. Zafar, “ML-Schema: Exposing the Semantics of Machine Learning with Schemas and Ontologies,” Jul. 2018, arXiv:1807.05351 [cs]. [Online]. Available: http://arxiv.org/abs/1807.05351

  58. [58]

    Efficient Memory Management for Large Language Model Serving with PagedAttention

    W. Kwon, Z. Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. E. Gonzalez, H. Zhang, and I. Stoica, “Efficient Memory Management for Large Language Model Serving with PagedAttention,” Sep. 2023, arXiv:2309.06180 [cs]. [Online]. Available: http://arxiv.org/abs/2309.06180

  59. [59]

    AIT Log Data Set V2.0,

    M. Landauer, F. Skopik, M. Frank, W. Hotwagner, M. Wurzenberger, and A. Rauber, “AIT Log Data Set V2.0,” Feb. 2022. [Online]. Available: https://zenodo.org/records/5789064

  60. [60]

    G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment

    Y. Liu, D. Iter, Y. Xu, S. Wang, R. Xu, and C. Zhu, “G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment,” May 2023, arXiv:2303.16634 [cs]. [Online]. Available: http://arxiv.org/abs/2303.16634

Luca Cotti graduated cum laude from the University of Brescia, Italy in 2024. He...