arxiv: 2606.25608 · v1 · pith:XGNS2JDLnew · submitted 2026-06-24 · 💻 cs.CR · cs.AI

An Approach for a Supporting Multi-LLM System for Automated Certification Based on the German IT-Grundschutz

Pith reviewed 2026-06-25 20:49 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords IT-Grundschutz · Multi-LLM system · HybridRAG · NIS2 directive · certification automation · knowledge graphs · security concepts · BSI standards

The pith

A multi-LLM system with HybridRAG supports semi-automated BSI IT-Grundschutz certification to meet NIS2-driven demand.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a Multi-LLM System architecture that pairs large language models with knowledge graphs through HybridRAG to assist across the IT-Grundschutz certification workflow. It targets phases including protection needs assessment, modeling, the Grundschutz check itself, measure consolidation, and realization. The proposal responds to specialist shortages and the expanded scope of the NIS2 directive by aiming for higher throughput at lower cost while preserving concept quality. A reader would care because the system is positioned as a practical way to scale certifications for newly covered organizations without proportional growth in expert hours.
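
The phase sequence described above can be sketched as a small pipeline with a human review gate after every step. This is a minimal illustration, not the authors' implementation: the `Phase` ordering, the `Draft` type, and the `run_workflow` function are hypothetical names, and the per-phase assistants are stand-in callables rather than real LLM calls.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Phase(Enum):
    # Phase names follow the summary above; the numeric ordering is an assumption.
    PROTECTION_NEEDS_ASSESSMENT = 1
    MODELING = 2
    GRUNDSCHUTZ_CHECK = 3
    MEASURE_CONSOLIDATION = 4
    REALIZATION = 5


@dataclass
class Draft:
    phase: Phase
    content: str
    approved: bool = False  # set from the certifier's decision, never by the assistant


def run_workflow(
    assistants: Dict[Phase, Callable[[str], str]],
    scope: str,
    review: Callable[[Draft], bool],
) -> List[Draft]:
    """Run the phases in order and stop at the first draft the reviewer rejects."""
    drafts: List[Draft] = []
    context = scope
    for phase in sorted(Phase, key=lambda p: p.value):
        draft = Draft(phase, assistants[phase](context))
        draft.approved = review(draft)  # human-in-the-loop checkpoint
        drafts.append(draft)
        if not draft.approved:
            break  # rejected drafts are not passed on to later phases
        context = draft.content
    return drafts
```

The design choice worth noting is that a rejected draft halts the chain, which mirrors the summary's point that certifiers keep oversight of quality while routine drafting is assisted.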

Core claim

The authors claim that an MLS architecture combining LLMs and KGs via HybridRAG can support the full sequence of IT-Grundschutz certification steps, thereby raising efficiency, lowering costs, and helping certifiers sustain quality under the increased volume created by NIS2.

What carries the argument

The Multi-LLM System (MLS) with HybridRAG, which routes domain tasks between LLMs and knowledge graphs to cover certification phases.
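
To make the HybridRAG idea concrete, the sketch below merges two retrieval paths into one context string that would be handed to an LLM: a similarity search over text passages and a lookup in a knowledge graph. Everything here is an assumption for illustration: the passages and triples are invented placeholders (not BSI wording), bag-of-words cosine similarity stands in for a real embedding model, and the function names are hypothetical. The paper itself provides no implementation.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical toy corpus; the texts are placeholders, not BSI wording.
PASSAGES: Dict[str, str] = {
    "p1": "server room access control and physical protection",
    "p2": "backup concept for file server data",
    "p3": "patch management for web server software",
}

# Hypothetical knowledge graph as (subject, relation, object) triples.
TRIPLES: List[Tuple[str, str, str]] = [
    ("file server", "requires", "backup concept"),
    ("file server", "located_in", "server room"),
    ("web server", "requires", "patch management"),
]


def _bag(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a learned embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_retrieve(query: str, k: int = 2) -> List[str]:
    """Return the k passages most similar to the query."""
    q = _bag(query)
    ranked = sorted(
        PASSAGES, key=lambda pid: cosine(q, _bag(PASSAGES[pid])), reverse=True
    )
    return [PASSAGES[pid] for pid in ranked[:k]]


def graph_retrieve(query: str) -> List[str]:
    """Return triples whose subject is mentioned verbatim in the query."""
    q = query.lower()
    return [f"{s} {r} {o}" for s, r, o in TRIPLES if s in q]


def hybrid_context(query: str, k: int = 2) -> str:
    """Concatenate both retrieval results, dropping exact duplicates."""
    merged: List[str] = []
    for item in vector_retrieve(query, k) + graph_retrieve(query):
        if item not in merged:
            merged.append(item)
    return "\n".join(merged)
```

For example, `hybrid_context("which measures apply to the file server")` returns both free-text passages and the structured facts attached to the `file server` node, which is the combination the summary attributes to HybridRAG.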

If this is right

  • The architecture can process the larger number of companies now subject to certification under NIS2.
  • Implementation and certification costs drop while certifiers retain oversight of quality.
  • All listed process phases from assessment through realization receive automated assistance.
  • Specialist time is redirected from routine tasks to higher-level review.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same MLS pattern could be retargeted at other national or sector-specific security certification schemes.
  • Smaller organizations might reach compliance thresholds faster once the system is production-ready.
  • Workflow integration studies would be needed to measure actual time savings versus current manual practice.
  • Error patterns observed in early deployments could guide targeted knowledge-graph expansions.

Load-bearing premise

The premise that LLMs plus knowledge graphs can reliably manage the intricate, domain-specific steps of IT-Grundschutz certification without frequent major errors that still require heavy human correction.

What would settle it

A controlled run on a standard IT-Grundschutz case in which the system outputs protection needs or measures that certified experts judge to be materially incorrect or incomplete.
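
One way such a controlled run could be scored is by comparing the set of measures the system proposes with the set an expert selected for the same case. The sketch below is an assumption, not a protocol from the paper: the function name and the `REQ-n` identifiers are hypothetical, and set overlap only captures completeness and spurious output. Whether a proposed measure is materially incorrect would still need expert judgment.

```python
from typing import Dict, List, Set, Union


def compare_to_expert(
    system: Set[str], expert: Set[str]
) -> Dict[str, Union[float, List[str]]]:
    """Compare system-proposed measure IDs against an expert reference set."""
    agreed = system & expert
    precision = len(agreed) / len(system) if system else 0.0
    recall = len(agreed) / len(expert) if expert else 0.0
    return {
        "precision": precision,                # share of proposals the expert also chose
        "recall": recall,                      # share of expert measures the system found
        "missing": sorted(expert - system),    # incomplete output
        "spurious": sorted(system - expert),   # output the expert did not select
    }


# Hypothetical identifiers, used only to show the call shape.
report = compare_to_expert(
    system={"REQ-1", "REQ-2", "REQ-4"},
    expert={"REQ-1", "REQ-2", "REQ-3"},
)
```

Here `report["missing"]` lists what a certifier would have to add by hand, and `report["spurious"]` lists what they would have to review and possibly strike.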

Figures

Figures reproduced from arXiv: 2606.25608 by Lea Roxanne Muth, Marian Margraf.

Figure 1: Nine-Step Process for IT-Grundschutz Certification under Standard
Figure 2: Multi-LLM architecture with HybridRAG. Our MLS is based on
Original abstract

This paper presents a novel approach to perform semi-automated BSI IT-Grundschutz certification using a Multi-Large Language Model system (MLS) with Hybrid Retrieval-Augmented Generation (HybridRAG). Facing the challenges of the Network and Information Security Directive 2 (NIS2), a shortage of specialists, and high implementation costs, our MLS architecture aims to increase efficiency, reduce costs, and support certifiers in maintaining the quality of security concepts while meeting the increased demand for certifications of newly affected companies. The system combines Large Language Models (LLMs) and Knowledge Graphs (KGs) to support different phases of the certification process, including protection needs assessment, modeling, IT-Grundschutz check, measure consolidation, and subsequent realization. Our architecture addresses the growing demand for security concepts and offers an approach to handle the digital security challenges introduced by NIS2.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper proposes a Multi-LLM System (MLS) with HybridRAG for semi-automated BSI IT-Grundschutz certification. It describes an architecture combining LLMs and knowledge graphs to support phases including protection-needs assessment, modeling, IT-Grundschutz checks, measure consolidation, and realization, with the goal of increasing efficiency and reducing costs to meet NIS2-driven demand.

Significance. If the architecture could be shown to perform the described tasks reliably with limited human oversight, the approach would address a practical bottleneck in scaling IT-security certifications amid specialist shortages.

major comments (1)
  1. [Abstract and system-architecture description] The central claim that the MLS + HybridRAG combination can reliably perform protection-needs assessment, modeling, IT-Grundschutz checks, and measure consolidation with limited human intervention is load-bearing yet unsupported. The manuscript provides only a high-level architecture description and offers no prototype implementation, test cases, accuracy/completeness metrics, error analysis, or comparison against expert-certified outputs (see Abstract and the sections describing component roles).

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive review. The major comment correctly identifies that the manuscript is a high-level architectural proposal without empirical evaluation. We will revise the paper to clarify its scope as a conceptual contribution and to moderate claims accordingly.

Point-by-point responses
  1. Referee: [Abstract and system-architecture description] The central claim that the MLS + HybridRAG combination can reliably perform protection-needs assessment, modeling, IT-Grundschutz checks, and measure consolidation with limited human intervention is load-bearing yet unsupported. The manuscript provides only a high-level architecture description and offers no prototype implementation, test cases, accuracy/completeness metrics, error analysis, or comparison against expert-certified outputs (see Abstract and the sections describing component roles).

    Authors: We agree that the current manuscript presents only a conceptual architecture and does not include any prototype, test cases, metrics, or expert comparisons. The abstract and body describe an approach that 'aims to increase efficiency' and 'support certifiers,' rather than asserting proven reliability. To address the concern, we will revise the abstract, introduction, and architecture sections to explicitly frame the work as a proposed framework whose benefits remain to be validated through implementation. We will add a new subsection on limitations and future work that outlines the planned prototype development, evaluation against certified outputs, and collection of accuracy metrics. These changes will ensure the claims match the evidential content of the paper.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: high-level architecture proposal with no derivations or fitted parameters

Full rationale

The manuscript is a conceptual system proposal describing a Multi-LLM architecture with HybridRAG for IT-Grundschutz certification tasks. It contains no equations, no parameter fitting, no predictions derived from data, and no self-citations that serve as load-bearing justifications for uniqueness or ansatzes. All claims are forward-looking architectural suggestions rather than reductions of outputs to inputs by construction. The derivation chain is therefore self-contained at the level of design description.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available for review; no specific free parameters, axioms, or invented entities are detailed in the provided text.

pith-pipeline@v0.9.1-grok · 5682 in / 1030 out tokens · 24625 ms · 2026-06-25T20:49:00.435491+00:00 · methodology

