arxiv: 2604.23905 · v1 · submitted 2026-04-26 · 💻 cs.CR · cs.AI

SMSI: System Model Security Inference: Automated Threat Modeling for Cyber-Physical Systems

Ro'Yah Radaideh, Ali Khreis

Pith reviewed 2026-05-08 05:43 UTC · model grok-4.3

keywords: automated threat modeling · cyber-physical systems · NIST 800-53 · MITRE ATT&CK · SecureBERT · SysML · vulnerability mapping · neuro-symbolic pipeline

The pith

SMSI automates threat modeling for cyber-physical systems by mapping SysML models through vulnerabilities and attack techniques to prioritized NIST 800-53 controls.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SMSI, a hybrid neuro-symbolic pipeline that takes a SysML architecture model and generates a prioritized list of NIST 800-53 security controls for cyber-physical systems. It combines a deterministic parser that links components to NVD vulnerabilities, machine learning models that connect those vulnerabilities to MITRE ATT&CK techniques, and a control recommender stage. Three CVE-to-ATT&CK methods are tested, including fine-tuned SecureBERT+, dense retrieval encoders, and a zero-shot LLM approach with Gemma-4 26B, with the full pipeline validated on a nine-component healthcare IoT gateway. The work shows that pretrained SecureBERT delivers the strongest results in the ATT&CK-to-NIST stage, suggesting dense embeddings can support reliable automated control recommendations. This matters because it targets the currently manual and error-prone process of securing complex interconnected systems.

Core claim

The central claim is that a three-stage hybrid pipeline starting from a SysML system model can automate threat modeling: it first maps components deterministically to vulnerabilities via the NVD, then uses retrieval and classification models to link those vulnerabilities to MITRE ATT&CK techniques, and finally recommends a prioritized set of NIST 800-53 controls. Three CVE-to-ATT&CK options were compared on a healthcare IoT gateway validation case with nine software components: a supervised fine-tuned SecureBERT+ classifier, retrieval-based dense encoders, and a zero-shot LLM approach using Gemma-4 26B. For the final ATT&CK-to-NIST stage, pretrained SecureBERT achieved the highest control retrieval and F

What carries the argument

The SMSI three-stage pipeline: deterministic parser from SysML components to NVD vulnerabilities, family of retrieval and classification models for CVE-to-ATT&CK mapping, and control recommender using dense embeddings such as pretrained SecureBERT.
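The data flow through the three stages can be pictured with a minimal sketch. Everything here is an illustrative assumption: the lookup tables stand in for the NVD query and the learned models, the identifiers are placeholders rather than real CVE, ATT&CK, or NIST 800-53 entries, and the score aggregation rule is hypothetical and not taken from the paper.

```python
# Placeholder tables standing in for NVD lookups and learned model outputs.
# None of these identifiers or scores are real data.
COMPONENT_TO_CVES = {
    "web-server 1.0": ["CVE-A", "CVE-B"],
    "mqtt-broker 2.3": ["CVE-C"],
}
CVE_TO_TECHNIQUES = {
    "CVE-A": {"T-x": 0.9, "T-y": 0.4},
    "CVE-B": {"T-x": 0.7},
    "CVE-C": {"T-z": 0.8},
}
TECHNIQUE_TO_CONTROLS = {
    "T-x": {"CTRL-1": 0.8, "CTRL-2": 0.5},
    "T-y": {"CTRL-2": 0.9},
    "T-z": {"CTRL-3": 0.6},
}


def stage1_components_to_cves(components):
    """Deterministic lookup: component name -> list of known vulnerabilities."""
    return {c: COMPONENT_TO_CVES.get(c, []) for c in components}


def stage2_cves_to_techniques(cves):
    """Stand-in for the learned mapping: keep the best score seen per technique."""
    scores = {}
    for cve in cves:
        for tech, s in CVE_TO_TECHNIQUES.get(cve, {}).items():
            scores[tech] = max(scores.get(tech, 0.0), s)
    return scores


def stage3_techniques_to_controls(technique_scores):
    """Hypothetical prioritization: sum technique score times control similarity."""
    totals = {}
    for tech, t_score in technique_scores.items():
        for ctrl, sim in TECHNIQUE_TO_CONTROLS.get(tech, {}).items():
            totals[ctrl] = totals.get(ctrl, 0.0) + t_score * sim
    # Highest total first; ties broken alphabetically for a stable order.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


def run_pipeline(components):
    per_component = stage1_components_to_cves(components)
    all_cves = [cve for cves in per_component.values() for cve in cves]
    return stage3_techniques_to_controls(stage2_cves_to_techniques(all_cves))


prioritized = run_pipeline(["web-server 1.0", "mqtt-broker 2.3"])
```

Keeping each stage behind its own function makes it possible to swap in a different CVE-to-ATT&CK method without touching the other two stages, which is how the comparison described in the paper would fit into such a structure.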

If this is right

Threat modeling for CPS can shift from fully manual processes to a semi-automated workflow that starts directly from architecture models.
Dense embedding models like pretrained SecureBERT provide a strong basis for retrieving relevant NIST controls from ATT&CK techniques without requiring stage-specific fine-tuning.
Multiple mapping strategies for vulnerabilities to attack techniques can be compared directly on retrieval and classification metrics within the same pipeline.
The resulting prioritized control lists can be produced for IoT-style systems containing a small number of software components.
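
A comparison of mapping strategies on shared retrieval metrics could look like the sketch below. The choice of recall@k and mean reciprocal rank is an assumption for illustration; the page does not say which metrics the paper reports, and the function names and data shapes are hypothetical.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of relevant items that appear in the top k of a ranked list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def compare_methods(predictions, ground_truth, k):
    """predictions: {method: {query: ranked list}}; ground_truth: {query: set}."""
    n = len(ground_truth)
    report = {}
    for method, per_query in predictions.items():
        if n == 0:
            report[method] = {"recall@k": 0.0, "mrr": 0.0}
            continue
        recalls = [recall_at_k(per_query.get(q, []), rel, k)
                   for q, rel in ground_truth.items()]
        rrs = [reciprocal_rank(per_query.get(q, []), rel)
               for q, rel in ground_truth.items()]
        report[method] = {"recall@k": sum(recalls) / n, "mrr": sum(rrs) / n}
    return report


# Toy data: queries are CVE placeholders, items are technique placeholders.
truth = {"q1": {"a", "b"}, "q2": {"c"}}
preds = {"method-1": {"q1": ["a", "x", "b"], "q2": ["y", "c"]}}
report = compare_methods(preds, truth, k=2)
```

Scoring every method against the same ground-truth dictionary is what makes the numbers directly comparable.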

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the single-case validation generalizes, the approach could reduce the time and expertise required to secure new CPS designs in domains beyond healthcare IoT.
The neuro-symbolic combination of deterministic parsing with learned retrieval might offer more traceable recommendations than purely data-driven methods.
Extending the pipeline to accept live telemetry or additional system models could turn it into an ongoing monitoring tool rather than a one-time design aid.

Load-bearing premise

The mappings produced by the CVE-to-ATT&CK and ATT&CK-to-NIST stages are sufficiently accurate and generalizable to produce reliable prioritized control recommendations beyond the single nine-component healthcare IoT validation case.

What would settle it

Applying the full pipeline to a second, independent cyber-physical system such as an automotive or industrial control setup and measuring whether the generated prioritized control list aligns with an independent expert manual threat model on the same architecture.
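
One way to quantify that alignment is set overlap between the top of the pipeline's list and the expert's control set. This is a sketch under assumptions: the cutoff `k`, the function name, and the choice of precision, recall, and Jaccard are illustrative and are not proposed in the paper.

```python
def agreement(pipeline_ranked, expert_controls, k):
    """Compare the pipeline's top-k controls with an expert-selected set."""
    top_k = set(pipeline_ranked[:k])
    expert = set(expert_controls)
    overlap = top_k & expert
    union = top_k | expert
    return {
        "precision": len(overlap) / len(top_k) if top_k else 0.0,
        "recall": len(overlap) / len(expert) if expert else 0.0,
        "jaccard": len(overlap) / len(union) if union else 0.0,
    }


# Placeholder control identifiers, not real NIST 800-53 controls.
scores = agreement(["C1", "C2", "C3", "C4"], {"C2", "C3", "C9"}, k=3)
```

Set overlap ignores the order within the top k, so a rank-aware measure would be needed to judge the prioritization itself.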

Figures

Figures reproduced from arXiv: 2604.23905 by Ali Khreis, Ro'Yah Radaideh.

**Figure 1.** CVE-to-ATT&CK retrieval: all methods vs KEV ground truth (

**Figure 2.** Pearson correlation between all CVE-to-ATT&CK model scores. TF–

**Figure 3.** SecureBERT+ aggregate test metrics at optimal threshold 0.45 (105

**Figure 4.** SecureBERT+ per-class F1: best 10 vs worst 10 parent techniques.

**Figure 7.** ATT&CK-to-controls retrieval: all methods vs CTID crosswalk ground

**Figure 8.** Priority distribution for technique-control pairs (

Original abstract

Threat modeling for cyber-physical systems (CPS) remains a largely manual exercise. This project presents SMSI (System Model Security Inference), a hybrid neuro-symbolic pipeline that starts from a SysML architecture model and produces a prioritized list of NIST 800-53 security controls. The prototype has three main stages: a deterministic parser mapping system components to vulnerabilities via the NVD; a family of retrieval and classification models linking vulnerabilities to MITRE ATT&CK techniques; and a control recommender. We explore three approaches for CVE-to-ATT&CK mapping: a supervised classifier using fine-tuned SecureBERT+, retrieval-based dense encoders, and a zero-shot LLM approach using Gemma-4 26B. We validate the pipeline on a healthcare IoT gateway with nine software components. For the ATT&CK-to-NIST stage, pretrained SecureBERT achieves the highest control retrieval scores, demonstrating that dense embeddings provide a strong basis for automated control recommendation.
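
The idea of retrieving controls with dense embeddings can be illustrated by ranking candidates by cosine similarity to a technique's vector. In this sketch the three-dimensional vectors are hand-written toy values standing in for encoder outputs; no SecureBERT model is loaded, and the identifiers and the `rank_controls` helper are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_controls(technique_vec, control_vecs, top_k=3):
    """Return the top_k (control_id, similarity) pairs, most similar first."""
    scored = [(cid, cosine(technique_vec, vec)) for cid, vec in control_vecs.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Toy embeddings: in a real setup these would come from a text encoder
# applied to the technique description and each control description.
technique = [1.0, 0.0, 1.0]
controls = {
    "CTRL-1": [1.0, 0.0, 1.0],
    "CTRL-2": [0.0, 1.0, 0.0],
    "CTRL-3": [1.0, 1.0, 0.0],
}
ranking = rank_controls(technique, controls)
```

Because no stage-specific training is involved, the quality of such a ranking depends entirely on how well the pretrained encoder places related technique and control texts near each other.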

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

SMSI builds a workable end-to-end pipeline from SysML models to NIST controls via NVD and ATT&CK mappings, but the single nine-component case and missing metrics leave reliability unproven.

The main point is that this paper describes SMSI, a pipeline that starts with a SysML model of a cyber-physical system, pulls vulnerabilities from the NVD, maps them to MITRE ATT&CK techniques using one of three methods, and then recommends prioritized NIST 800-53 controls. The new piece is the complete integration into an automated flow rather than any single novel component. They test a fine-tuned SecureBERT+ classifier, dense retrieval encoders, and a zero-shot LLM for the CVE-to-ATT&CK step, and note that pretrained SecureBERT works best for the final control recommendation stage on their example. That shows a practical way to combine symbolic parsing with embedding-based retrieval for a task that is still mostly manual today. The approach is grounded in existing databases and models, which is a plus for reproducibility if the code is released.

The soft spots are clear and central. Validation is limited to one healthcare IoT gateway with nine software components, and the description supplies no accuracy numbers, precision-recall figures, baseline comparisons, or error propagation analysis for any stage. Without those, it is difficult to judge whether upstream mapping mistakes distort the final control list or whether the results hold for other CPS architectures. Generalizability is simply not tested.

This work is aimed at applied researchers in CPS and IoT security who need concrete automation ideas. A reader already familiar with ATT&CK and NIST mappings could pick up the pipeline structure and the three mapping variants as a useful reference. It deserves peer review because the problem matters and the hybrid setup is coherent, but the authors will need to add quantitative results and additional test cases before the claims about reliable recommendations can be taken seriously.

Circularity Check

0 steps flagged

No circularity; empirical comparisons rest on external databases and model benchmarks.

The paper presents a hybrid pipeline with three explicit stages: deterministic SysML-to-NVD parsing, CVE-to-ATT&CK mapping via three independent methods (fine-tuned SecureBERT+, dense retrieval encoders, zero-shot Gemma-4 LLM), and ATT&CK-to-NIST control recommendation. The strongest claim, that pretrained SecureBERT achieves the highest control retrieval scores, is an empirical ranking obtained by running the models on the nine-component healthcare IoT validation case and comparing against external NIST controls. No equations, fitted parameters, or derivations are defined in terms of the final prioritized list, so the result is not forced by construction. No load-bearing self-citations or uniqueness theorems from prior author work are invoked. The logic is self-contained against external security databases (NVD, MITRE ATT&CK, NIST 800-53) and does not reduce to its own inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The pipeline depends on the completeness of external databases (NVD, ATT&CK) and the generalization of pre-trained models; ML components introduce fitted parameters during fine-tuning, while the deterministic parser rests on domain assumptions about model-to-vulnerability mapping.

free parameters (1)

fine-tuning hyperparameters for SecureBERT+
The supervised classifier approach requires training a language model on security data, introducing parameters chosen or fitted during that process.
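
A decision threshold for a multi-label classifier is one example of such a fitted parameter (the Figure 3 caption mentions an optimal threshold of 0.45). How the authors selected theirs is not stated here; the sketch below shows one common approach, sweeping candidate thresholds and keeping the one with the best micro-averaged F1 on held-out data. The function names and toy labels are assumptions.

```python
def micro_f1(y_true, y_prob, threshold):
    """Micro-averaged F1 over all (sample, label) pairs at a given threshold."""
    tp = fp = fn = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for truth, prob in zip(true_row, prob_row):
            predicted = prob >= threshold
            if predicted and truth:
                tp += 1
            elif predicted and not truth:
                fp += 1
            elif truth:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, y_prob, candidates):
    """Pick the candidate with the highest micro-F1; ties go to the lowest threshold."""
    return max(candidates, key=lambda th: (micro_f1(y_true, y_prob, th), -th))


# Toy validation data: 2 samples, 3 labels each (1 = technique applies).
y_true = [[1, 0, 1], [0, 1, 0]]
y_prob = [[0.9, 0.4, 0.5], [0.2, 0.6, 0.3]]
chosen = best_threshold(y_true, y_prob, [0.3, 0.5, 0.7])
```

A threshold tuned this way should be chosen on validation data and only then applied to the test set, otherwise the reported test metrics are optimistic.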

axioms (1)

domain assumption: SysML architecture models can be deterministically parsed to accurately identify components and link them to NVD vulnerabilities
The first stage of the pipeline assumes reliable extraction and mapping without detailing error handling or coverage limitations.
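
To make the coverage concern concrete, here is a minimal sketch of a deterministic first stage that reports components it could not match instead of silently dropping them. The XML layout is a simplified, made-up format and not real SysML XMI; the product names, the `VULN_INDEX` table, and the CVE placeholders are hypothetical, and no NVD API is called.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an exported architecture model (not real SysML XMI).
MODEL_XML = """
<model>
  <block name="Gateway">
    <software product="ExampleHttpd" version="2.4.1"/>
    <software product="examplemqtt" version="1.6.0"/>
  </block>
  <block name="Sensor">
    <software product="examplertos" version="10.0"/>
  </block>
</model>
"""

# Hypothetical local index keyed by (product, version); values are placeholders.
VULN_INDEX = {
    ("examplehttpd", "2.4.1"): ["CVE-PLACEHOLDER-1"],
    ("examplertos", "10.0"): ["CVE-PLACEHOLDER-2", "CVE-PLACEHOLDER-3"],
}


def extract_components(xml_text):
    """Return (block, product, version) tuples with product names lowercased."""
    root = ET.fromstring(xml_text.strip())
    components = []
    for block in root.iter("block"):
        for sw in block.iter("software"):
            components.append(
                (block.get("name"), sw.get("product").lower(), sw.get("version"))
            )
    return components


def link_vulnerabilities(components, index):
    """Split components into those found in the index and those without a match."""
    linked, unmatched = {}, []
    for block, product, version in components:
        cves = index.get((product, version))
        if cves is None:
            unmatched.append((block, product, version))
        else:
            linked[(block, product, version)] = list(cves)
    return linked, unmatched


components = extract_components(MODEL_XML)
linked, unmatched = link_vulnerabilities(components, VULN_INDEX)
```

Returning the unmatched list separately keeps "no known vulnerabilities" distinguishable from "component not found in the index", which matters when judging how much of the model the later stages actually see.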
