Context-aware Entity-Relation Extraction for Threat Intelligence Knowledge Graphs

Inoussa Mouiche; Sherif Saad

arxiv: 2605.15904 · v1 · pith:WCD3QORTnew · submitted 2026-05-15 · 💻 cs.LG


Pith reviewed 2026-05-20 19:53 UTC · model grok-4.3

classification 💻 cs.LG

keywords: cybersecurity · knowledge graphs · entity recognition · relation extraction · threat intelligence · natural language processing · SecureBERT · STIX

The pith

A pipeline framework combines SecureBERT+ embeddings with domain ontology knowledge to extract entities and relations from cybersecurity threat reports while reducing error propagation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to address the challenge of building accurate cybersecurity knowledge graphs from unstructured CTI reports, where complex language and report structures cause traditional extraction pipelines to accumulate errors. It introduces the CTiKG framework as a context-aware pipeline that pairs contextual embeddings from SecureBERT+ with expert rules drawn from a domain ontology. This hybrid approach is meant to cut misclassifications at each stage and limit the cascading mistakes that degrade downstream accuracy. A sympathetic reader would care because more reliable entity-relation triples would yield higher-quality, queryable knowledge graphs that support faster and more automated security decisions.

Core claim

The CTiKG framework accurately extracts and classifies threat entities and their relationships from CTI reports by incorporating hybrid NLP models that leverage SecureBERT+ contextual embeddings and expert knowledge from a domain ontology to reduce misclassifications and mitigate cascading errors.

What carries the argument

The CTiKG pipeline architecture, which chains hybrid NLP models that fuse SecureBERT+ contextual embeddings with domain-ontology rules to extract and classify entity-relation triples from CTI text.
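The page does not spell out how the ontology rules are applied, so the following is only a minimal sketch of one common pattern: rejecting candidate relations whose (head type, relation, tail type) signature the ontology does not allow. The `ALLOWED` set, the `Candidate` type, the `min_score` threshold, and all entity names are hypothetical and are not taken from the paper.

```python
from typing import NamedTuple

# Hypothetical ontology fragment: permitted (head type, relation, tail type) patterns.
# The names borrow STIX 2.1 vocabulary but are illustrative, not the paper's schema.
ALLOWED = {
    ("threat-actor", "uses", "malware"),
    ("threat-actor", "targets", "identity"),
    ("malware", "exploits", "vulnerability"),
}


class Candidate(NamedTuple):
    head: str
    head_type: str
    relation: str
    tail: str
    tail_type: str
    score: float  # model confidence in [0, 1]


def filter_by_ontology(candidates, allowed=ALLOWED, min_score=0.5):
    """Keep candidates that match an allowed type signature and pass the score threshold."""
    return [
        c for c in candidates
        if (c.head_type, c.relation, c.tail_type) in allowed and c.score >= min_score
    ]


candidates = [
    Candidate("ActorA", "threat-actor", "uses", "MalwareX", "malware", 0.91),
    # Wrong direction and types: rejected by the ontology despite a high score.
    Candidate("VulnY", "vulnerability", "uses", "ActorA", "threat-actor", 0.77),
    Candidate("MalwareX", "malware", "exploits", "VulnY", "vulnerability", 0.64),
    # Valid signature but low confidence: rejected by the threshold.
    Candidate("ActorA", "threat-actor", "targets", "OrgB", "identity", 0.30),
]
kept = filter_by_ontology(candidates)
print([(c.head, c.relation, c.tail) for c in kept])
```

A hard filter like this can only remove predictions; whether the paper's hybrid models instead feed ontology knowledge into training or scoring is not stated here.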

If this is right

Higher NER and RE accuracy produces cleaner triples for constructing queryable cybersecurity knowledge graphs.
Lower error propagation across the pipeline raises end-to-end reliability for real-time threat analysis.
Validation on DNRTI and STUCCO benchmarks indicates the approach generalizes beyond the main test set.
Public release of the DNRTI-AUG-STIX2 dataset supports direct replication and extension by others.
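To make "queryable" concrete, here is a small sketch of how extracted triples could be loaded into an in-memory graph and queried. The triples, entity names, and helper functions are placeholders invented for illustration; a production knowledge graph would more likely sit in a graph database.

```python
from collections import defaultdict


def build_graph(triples):
    """Index (head, relation, tail) triples by head entity."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def neighbors(graph, node, relation=None):
    """Return tails linked from node, optionally restricted to one relation type."""
    return [t for r, t in graph.get(node, []) if relation is None or r == relation]


def reachable(graph, start):
    """Return every entity reachable from start by following edges forward."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, tail in graph.get(node, []):
            if tail not in seen:
                seen.add(tail)
                stack.append(tail)
    return seen


# Placeholder triples, not drawn from any real report.
triples = [
    ("ActorA", "uses", "MalwareX"),
    ("MalwareX", "exploits", "VulnY"),
    ("ActorA", "targets", "OrgB"),
]
graph = build_graph(triples)
print(neighbors(graph, "ActorA", "uses"))
print(sorted(reachable(graph, "ActorA")))
```

A single wrong triple adds a wrong edge that every multi-hop query can traverse, which is why triple-level accuracy matters for the downstream graph.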

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same hybrid pattern could be tested on other specialized report domains where jargon and structure create similar extraction problems.
If the ontology rules can be kept up to date, the framework might lower the ongoing manual curation burden for security teams.
Success here suggests that lightweight domain injection into pre-trained language models can outperform purely data-driven baselines in narrow technical fields.

Load-bearing premise

Integrating SecureBERT+ embeddings with domain ontology knowledge will meaningfully reduce misclassifications and stop errors from cascading through the extraction pipeline.
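A back-of-the-envelope model shows why cascading matters. It is an illustration under stated assumptions, not a result from the paper: suppose each of the two entities in a triple is recognized correctly with probability $p_{\mathrm{NER}}$, independently of the other, and the relation is classified correctly with probability $p_{\mathrm{RE}}$ given that both entities are correct. Then

$$
P(\text{triple correct}) = p_{\mathrm{NER}}^{2}\, p_{\mathrm{RE}}.
$$

With $p_{\mathrm{NER}} = p_{\mathrm{RE}} = 0.9$, this gives $0.9^{3} = 0.729$, so stages that each look strong in isolation can still yield noticeably weaker triples. Real errors are unlikely to be independent, so the figure is a rough guide only.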

What would settle it

Re-running the experiments on the DNRTI-AUG-STIX2 dataset and finding no improvement or a drop in precision, recall, or F1 for named-entity recognition and relation extraction compared with prior baselines would falsify the performance claim.
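The metrics named above can be computed as in the sketch below, which uses exact-match scoring over sets of labelled spans. The span format `(sentence_id, start, end, type)` and the example values are assumptions for illustration; the paper's own evaluation protocol (for instance micro versus macro averaging) is not described on this page.

```python
def prf1(gold, predicted):
    """Exact-match precision, recall, and F1 over two collections of items."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)  # items that appear in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Placeholder annotations: (sentence_id, start_token, end_token, entity_type).
gold = [(0, 0, 1, "threat-actor"), (0, 3, 4, "malware"),
        (1, 2, 3, "vulnerability"), (1, 5, 6, "identity")]
predicted = [(0, 0, 1, "threat-actor"), (0, 3, 4, "malware"),
             (1, 2, 3, "vulnerability")]

precision, recall, f1 = prf1(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

The same function applies to relation extraction if each item is a `(head, relation, tail)` triple instead of a span.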

Figures

Figures reproduced from arXiv: 2605.15904 by Inoussa Mouiche, Sherif Saad.

**Figure 2.** NER model architecture (SecureBERT+-BiGRU-CRF) [PITH_FULL_IMAGE:figures/full_fig_p005_2.png]
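The CRF layer in this architecture is there to enforce valid tag transitions. The sketch below illustrates the idea with Viterbi decoding under hard BIO constraints. It is a simplification: a trained CRF learns transition scores, whereas this example only forbids invalid transitions, and the tag set and emission scores are made up.

```python
import math


def allowed(prev, curr):
    """BIO rule: I-X may only follow B-X or I-X. prev=None marks the sentence start."""
    if curr.startswith("I-"):
        return prev is not None and prev[2:] == curr[2:] and prev[0] in "BI"
    return True


def viterbi(emissions, tags):
    """Return the highest-scoring tag sequence that respects the BIO constraints.

    emissions: one dict per token mapping tag -> score.
    """
    # best[t][tag] = (best score of a valid path ending in tag at step t, previous tag)
    best = [{tag: (emissions[0][tag] if allowed(None, tag) else -math.inf, None)
             for tag in tags}]
    for t in range(1, len(emissions)):
        row = {}
        for curr in tags:
            row[curr] = max(
                (best[t - 1][prev][0] + emissions[t][curr]
                 if allowed(prev, curr) else -math.inf, prev)
                for prev in tags
            )
        best.append(row)
    # Backtrack from the best final tag.
    last = max(tags, key=lambda tag: best[-1][tag][0])
    path = [last]
    for t in range(len(emissions) - 1, 0, -1):
        last = best[t][last][1]
        path.append(last)
    return path[::-1]


tags = ["O", "B-Malware", "I-Malware"]
emissions = [  # hypothetical per-token scores
    {"O": 2.0, "B-Malware": 1.5, "I-Malware": 0.1},
    {"O": 0.2, "B-Malware": 0.3, "I-Malware": 1.5},
    {"O": 1.0, "B-Malware": 0.1, "I-Malware": 0.2},
]
greedy = [max(e, key=e.get) for e in emissions]  # per-token argmax, ignores constraints
decoded = viterbi(emissions, tags)
print(greedy, decoded)
```

Here the per-token argmax yields `O, I-Malware, O`, which is not a valid BIO sequence, while constrained decoding returns `B-Malware, I-Malware, O`.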

**Figure 3.** SecureBERT+-BiGRU-TDD relation extraction model.
- Input to SecureBERT+ embedding: $e_t = \mathrm{SecureBERT+}(x_t),\ \forall t \in \{1, \dots, T\}$ (9), where $x = (x_1, \dots, x_T)$ is the input token sequence and $e_t$ is the contextual embedding of token $x_t$ produced by SecureBERT+.
- SecureBERT+ embedding to BiGRU: $h_t = \mathrm{BiGRU}(e_t)$ (10), where $h_t$ is the hidden state at time step $t$ produced by the bidirectional GRU.
- BiGRU to Full…

**Figure 4.** A sample ontology schema containing entities (nodes [PITH_FULL_IMAGE:figures/full_fig_p012_4.png]

**Figure 5.** SecureBERT+-BiGRU-TDD: training and validation losses and F1 scores [PITH_FULL_IMAGE:figures/full_fig_p013_5.png]

Original abstract

Cybersecurity Knowledge Graphs (CKGs) unify diverse Cyber Threat Intelligence (CTI) sources into structured, queryable formats, offering scalable solutions for automating proactive and real-time security responses. Their increasing adoption has significantly enhanced the workflow and decision-making efficiency of security professionals. However, constructing CKGs requires extracting entity-relation triples from unstructured CTI reports, a task hindered by complex report structure, domain-specific language, and semantic ambiguity. As a result, existing pipeline-based approaches often suffer from error propagation, reducing extraction accuracy and limiting generalizability. This paper introduces the Context-aware Threat Intelligence Knowledge Graph (CTiKG) framework, a pipeline architecture designed to accurately extract and classify threat entities and their relationships from CTI reports. CTiKG incorporates hybrid NLP models that leverage SecureBERT+ contextual embeddings and expert knowledge from a domain ontology to reduce misclassifications and mitigate cascading errors. Experiments on the DNRTI-AUG-STIX2 dataset, which comprises 21 entity types aligned with STIX 2.1, demonstrate significant improvements over state-of-the-art baselines, yielding 3-4% gains in NER and up to 8% in RE performance, based on precision, recall, and F1-score. Additional validation on DNRTI and STUCCO benchmarks confirms the framework's robustness and practical applicability. All datasets, including the curated DNRTI-AUG-STIX2, are released on GitHub to foster reproducibility and further research.
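Since the dataset's entity types are aligned with STIX 2.1, an extracted triple maps naturally onto a STIX relationship object. The sketch below builds one with the Python standard library only. The helper name and the example identifiers are hypothetical, and nothing here is taken from the paper's released code; the field names follow the STIX 2.1 relationship object.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_relationship(source_ref, relationship_type, target_ref):
    """Build a minimal STIX 2.1 relationship object as a plain dict."""
    # STIX timestamps are RFC 3339 in UTC; millisecond precision is used here.
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return {
        "type": "relationship",
        "spec_version": "2.1",
        "id": f"relationship--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "relationship_type": relationship_type,
        "source_ref": source_ref,
        "target_ref": target_ref,
    }


# Placeholder identifiers standing in for previously created STIX objects.
actor_id = f"threat-actor--{uuid.uuid4()}"
malware_id = f"malware--{uuid.uuid4()}"
rel = stix_relationship(actor_id, "uses", malware_id)
print(json.dumps(rel, indent=2))
```

The source and target objects themselves (for example the threat actor and the malware) would need to be emitted separately for the relationship to resolve.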

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces the Context-aware Threat Intelligence Knowledge Graph (CTiKG) framework, a pipeline architecture for extracting entity-relation triples from unstructured cyber threat intelligence (CTI) reports. It employs hybrid NLP models that combine SecureBERT+ contextual embeddings with expert knowledge from a domain ontology to reduce misclassifications and mitigate error propagation. Experiments on the newly curated DNRTI-AUG-STIX2 dataset (21 entity types aligned with STIX 2.1) report 3-4% gains in named entity recognition (NER) and up to 8% in relation extraction (RE) over state-of-the-art baselines, measured by precision, recall, and F1-score; additional results are shown on the DNRTI and STUCCO benchmarks. All datasets are released on GitHub.

Significance. If the performance gains prove robust under proper statistical controls, the work could meaningfully advance automated construction of cybersecurity knowledge graphs by addressing a practical bottleneck in CTI processing. The public release of the DNRTI-AUG-STIX2 dataset and the focus on a domain-specific embedding (SecureBERT+) constitute clear strengths for reproducibility and applicability. The central empirical claim, however, rests on small absolute improvements whose reliability cannot be assessed without variance estimates or significance testing.

major comments (1)

[Abstract] Abstract and Experiments section: The claim of 'significant improvements' (3-4% NER, up to 8% RE) is load-bearing for the paper's contribution yet is presented without statistical significance tests, standard deviations across multiple random seeds, error bars, or details on baseline re-implementations and train-test split stability. On an augmented dataset, these omissions leave open the possibility that observed deltas fall within run-to-run noise.

minor comments (2)

The description of the domain ontology integration and how expert knowledge is injected into the hybrid model could be expanded with a concrete example or pseudocode to clarify the pipeline.
Table or figure captions for the benchmark results should explicitly state the number of runs and any hyperparameter search protocol used for the reported F1 scores.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the need for statistical rigor in validating the reported performance gains. We address the major comment below and will strengthen the manuscript accordingly.

Referee: [Abstract] Abstract and Experiments section: The claim of 'significant improvements' (3-4% NER, up to 8% RE) is load-bearing for the paper's contribution yet is presented without statistical significance tests, standard deviations across multiple random seeds, error bars, or details on baseline re-implementations and train-test split stability. On an augmented dataset, these omissions leave open the possibility that observed deltas fall within run-to-run noise.

Authors: We agree that the absence of variance estimates and significance testing leaves the robustness of the small absolute gains open to question. In the revised manuscript we will add: (1) results from at least five independent runs with different random seeds, reporting means and standard deviations for all metrics; (2) error bars on all bar charts; (3) paired statistical significance tests (e.g., t-test or Wilcoxon signed-rank) with p-values comparing CTiKG to each baseline; and (4) explicit documentation of baseline re-implementations together with the exact train-test split procedure used for DNRTI-AUG-STIX2. These additions will show whether the observed 3-4% NER and up to 8% RE improvements exceed run-to-run variability. Revision: yes.
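The checks proposed in this exchange can be sketched as follows. The F1 values are synthetic placeholders chosen for illustration and are not results from the paper. The code reports the mean and standard deviation across seeds and a paired t-statistic; obtaining a p-value would need a t-distribution CDF, which the standard library does not provide, so the statistic is compared with the two-sided 5% critical value for 4 degrees of freedom (2.776).

```python
import math
import statistics


def summarize(scores):
    """Mean and sample standard deviation across runs."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_t_statistic(a, b):
    """t-statistic for paired samples a and b (same seeds, same splits)."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err


# Synthetic F1 scores over five seeds; NOT taken from the paper.
system_f1 = [0.86, 0.87, 0.85, 0.88, 0.86]
baseline_f1 = [0.83, 0.84, 0.83, 0.84, 0.82]

sys_mean, sys_sd = summarize(system_f1)
base_mean, base_sd = summarize(baseline_f1)
t_stat = paired_t_statistic(system_f1, baseline_f1)

T_CRIT_DF4 = 2.776  # two-sided critical value at alpha = 0.05 for n - 1 = 4
print(f"system   {sys_mean:.3f} +/- {sys_sd:.3f}")
print(f"baseline {base_mean:.3f} +/- {base_sd:.3f}")
print(f"paired t = {t_stat:.2f}, exceeds critical value: {abs(t_stat) > T_CRIT_DF4}")
```

With only five runs a t-test leans heavily on the normality assumption, which is one reason the rebuttal also mentions the Wilcoxon signed-rank test.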

Circularity Check

0 steps flagged

No significant circularity: empirical application study without derivation chain

The paper describes a pipeline framework (CTiKG) for NER and RE on CTI reports, using SecureBERT+ embeddings plus a domain ontology. No equations, first-principles derivations, or predictions are presented that reduce by construction to fitted parameters or input data. Reported 3-4% NER and 8% RE gains are empirical results on DNRTI-AUG-STIX2 and other benchmarks; they are not tautological with any model definition or self-citation. The work is self-contained as an applied ML study with no load-bearing self-citations, uniqueness theorems, or ansatz smuggling. This is the expected non-finding for an empirical application paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that domain ontology knowledge can be effectively injected into a transformer-based model without introducing new error modes, and that the chosen dataset augmentation preserves the original distribution of threat intelligence language.
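One partial check of the augmentation assumption is to compare entity-label frequencies before and after augmentation. The sketch below does this with the Jensen-Shannon divergence (base 2, so values lie between 0 and 1). The label lists are invented, and matching label frequencies says nothing about whether the wording of the augmented text still resembles real CTI reports.

```python
import math
from collections import Counter


def distribution(labels):
    """Relative frequency of each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); m[k] > 0 whenever a[k] > 0, so the ratio is well defined.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Hypothetical label lists standing in for the original and augmented datasets.
original = ["Malware"] * 50 + ["Tool"] * 30 + ["Org"] * 20
augmented = ["Malware"] * 55 + ["Tool"] * 25 + ["Org"] * 20

jsd = js_divergence(distribution(original), distribution(augmented))
print(f"JS divergence: {jsd:.4f} bits")
```

A value near 0 means the label mix is almost unchanged; what counts as acceptably small is a judgment call that depends on the dataset.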

axioms (1)

Domain assumption: hybrid models combining contextual embeddings and domain ontologies reduce misclassifications in entity-relation extraction pipelines. Invoked in the abstract when describing how CTiKG mitigates cascading errors.

pith-pipeline@v0.9.0 · 5791 in / 1350 out tokens · 36390 ms · 2026-05-20T19:53:06.452871+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: hybrid NLP models that leverage SecureBERT+ contextual embeddings and expert knowledge from a domain ontology

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: SecureBERT+-BiGRU-CRF model ... CRF layer to enforce valid tag transitions

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
