From IOCs to Regex: Automating CTI Operationalization for SOC with LLMs
Pith reviewed 2026-05-10 16:03 UTC · model grok-4.3
The pith
IOCRegex-gen automatically converts indicators of compromise from cyber threat reports into regular expressions using large language models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
IOCRegex-gen converts IOCs extracted from CTI reports into regexes using two components: a group-aware mechanism that decides which IOC segments become capture or non-capture groups, and an iterative reasoning and multi-stage validation pipeline that enforces syntactic validity and semantic correctness. On over 3,000 real CTI reports and 2,400 ground-truth strings from the MITRE ATT&CK Evaluation framework, the paper reports an average hit rate of 99.1% and a false-positive rate of 0.8%.
What carries the argument
IOCRegex-gen, an LLM-based pipeline that uses a group-aware mechanism to classify IOC segments into capture or non-capture groups and an iterative reasoning process with validation stages to produce usable regexes.
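To make the capture versus non-capture distinction concrete, here is a minimal hand-written sketch in Python. It is not output from IOCRegex-gen; the file path, the log line, and the group names `user` and `file` are all invented for illustration. The idea is that segments an analyst wants to extract go in capture groups, while segments that only need grouping use `(?:...)`.

```python
import re

# Hypothetical IOC (not from the paper): a dropped payload path such as
#   C:\Users\alice\AppData\Roaming\updater.exe
# The user name varies per host, so it is a wildcard segment worth extracting.
# The drive letter only needs grouping, so it is a non-capture group.
pattern = re.compile(
    r"(?:[A-Za-z]):\\Users\\"     # non-capture group: drive letter, grouping only
    r"(?P<user>[^\\]+)"           # capture group: host-specific user name
    r"\\AppData\\Roaming\\"       # fixed path segment
    r"(?P<file>updater\.exe)",    # capture group: the indicator file name
    re.IGNORECASE,
)

# Hypothetical log line with a different user and different letter case.
line = r"proc_create image=C:\Users\bob\AppData\Roaming\Updater.exe pid=4412"
match = pattern.search(line)

if match is not None:
    # Only the two capture groups are returned; the drive letter is not.
    print(match.group("user"), match.group("file"))
```

The non-capture group keeps the match result limited to the fields a SIEM rule would actually extract, which is presumably why the choice between the two group types matters for downstream use.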
If this is right
- SOC teams can process growing volumes of CTI reports into operational regex rules without manual effort.
- Regex patterns generated this way can capture variations in log formats and attacker behaviors more reliably than plain IOC strings.
- Digital forensics and SIEM rule creation become faster and less error-prone at scale.
- The same pipeline could support repeated regeneration of regexes as threats evolve.
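The second point above, that regexes tolerate log-format variation better than plain IOC strings, can be illustrated with a small Python sketch. The IOC, the three log lines, and the separator pattern `SEP` are assumptions made up for this example and do not come from the paper's dataset.

```python
import re

# Hypothetical IOC and log lines, invented for illustration only.
ioc = r"C:\Windows\Temp\svch0st.exe"

log_lines = [
    r"Image: C:\Windows\Temp\svch0st.exe",           # exact form
    r'{"image":"C:\\Windows\\Temp\\svch0st.exe"}',   # JSON-escaped backslashes
    r"exec path=c:/windows/temp/SVCH0ST.EXE",        # forward slashes, case change
]

# Path separator: one or two backslashes, or a forward slash.
SEP = r"(?:\\{1,2}|/)"
regex = re.compile(
    "c:" + SEP + "windows" + SEP + "temp" + SEP + r"svch0st\.exe",
    re.IGNORECASE,
)

# Plain substring search only finds the exact form.
plain_hits = [ioc in line for line in log_lines]
# The regex also covers the escaped and normalized variants.
regex_hits = [regex.search(line) is not None for line in log_lines]

print(plain_hits)
print(regex_hits)
```

In this toy case the plain string matches only the first line, while the regex matches all three.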
Where Pith is reading between the lines
- The approach could be extended to generate other detection artifacts such as YARA or Sigma rules directly from CTI text.
- Integration into existing security tools might allow near real-time operationalization of incoming threat reports.
- Over time, the system could reduce the need for specialized analysts on routine CTI-to-rule conversion tasks.
Load-bearing premise
The group-aware mechanism and iterative reasoning pipeline will keep producing semantically correct regexes that work across different log formats, system contexts, and changing attacker tactics without needing extra human fixes.
What would settle it
A set of new CTI reports containing IOCs in previously unseen log formats where the generated regexes either miss valid matches or trigger many false positives on real SOC log data.
Original abstract
Cyber Threat Intelligence (CTI) reports contain Indicators of Compromise (IOCs) that are critical for security operations. To operationalize these IOCs across heterogeneous logs, analysts often convert them into regular expressions (regexes) for tasks such as digital forensics, log parsing, and SIEM rule creation. However, regex construction is still largely manual, requiring analysts to extract IOCs from CTI reports and transform them into syntactically valid and semantically precise patterns. This process is slow, error-prone, and increasingly impractical as CTI volumes grow. Although recent studies have applied Large Language Models (LLMs) to IOC extraction, they typically output plain strings rather than regexes, limiting practical deployment. Plain IOCs cannot effectively capture variations in system context, log format, or attacker behavior. To address this gap, we propose IOCRegex-gen, a fully automated LLM-based regex generation system that converts IOCs into regexes. The system introduces two key innovations: (i) a group-aware mechanism that identifies which IOC segments should be represented as capture or non-capture groups, and (ii) an iterative reasoning and multi-stage validation pipeline to ensure syntactic validity and semantic correctness. Experiments on over 3,000 real CTI reports and 2,400 ground-truth strings from the MITRE ATT&CK Evaluation framework show that IOCRegex-gen achieves an average hit rate of 99.1% and a false-positive rate of only 0.8%, demonstrating its effectiveness for large-scale CTI processing and automated regex generation.
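The abstract does not spell out the validation stages, so the following Python sketch is only one plausible reading: a syntactic check (does the pattern compile), followed by semantic checks against known-positive and known-negative strings, with rejected candidates replaced by the next attempt. The functions `validate` and `first_valid`, the candidate list standing in for successive LLM outputs, and the sample strings are all hypothetical.

```python
import re
from typing import Iterable, List, Optional


def validate(pattern: str, positives: List[str], negatives: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the pattern passed."""
    try:
        compiled = re.compile(pattern)       # stage 1: syntactic validity
    except re.error as exc:
        return [f"syntax error: {exc}"]
    problems = []
    for s in positives:                      # stage 2: must match known IOC forms
        if not compiled.search(s):
            problems.append(f"missed positive: {s}")
    for s in negatives:                      # stage 3: must not match benign strings
        if compiled.search(s):
            problems.append(f"matched negative: {s}")
    return problems


def first_valid(candidates: Iterable[str],
                positives: List[str],
                negatives: List[str]) -> Optional[str]:
    """Try candidates in order; the list stands in for repeated LLM attempts."""
    for pattern in candidates:
        if not validate(pattern, positives, negatives):
            return pattern
    return None


# Hypothetical candidates, as if produced by successive generation rounds.
candidates = [
    r"evil-cdn\.example/(payload",           # unbalanced parenthesis
    r"evil-cdn.example",                     # unescaped dot, too loose
    r"evil-cdn\.example/payload\d*\.bin",    # passes all three stages
]
positives = ["GET http://evil-cdn.example/payload7.bin"]
negatives = [
    "GET http://evil-cdnXexample/index.html",
    "GET http://good.example/payload.bin",
]

accepted = first_valid(candidates, positives, negatives)
print(accepted)
```

In a real pipeline the problem list returned by `validate` would presumably be fed back into the next prompt rather than discarded; that feedback step is omitted here.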
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes IOCRegex-gen, an LLM-based system to automatically convert Indicators of Compromise (IOCs) extracted from Cyber Threat Intelligence (CTI) reports into regular expressions suitable for SOC tasks such as log parsing and SIEM rule creation. It introduces two innovations: a group-aware mechanism to decide capture versus non-capture groups for IOC segments, and an iterative reasoning plus multi-stage validation pipeline to enforce syntactic validity and semantic correctness. Evaluation is reported on more than 3,000 real CTI reports together with 2,400 ground-truth strings drawn from the MITRE ATT&CK Evaluation framework, yielding an average hit rate of 99.1% and false-positive rate of 0.8%.
Significance. If the reported metrics are shown to be robust under proper controls, the work would address a concrete operational bottleneck: the manual, error-prone conversion of raw IOC strings into regexes that tolerate log-format variation and attacker TTP evolution. Successful automation at this scale could materially improve the speed and consistency with which CTI is operationalized in security operations centers.
Major comments (3)
- Abstract: the effectiveness claim rests on a 99.1% hit rate and 0.8% FPR, yet the abstract (and, by extension, the evaluation description) supplies no baselines, no ablation of the group-aware mechanism versus the iterative pipeline, no definition of how FPR is computed against negative examples, and no error analysis or edge-case handling; without these the numbers cannot be assessed as evidence of generalization.
- Evaluation (MITRE ATT&CK strings): the 2,400 ground-truth strings are static and drawn from a curated framework; the manuscript does not report out-of-distribution tests on raw heterogeneous logs, format drift, encoding differences, or post-2023 attacker IOC variations, leaving the central generalization claim for SOC deployment unsupported.
- Method (group-aware + iterative pipeline): no quantitative evidence is provided that the two innovations are necessary or sufficient; an ablation removing each component in turn would be required to establish that the reported performance is attributable to the proposed mechanisms rather than to the base LLM.
Minor comments (2)
- Abstract: the phrase 'fully automated' is used while the pipeline description implies multiple LLM calls and validation stages; a brief clarification of what 'fully automated' means in practice would improve precision.
- The manuscript would benefit from an explicit statement of the exact prompt templates and temperature settings used for the LLM calls, as these are load-bearing for reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below, explaining our position and the changes we will make to improve the manuscript's clarity and rigor.
- Referee: Abstract: the effectiveness claim rests on a 99.1% hit rate and 0.8% FPR, yet the abstract (and, by extension, the evaluation description) supplies no baselines, no ablation of the group-aware mechanism versus the iterative pipeline, no definition of how FPR is computed against negative examples, and no error analysis or edge-case handling; without these the numbers cannot be assessed as evidence of generalization.
  Authors: We agree that the abstract and evaluation description would be strengthened by including baselines, a definition of FPR, and error analysis. In the revised manuscript we will update the abstract to reference comparisons against direct LLM prompting and basic string-to-regex heuristics. We will add an explicit definition of the false-positive rate (generated regexes tested on negative log samples without the IOC, counting unintended matches) and include a new error-analysis subsection covering edge cases such as encoded IOCs, special characters, and ambiguous segments together with how the multi-stage validation mitigates them.
  Revision: yes
- Referee: Evaluation (MITRE ATT&CK strings): the 2,400 ground-truth strings are static and drawn from a curated framework; the manuscript does not report out-of-distribution tests on raw heterogeneous logs, format drift, encoding differences, or post-2023 attacker IOC variations, leaving the central generalization claim for SOC deployment unsupported.
  Authors: The 3,000+ real CTI reports already introduce substantial heterogeneity in format, encoding, and context beyond the curated MITRE strings. Nevertheless, we acknowledge the value of more explicit out-of-distribution testing. In the revision we will add a dedicated experiment using a held-out set of post-2023 CTI reports and logs that exhibit format drift and encoding variations, reporting hit rate and FPR on this set to further substantiate generalization for SOC use.
  Revision: partial
- Referee: Method (group-aware + iterative pipeline): no quantitative evidence is provided that the two innovations are necessary or sufficient; an ablation removing each component in turn would be required to establish that the reported performance is attributable to the proposed mechanisms rather than to the base LLM.
  Authors: We recognize that quantitative ablations are required to isolate the contribution of each component. We will perform and report two ablation experiments in the revised manuscript: (1) replacing the group-aware mechanism with default capture groups for all segments, and (2) disabling the iterative reasoning and multi-stage validation pipeline in favor of single-pass generation. The resulting hit rates and false-positive rates will demonstrate the necessity of both innovations relative to the base LLM.
  Revision: yes
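The rebuttal's proposed false-positive definition (regexes tested on negative log samples without the IOC, counting unintended matches) can be sketched in Python alongside a matching hit-rate computation. The exact formulas used in the paper are not given in the text above, so the per-string definitions below are an assumption, and the patterns and sample strings are invented.

```python
import re
from typing import List


def hit_rate(patterns: List[str], ground_truth: List[str]) -> float:
    """Fraction of ground-truth strings matched by at least one pattern."""
    compiled = [re.compile(p) for p in patterns]
    hits = sum(any(c.search(s) for c in compiled) for s in ground_truth)
    return hits / len(ground_truth)


def false_positive_rate(patterns: List[str], negatives: List[str]) -> float:
    """Fraction of negative samples (no IOC present) that any pattern matches."""
    compiled = [re.compile(p) for p in patterns]
    false_hits = sum(any(c.search(s) for c in compiled) for s in negatives)
    return false_hits / len(negatives)


# Hypothetical generated regexes and evaluation strings.
patterns = [r"mimikatz(?:\.exe)?", r"198\.51\.100\.\d{1,3}"]
ground_truth = [
    "cmd /c mimikatz.exe sekurlsa",
    "connect 198.51.100.23:443",
    "run MIMIKATZ",                  # missed: patterns are case-sensitive
]
negatives = [
    "connect 192.0.2.10:443",
    "cmd /c whoami",
    "ping 198.51.100.7",             # unintended match on the IP pattern
    "notepad.exe",
]

hr = hit_rate(patterns, ground_truth)
fpr = false_positive_rate(patterns, negatives)
print(f"hit rate = {hr:.3f}, FPR = {fpr:.3f}")
```

The same two functions could be rerun on the output of each ablated configuration (all-capture groups, single-pass generation) to produce the comparison the referee asks for.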
Circularity Check
No circularity: empirical results rest on external MITRE ground truth
The paper describes an LLM-based engineering system (group-aware mechanism + iterative validation pipeline) whose central claims are validated by direct comparison to 2,400 external MITRE ATT&CK strings and 3,000 CTI reports. No equations, fitted parameters, self-citations, or internal definitions are used to derive the reported hit rate or FPR; the metrics are computed against independent ground-truth data. The derivation chain is therefore self-contained and externally falsifiable.