pith. machine review for the scientific record.

arxiv: 2604.21051 · v1 · submitted 2026-04-22 · 💻 cs.SE · cs.CR

Recognition: unknown

Residual Risk Analysis in Benign Code: How Far Are We? A Multi-Model Semantic and Structural Similarity Approach

Authors on Pith no claims yet

Pith reviewed 2026-05-09 23:27 UTC · model grok-4.3

classification 💻 cs.SE cs.CR
keywords residual · similarity · code · risk analysis · benign · security · semantic

The pith

Patched functions often remain similar to vulnerable ones, and a new multi-model similarity scoring system identifies residual issues like null pointer dereferences in 61% of high-risk cases from the PrimeVul dataset.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses a gap in software security: patched code is usually assumed to be safe, yet it may still carry residual risks. The researchers examine pairs of vulnerable and patched functions from the PrimeVul benchmark dataset. They use multiple code language models to measure semantic similarity through embeddings, which capture the meaning of the code, and Tree-sitter to build abstract syntax trees (ASTs) and compare the code's structure.

These two signals, together with how strongly the different models agree, are combined into a Residual Risk Scoring (RRS) system that estimates how likely the patched code is to still carry security problems. The analysis shows that many patched functions remain very similar to the original vulnerable code in both meaning and structure, suggesting the patch may not have fully addressed the underlying issue.

To validate the score, the authors ran popular static analysis tools (Cppcheck, Clang-Tidy, and Facebook-Infer) on the pairs with high RRS scores. About 61 percent of these high-risk pairs showed one of 13 types of residual problems, such as null pointer dereferences or unsafe memory allocation. The study concludes that code similarity is a useful signal for prioritizing which patched code needs extra inspection, making security assessments more reliable in large software projects. The approach combines modern AI techniques with traditional code analysis to tackle an underexplored area of vulnerability management.
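Based on the description above, the RRS combination can be sketched as a weighted sum of the three signals. Everything below is illustrative: the weight values, the agreement measure (one minus the spread of per-model scores), and the toy embeddings are assumptions for demonstration, not the paper's actual formula or data.

```python
import numpy as np

def cosine(u, v):
    # semantic similarity between two embedding vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rrs(sem_sims, ast_sim, alpha=0.6, beta=0.2):
    """Hypothetical Residual Risk Scoring combination.

    sem_sims : per-model cosine similarities between embeddings of the
               vulnerable and patched function (one entry per code LM).
    ast_sim  : localized AST structural similarity in [0, 1].
    alpha, beta : illustrative weights (the paper's Figure 5 varies
                  alpha and beta; the published values are not used here).
    The remaining weight (1 - alpha - beta) goes to cross-model
    agreement, modeled as 1 minus the spread of the per-model scores.
    """
    sem = float(np.mean(sem_sims))
    agreement = 1.0 - float(np.std(sem_sims))
    return alpha * sem + beta * ast_sim + (1 - alpha - beta) * agreement

# Toy example: three models embed a vulnerable/patched pair; the patch
# barely changes the code, so similarity stays high.
rng = np.random.default_rng(0)
vuln = [rng.normal(size=8) for _ in range(3)]
patched = [v + rng.normal(scale=0.1, size=8) for v in vuln]
sem_sims = [cosine(u, w) for u, w in zip(vuln, patched)]
score = rrs(sem_sims, ast_sim=0.9)
print(round(score, 3))  # high score -> flag this pair for post-patch inspection
```

A near-identical patch scores close to 1, which is exactly the regime the paper argues deserves extra static-analysis attention.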

Core claim

Our analysis shows that benign functions often remain highly similar to their vulnerable counterparts both semantically and structurally, indicating potential persistence of residual risk. We further find that approximately 61% of high-RRS code pairs exhibit 13 distinct categories of residual issues (e.g., null pointer dereferences, unsafe memory allocation), validated using state-of-the-art static analysis tools including Cppcheck, Clang-Tidy, and Facebook-Infer.

Load-bearing premise

That high semantic and structural similarity between vulnerable and patched functions reliably signals the presence of residual security risks detectable and classifiable by static analysis tools like Cppcheck.

Figures

Figures reproduced from arXiv: 2604.21051 by Mohammad Farhad, Shuvalaxmi Dass.

Figure 1: Vulnerable and benign code (left) with corresponding AST subtrees
Figure 2: Residual risk analysis pipeline combining semantic similarity (…
Figure 3: Semantic and structural similarity analysis across vulnerable–benign func…
Figure 4: Semantic vs. structural similarity for vulnerable–benign pairs under dif…
Figure 5: Visualization of embedding similarity, localized AST similarity, and cross-model agreement across a subset of vulnerable–benign function pairs. (a) α = 0.6, β = 0.2 (b) α = 0.2, β = 0.6
Figure 7: Dummy helpers used for static analysis
read the original abstract

Software security relies on effective vulnerability detection and patching, yet determining whether a patch fully eliminates risk remains an underexplored challenge. Existing vulnerability benchmarks often treat patched functions as inherently benign, overlooking the possibility of residual security risks. In this work, we analyze vulnerable-benign function pairs from the PrimeVul, a benchmark dataset using multiple code language models (Code LMs) to capture semantic similarity, complemented by Tree-sitter-based abstract syntax tree (AST) analysis for structural similarity. Building on these signals, we propose Residual Risk Scoring (RRS), a unified framework that integrates embedding-based semantic similarity, localized AST-based structural similarity, and cross-model agreement to estimate residual risk in code. Our analysis shows that benign functions often remain highly similar to their vulnerable counterparts both semantically and structurally, indicating potential persistence of residual risk. We further find that approximately $61\%$ of high-RRS code pairs exhibit $13$ distinct categories of residual issues (e.g., null pointer dereferences, unsafe memory allocation), validated using state-of-the-art static analysis tools including Cppcheck, Clang-Tidy, and Facebook-Infer. These results demonstrate that code-level similarity provides a practical signal for prioritizing post-patch inspection, enabling more reliable and scalable security assessment in real-world open-source software pipelines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes residual security risks in patched functions by measuring semantic similarity with multiple code language models and structural similarity with AST parsing on vulnerable-benign pairs from PrimeVul. It introduces Residual Risk Scoring (RRS) to combine these signals and reports that benign functions often remain similar to their vulnerable counterparts. It further finds that 61% of high-RRS pairs exhibit issues spanning 13 categories, such as null pointer dereferences, detected by Cppcheck, Clang-Tidy, and Infer, and proposes similarity as a signal for prioritizing inspections.

Significance. If the central result is confirmed with appropriate controls, this work could have significant practical impact by providing an automated, scalable method to detect potential incomplete patches in open-source code using readily available tools. The multi-model approach and validation with static analyzers add credibility to the similarity-based risk estimation.

major comments (2)
  1. [Abstract] The claim that approximately 61% of high-RRS code pairs exhibit 13 distinct categories of residual issues lacks support from a baseline analysis. The static analysis tools were run only on high-RRS pairs, without reporting the rate of findings in low-RRS patched functions or a control sample. This omission means the 61% cannot be distinguished from the background rate at which these tools flag issues in typical C/C++ code, weakening the link between high RRS and residual risk.
  2. [Method] The definition of the Residual Risk Scoring (RRS) formula, including the weighting for cross-model agreement and the threshold for classifying high-RRS, is not fully specified with concrete values or sensitivity analysis. Since these are free parameters, the robustness of the reported 61% figure to different choices is unclear and requires explicit documentation for reproducibility.
minor comments (2)
  1. [Abstract] The abstract mentions 'benign code' but the analysis is on patched functions; consistent terminology would improve clarity.
  2. Consider adding a table summarizing the 13 categories of residual issues with examples for better reader understanding.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which highlight important aspects for strengthening the manuscript. We address each major comment below and are prepared to make the necessary revisions to improve clarity, reproducibility, and evidential support.

read point-by-point responses
  1. Referee: [Abstract] The claim that approximately 61% of high-RRS code pairs exhibit 13 distinct categories of residual issues lacks support from a baseline analysis. The static analysis tools were run only on high-RRS pairs, without reporting the rate of findings in low-RRS patched functions or a control sample. This omission means the 61% cannot be distinguished from the background rate at which these tools flag issues in typical C/C++ code, weakening the link between high RRS and residual risk.

    Authors: We agree that a baseline analysis is required to isolate the contribution of high RRS from the general rate at which static analyzers flag issues in C/C++ code. In the revised manuscript we will add results from applying Cppcheck, Clang-Tidy, and Infer to (i) the low-RRS patched functions in PrimeVul and (ii) a control sample of unrelated, non-patched C/C++ functions drawn from the same repositories. This will allow direct comparison of finding rates and will clarify whether the observed 61% rate in high-RRS pairs exceeds the background rate. revision: yes

  2. Referee: [Method] The definition of the Residual Risk Scoring (RRS) formula, including the weighting for cross-model agreement and the threshold for classifying high-RRS, is not fully specified with concrete values or sensitivity analysis. Since these are free parameters, the robustness of the reported 61% figure to different choices is unclear and requires explicit documentation for reproducibility.

    Authors: We will revise the Methods section to present the complete RRS formula, including the exact numerical weights assigned to semantic similarity, structural similarity, and cross-model agreement, as well as the numerical threshold used to designate high-RRS pairs. We will also add a sensitivity analysis that varies these weights and the threshold over plausible ranges and reports the resulting variation in the 61% statistic, thereby demonstrating robustness and supporting reproducibility. revision: yes
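The sensitivity analysis promised above could take the following shape: sweep the weights α and β (with γ = 1 − α − β assigned to cross-model agreement) and the high-RRS threshold, then recompute the issue rate among flagged pairs for each setting. The pair data and the weight/threshold grids here are invented for illustration; they are not the paper's numbers.

```python
def rrs(sem, ast, agree, alpha, beta):
    # hypothetical weighted combination; gamma = 1 - alpha - beta
    return alpha * sem + beta * ast + (1 - alpha - beta) * agree

# Toy pairs: (semantic sim, AST sim, cross-model agreement, has_residual_issue)
pairs = [
    (0.97, 0.95, 0.90, True),
    (0.95, 0.90, 0.85, True),
    (0.60, 0.40, 0.70, False),
    (0.92, 0.88, 0.80, False),
    (0.30, 0.20, 0.60, False),
]

# Sweep weights and threshold; report how the issue rate in the
# "high-RRS" subset moves as the free parameters change.
for alpha, beta in [(0.6, 0.2), (0.4, 0.4), (0.2, 0.6)]:
    for thresh in (0.7, 0.8, 0.9):
        high = [p for p in pairs if rrs(p[0], p[1], p[2], alpha, beta) >= thresh]
        rate = sum(p[3] for p in high) / len(high) if high else float("nan")
        print(f"alpha={alpha} beta={beta} thresh={thresh}: "
              f"{len(high)} high-RRS pairs, issue rate {rate:.2f}")
```

If the headline statistic is robust, the issue rate should stay roughly stable across plausible weightings rather than spike only at the published setting.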

Circularity Check

0 steps flagged

No significant circularity; RRS defined from independent metrics and validated externally

full rationale

The paper constructs RRS from semantic embeddings of pre-trained code LMs and Tree-sitter AST structural similarity, then applies independent static-analysis tools (Cppcheck, Clang-Tidy, Infer) to count issues in the high-RRS subset. No parameters are fitted to the 61% outcome, no self-citation chain supports the central claim, and the derivation does not reduce any result to its own inputs by construction. The central claim is checked against external benchmarks and tools rather than against the score's own inputs.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entity

The claim rests on the domain assumption that code similarity indicates residual risk and introduces RRS as a new scoring construct with likely tunable parameters for combining signals; no machine-checked proofs or external benchmarks are mentioned.

free parameters (2)
  • RRS high-risk threshold
    Used to select the subset of pairs for the 61% residual issue validation; chosen or fitted to produce the reported statistic.
  • Cross-model agreement weighting
    Parameter controlling how model agreement contributes to the overall RRS score.
axioms (2)
  • domain assumption Embedding similarity from code LMs captures security-relevant semantic properties of vulnerabilities
    Invoked when using multiple Code LMs to measure semantic similarity as a risk signal.
  • domain assumption Localized AST structural similarity correlates with residual code flaws
    Basis for the Tree-sitter component of RRS.
invented entities (1)
  • Residual Risk Scoring (RRS) no independent evidence
    purpose: Unified score integrating semantic embeddings, AST structure, and model agreement to estimate residual risk
    Newly defined framework whose validity is not independently verified outside this analysis.

pith-pipeline@v0.9.0 · 5536 in / 1717 out tokens · 61771 ms · 2026-05-09T23:27:34.830801+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

42 extracted references · 17 canonical work pages · 2 internal anchors

  1. Meta. https://www.meta.com/ (2004), accessed: 02-12-2026
  2. Cppcheck: A tool for static C/C++ code analysis. http://cppcheck.sourceforge.net/ (2007), accessed: 02-13-2026
  3. Clang-Tidy: LLVM/Clang-based static analyzer. https://clang.llvm.org/extra/clang-tidy/ (2013), accessed: 02-10-2026
  4. Infer: A static analyzer for Java, C, C++, and Objective-C. https://fbinfer.com/ (2015), accessed: 02-12-2026
  5. Bilge, L., Dumitraş, T.: Before we knew it: an empirical study of zero-day attacks in the real world. In: Proceedings of the 2012 ACM Conference on Computer and Communications Security. pp. 833–844. CCS '12, Association for Computing Machinery, New York, NY, USA (2012). https://doi.org/10.1145/2382196.2382284
  6. Böhme, M.: STADS: Software testing as species discovery. ACM Trans. Softw. Eng. Methodol. 27(2) (Jun 2018). https://doi.org/10.1145/3210309
  7. Cheng, X., Zhang, G., Wang, H., Sui, Y.: Path-sensitive code embedding via contrastive learning for software vulnerability detection. In: Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. pp. 519–531 (2022)
  8. Croft, R., Newlands, D., Chen, Z., Babar, M.A.: An empirical study of rule-based and learning-based approaches for static application security testing. In: Proceedings of the 15th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). pp. 1–12 (2021)
  9. Dijkstra, E.W.: Notes on structured programming (1970). https://api.semanticscholar.org/CorpusID:8242220
  10. Ding, Y., Fu, Y., Ibrahim, O., Sitawarin, C., Chen, X., Alomair, B., Wagner, D., Ray, B., Chen, Y.: Vulnerability detection with code language models: How far are we? In: 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE). pp. 1729–1741 (2025). https://doi.org/10.1109/ICSE55347.2025.00038
  11. Ebrahim, F., Joy, M.: Source code plagiarism detection with pre-trained model embeddings and automated machine learning. In: Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing. pp. 301–309 (2023)
  12. Falleri, J.R., Morandat, F., Blanc, X., Martinez, M., Monperrus, M.: Fine-grained and accurate source code differencing. In: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering. pp. 313–324 (2014)
  13. Farhad, M., Rahman, S., Dass, S.: HYDRA: A hybrid heuristic-guided deep representation architecture for predicting latent zero-day vulnerabilities in patched functions. arXiv preprint arXiv:2511.06220 (2025)
  14. Feng, Z., Guo, D., Tang, D., Duan, N., Feng, X., Gong, M., Shou, L., Qin, B., Liu, T., Jiang, D., Zhou, M.: CodeBERT: A pre-trained model for programming and natural languages (2020)
  15. Fu, M., Tantithamthavorn, C., Le, T., Nguyen, V., Phung, D.: VulRepair: a T5-based automated software vulnerability repair. In: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. pp. 935–947. ESEC/FSE 2022, Association for Computing Machinery, New York, NY, USA (2022)
  16. Guo, D., Lu, S., Duan, N., Wang, Y., Zhou, M., Yin, J.: UniXcoder: Unified cross-modal pre-training for code representation. arXiv preprint arXiv:2203.03850 (2022)
  17. Guo, D., Ren, S., Lu, S., Feng, Z., Tang, D., Liu, S., Zhou, L., Duan, N., Svyatkovskiy, A., Fu, S., et al.: GraphCodeBERT: Pre-training code representations with data flow. arXiv preprint arXiv:2009.08366 (2020)
  18. Han, M., Wang, L., Chang, J., Li, B., Zhang, C.: Learning graph-based patch representations for identifying and assessing silent vulnerability fixes. In: 2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE). pp. 120–131 (2024). https://doi.org/10.1109/ISSRE62328.2024.00022
  19. Hugging Face, Inc.: Hugging Face model hub. https://huggingface.co/, accessed: 11-20-2025
  20. Katz, K., Moshtari, S., Mujhid, I., Mirakhorli, M., Garcia, D.: Siexvults: Sensitive information exposure vulnerability detection system using transformer models and static analysis. In: 2025 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). pp. 230–241. IEEE (2025)
  21. Kholoosi, M.M., Le, T.H.M., Babar, M.A.: Software vulnerability management in the era of artificial intelligence: An industry perspective. arXiv preprint arXiv:2512.18261 (2025)
  22. Le, T.H.M., Babar, M.A.: Automatic data labeling for software vulnerability prediction models: How far are we? In: Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. pp. 131–142 (2024)
  23. Lee, S., Böhme, M.: Dependency-aware residual risk analysis. In: Proceedings of the 48th IEEE/ACM International Conference on Software Engineering, ICSE. vol. 26 (2026)
  24. Martinez-Gil, J.: Evaluating small-scale code models for code clone detection. arXiv preprint arXiv:2506.10995 (2025)
  25. Nguyen, A.T., Le, T.H.M., Babar, M.A.: Automated code-centric software vulnerability assessment: How far are we? An empirical study in C/C++. In: Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. pp. 72–83 (2024)
  26. Nikiema, S.L., Djire, A.E., Bonkoungou, A.A., Moumoula, M.B., Samhi, J., Kabore, A.K., Klein, J., Bissyande, T.F.: How small transformation expose the weakness of semantic similarity measures. arXiv preprint arXiv:2509.09714 (2025)
  27. Selvaraj, M., Uddin, G.: Does collaborative editing help mitigate security vulnerabilities in crowd-shared IoT code examples? In: Proceedings of the 16th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. pp. 92–102 (2022)
  28. Shi, E., Wang, Y., Du, L., Zhang, H., Han, S., Zhang, D., Sun, H.: CoCoAST: representing source code via hierarchical splitting and reconstruction of abstract syntax trees. Empirical Software Engineering 28(6), 135 (2023)
  29. Song, Y., Lothritz, C., Tang, X., Bissyandé, T., Klein, J.: Revisiting code similarity evaluation with abstract syntax tree edit distance. In: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). pp. 38–46 (2024)
  30. Sun, W., Fang, C., Miao, Y., You, Y., Yuan, M., Chen, Y., Zhang, Q., Guo, A., Chen, X., Liu, Y., et al.: Abstract syntax tree for programming language understanding and representation: How far are we? arXiv preprint arXiv:2312.00413 (2023)
  31. Tang, X., Ezzini, S., Tian, H., Song, Y., Klein, J., Bissyande, T.F., et al.: Multilevel semantic embedding of software patches: a fine-to-coarse grained approach towards security patch detection. arXiv preprint arXiv:2308.15233 (2023)
  32. Tree-sitter Contributors: Tree-sitter: A parser generator tool and incremental parsing library. https://tree-sitter.github.io/tree-sitter/ (2024), accessed: 01-21-2026
  33. Wang, K., Yan, M., Zhang, H., Hu, H.: Unified abstract syntax tree representation learning for cross-language program classification. In: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension. pp. 390–400 (2022)
  34. Wang, S., Wen, M., Chen, L., Yi, X., Mao, X.: How different is it between machine-generated and developer-provided patches? An empirical study on the correct patches generated by automated program repair techniques. In: 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). pp. 1–12. IEEE (2019)
  35. Wang, S., Wang, X., Sun, K., Jajodia, S., Wang, H., Li, Q.: GraphSPD: Graph-based security patch detection with enriched code semantics. In: 2023 IEEE Symposium on Security and Privacy (SP). pp. 2409–2426. IEEE Computer Society, Los Alamitos, CA, USA (May 2023). https://doi.org/10.1109/SP46215.2023.10179479
  36. Wang, Y., Le, H., Gotmare, A.D., Bui, N.D., Li, J., Hoi, S.C.H.: CodeT5+: Open code large language models for code understanding and generation. arXiv preprint (2023)
  37. Wang, Y., Wang, W., Joty, S., Hoi, S.C.H.: CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation (2021)
  38. Weissberg, F., Pirch, L., Imgrund, E., Möller, J., Eisenhofer, T., Rieck, K.: LLM-based vulnerability discovery through the lens of code metrics. arXiv preprint arXiv:2509.19117 (2025)
  39. Xie, Z., Wen, M., Wei, Z., Jin, H.: Unveiling the characteristics and impact of security patch evolution. In: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. pp. 1094–1106 (2024)
  40. Ye, Z., Sun, X., Cao, S., Bo, L., Li, B.: Well begun is half done: Location-aware and trace-guided iterative automated vulnerability repair. arXiv preprint arXiv:2512.20203 (2025)
  41. Yi, G., Nong, Y., Li, M., Cai, H.: Exploring and improving real-world vulnerability data generation via prompting large language models (2026)
  42. Zio, E.: Challenges in the vulnerability and risk analysis of critical infrastructures. Reliability Engineering & System Safety 152, 137–150 (2016). https://doi.org/10.1016/j.ress.2016.02.009