RAVEN: Agentic RAG for Automated Vulnerability Repair

Alexandra Dmitrienko; Varun Gadey; Zijie Liu

arxiv: 2606.22647 · v1 · pith:KZRQ6HUFnew · submitted 2026-06-21 · 💻 cs.CR · cs.LG· cs.SE

RAVEN: Agentic RAG for Automated Vulnerability Repair

Varun Gadey , Zijie Liu , Alexandra Dmitrienko This is my paper

Pith reviewed 2026-06-26 09:55 UTC · model grok-4.3

classification 💻 cs.CR cs.LGcs.SE

keywords automated vulnerability repairagentic RAGCVEcross-file dependenciesopen-source LLMssoftware securityiterative patchinggeneralization

0 comments

The pith

RAVEN's agentic RAG framework repairs 83.13% of 160 real-world CVEs across vulnerability types and languages using only open-source local LLMs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents RAVEN as an autonomous system that pairs retrieval-augmented generation with agentic control and iterative patching to fix software vulnerabilities. A multi-faceted retrieval step pulls historically similar fixes while a separate Curator Agent gathers cross-file dependencies that local code alone cannot supply. The resulting pipeline runs entirely on open-source models in a local setting and is tested on 160 CVEs that include unseen categories, two languages, and out-of-distribution cases. It reports an 83.13% overall repair rate that exceeds prior frameworks while keeping compute cost low. A reader would care because existing repair tools stay confined to narrow vulnerability classes and often require proprietary models or single-language data.

Core claim

RAVEN integrates an agentic RAG pipeline with controlled iterative repair, utilizing open-source LLMs in a fully locally deployable setting with limited GPU requirements, a multi-faceted retrieval pipeline to retrieve historically relevant vulnerability fixes, and a dedicated Curator Agent that retrieves cross-file dependencies from the target repository to fix complex vulnerabilities that cannot be addressed using local vulnerable code alone. Evaluated on 160 real-world CVE vulnerabilities across diverse vulnerability types, two programming languages, unseen CWE categories, and out-of-distribution settings, RAVEN achieves an overall repair success rate of 83.13%, outperforming all existing

What carries the argument

Curator Agent that retrieves cross-file dependencies together with the multi-faceted retrieval pipeline that surfaces historical fixes to steer patch generation.

If this is right

RAVEN generalizes to unseen CWE categories and out-of-distribution vulnerability settings.
It delivers the reported performance using only open-source LLMs without proprietary models.
Repair cost stays negligible across the evaluated cases.
The approach handles vulnerabilities in two programming languages within one framework.
Complex cases that need repository-wide context are addressed by the dedicated Curator Agent.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the retrieval step continues to locate useful prior fixes for entirely new vulnerability patterns, the framework could shorten the time from disclosure to patch in production codebases.
Local-only execution would allow deployment inside regulated or air-gapped environments where external model calls are disallowed.
The same retrieval-plus-curator structure might transfer to non-security code repair tasks such as API migration or performance tuning with only changes to the retrieval corpus.
Adding dynamic execution traces to the Curator Agent could further raise success on vulnerabilities whose patches depend on runtime behavior.

Load-bearing premise

The multi-faceted retrieval pipeline and Curator Agent can reliably surface historically relevant fixes and cross-file dependencies sufficient to guide patch generation for complex vulnerabilities beyond local code.

What would settle it

Evaluating the same 160 CVEs after removing all historical repair examples from the retrieval index would show whether success rate remains above 70% or collapses.

Figures

Figures reproduced from arXiv: 2606.22647 by Alexandra Dmitrienko, Varun Gadey, Zijie Liu.

**Figure 1.** Figure 1: CWE-862 Missing Authorization issue occurs at line 4, where moveAttachment(...) is directly invoked without any prior authorization check on the requesting user. The surrounding lines primarily manage execution flow, including setting the user context (lines 1–2), tracking progress (lines 3, 5, 7, 9), but do not perform access control checks before the sensitive operation is executed. Exploitation of this … view at source ↗

**Figure 3.** Figure 3: RAVEN’s Repair Workflow with its four phases. [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Semantic Retriever Module [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗

**Figure 5.** Figure 5: Prompt template for Selection Agent 15 [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗

**Figure 6.** Figure 6: Prompt template for context retrieval 16 [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗

**Figure 7.** Figure 7: Prompt template for Patch Generator 17 [PITH_FULL_IMAGE:figures/full_fig_p017_7.png] view at source ↗

read the original abstract

Automated vulnerability repair has emerged as a promising direction to mitigate the growing number of software vulnerabilities. Recent advances in Large Language Models (LLMs) have further accelerated research in automated repair. However, existing frameworks remain largely restricted to memory-related vulnerabilities and locally repairable vulnerability settings, leaving generalization to unseen vulnerability types underexplored. Their evaluations are often limited to a single programming language, and largely rely on proprietary models. In this paper, we propose RAVEN, a scalable, efficient and autonomous framework that integrates an agentic retrieval-augmented generation (RAG) pipeline with controlled iterative repair in a unified framework. The framework utilizes open-source LLMs in a fully locally deployable setting with limited GPU requirements, while building a multi-faceted retrieval pipeline to retrieve historically relevant vulnerability fixes and guide the patch generation. In addition, RAVEN introduces a dedicated Curator Agent that retrieves cross-file dependencies from the target repository, to fix complex vulnerabilities that cannot be addressed using local vulnerable code alone. We evaluate RAVEN on 160 real-world CVE vulnerabilities across diverse vulnerability types, two programming languages, unseen CWE categories, and out-of-distribution settings. RAVEN achieves an overall repair success rate of 83.13%, outperforming all existing state-of-the-art repair frameworks, while also demonstrating strong generalization capabilities and maintaining the repair cost negligible.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

RAVEN's abstract claims 83% success on 160 CVEs via agentic RAG plus a Curator Agent for cross-file fixes, but supplies no methods, baselines, or retrieval metrics to back it up.

read the letter

RAVEN claims an 83% success rate repairing real CVEs with a new agentic RAG framework and Curator Agent for cross-file context, all while staying local and cheap. That is the main takeaway from the abstract.

The paper builds on retrieval and LLM repair by adding the Curator to handle dependencies outside the local code and tests across more vulnerability types and languages than usual. It does a good job focusing on practical deployment with open models and low resources.

The soft spots are the missing pieces in the abstract: no baselines, no retrieval performance numbers, no error analysis, and no breakdown of the results. The high success rate and generalization claims rest on the retrieval working well for complex cases, but nothing in the summary shows that it does. This matches the stress-test concern exactly.

This paper is for researchers in automated vulnerability repair using LLMs. Readers who follow that area will find the framework idea worth checking if the full paper has solid experiments behind the numbers. It deserves peer review because the problem is real and the proposal is concrete, even though the abstract alone does not let us verify the claims.

I would not cite it based on this. I might discuss it in a reading group if the methods section holds up. Send it to referees.

Referee Report

3 major / 2 minor

Summary. The paper proposes RAVEN, an agentic RAG framework for automated vulnerability repair that combines multi-faceted retrieval of historical fixes with a dedicated Curator Agent to handle cross-file dependencies. It evaluates the system on 160 real-world CVEs across two languages and unseen CWEs, claiming an 83.13% overall repair success rate that outperforms prior frameworks while using only open-source LLMs in a locally deployable setting with negligible repair cost.

Significance. If the performance and generalization claims are substantiated with supporting retrieval metrics and failure analysis, the work would be significant for extending automated repair beyond memory-related and locally fixable vulnerabilities to more complex, cross-file cases using accessible models.

major comments (3)

[Evaluation] Evaluation section: the reported 83.13% success rate on 160 CVEs is presented without retrieval-precision metrics (e.g., precision@K or recall for the multi-faceted pipeline) or a breakdown of how many cases required the Curator Agent, which directly undermines assessment of whether the agentic-RAG components are responsible for the claimed generalization to unseen CWEs and complex vulnerabilities.
[§4] §4 (Curator Agent description): no quantitative evaluation is supplied on the Curator Agent's success rate in surfacing non-local dependencies, despite this being the load-bearing mechanism for the claim that RAVEN handles vulnerabilities "that cannot be addressed using local vulnerable code alone."
[Results] Results tables (e.g., Table 2 or equivalent): the outperformance claim over SOTA lacks per-CWE or per-language breakdowns that would allow verification of the "strong generalization capabilities" assertion, especially given the stress-test concern around cross-file retrieval reliability.

minor comments (2)

[Abstract] The abstract states "repair cost negligible" without defining the cost metric (tokens, wall-clock time, or GPU hours); this should be clarified with explicit numbers in the evaluation.
[§3] Notation for the multi-faceted retrieval pipeline (e.g., how the facets are combined) is introduced without a formal diagram or pseudocode, reducing reproducibility.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that the evaluation would be strengthened by additional metrics and breakdowns, and we will incorporate them in the revision.

read point-by-point responses

Referee: [Evaluation] Evaluation section: the reported 83.13% success rate on 160 CVEs is presented without retrieval-precision metrics (e.g., precision@K or recall for the multi-faceted pipeline) or a breakdown of how many cases required the Curator Agent, which directly undermines assessment of whether the agentic-RAG components are responsible for the claimed generalization to unseen CWEs and complex vulnerabilities.

Authors: We agree that retrieval-precision metrics and a breakdown of Curator Agent usage are absent from the current evaluation section. These additions would allow clearer attribution of results to the agentic-RAG components. In the revised manuscript we will report precision@K and recall for the multi-faceted retrieval pipeline as well as the number and percentage of cases that invoked the Curator Agent. revision: yes
Referee: [§4] §4 (Curator Agent description): no quantitative evaluation is supplied on the Curator Agent's success rate in surfacing non-local dependencies, despite this being the load-bearing mechanism for the claim that RAVEN handles vulnerabilities "that cannot be addressed using local vulnerable code alone."

Authors: We acknowledge that §4 provides no quantitative success rate for the Curator Agent on non-local dependencies. We will add such an evaluation in the revision, including accuracy figures obtained via manual inspection of the 160 CVEs to quantify how often the agent correctly surfaces cross-file dependencies. revision: yes
Referee: [Results] Results tables (e.g., Table 2 or equivalent): the outperformance claim over SOTA lacks per-CWE or per-language breakdowns that would allow verification of the "strong generalization capabilities" assertion, especially given the stress-test concern around cross-file retrieval reliability.

Authors: The current tables report only aggregate success rates. We agree that per-CWE and per-language breakdowns would better support the generalization claims. We will expand the results tables in the revised version to include these stratified results. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical framework evaluated on external CVE benchmark

full rationale

The paper describes an agentic RAG framework for vulnerability repair and reports an 83.13% success rate on 160 real-world CVEs across languages and unseen CWEs. No equations, fitted parameters, or derivation chain appear in the abstract or described claims. Performance is presented as a direct experimental outcome rather than a quantity derived from or equivalent to the framework's own inputs by construction. No self-citation load-bearing steps or ansatz smuggling are indicated. The central claim rests on retrieval and agent behavior, which the skeptic correctly flags as an assumption, but this is an empirical risk, not a circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no technical sections, equations, or implementation details are provided to identify free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5777 in / 1134 out tokens · 25092 ms · 2026-06-26T09:55:47.589100+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

78 extracted references · 15 canonical work pages

[1]

46 Vulnerability Statistics 2026: Key Trends in Discovery, Exploitation, and Risk,

“46 Vulnerability Statistics 2026: Key Trends in Discovery, Exploitation, and Risk,” 2026. [Online]. Available: https://security boulevard.com/2026/03/46-vulnerability-statistics-2026-key-trends-i n-discovery-exploitation-and-risk/

2026
[2]

Context- aware patch generation for better automated program repair,

M. Wen, J. Chen, R. Wu, D. Hao, and S.-C. Cheung, “Context- aware patch generation for better automated program repair,” in Proceedings of the 40th International Conference on Software 11 Engineering, ser. ICSE ’18. New York, NY , USA: Association for Computing Machinery, 2018, p. 1–11. [Online]. Available: https://doi.org/10.1145/3180155.3180233

work page doi:10.1145/3180155.3180233 2018
[3]

Better code search and reuse for better pro- gram repair,

Q. Xin and S. Reiss, “Better code search and reuse for better pro- gram repair,” in2019 IEEE/ACM International Workshop on Genetic Improvement (GI), 2019, pp. 10–17

2019
[4]

Sosrepair: Expressive semantic search for real-world program re- pair,

A. Afzal, M. Motwani, K. T. Stolee, Y . Brun, and C. Le Goues, “Sosrepair: Expressive semantic search for real-world program re- pair,”IEEE Transactions on Software Engineering, vol. 47, no. 10, pp. 2162–2181, 2021

2021
[5]

Automatic program repair using formal verification and expression templates,

T.-T. Nguyen, Q.-T. Ta, and W.-N. Chin, “Automatic program repair using formal verification and expression templates,” inVerification, Model Checking, and Abstract Interpretation, C. Enea and R. Piskac, Eds. Cham: Springer International Publishing, 2019, pp. 70–91

2019
[6]

Automatic patch generation learned from human-written patches,

D. Kim, J. Nam, J. Song, and S. Kim, “Automatic patch generation learned from human-written patches,” in2013 35th International Conference on Software Engineering (ICSE), 2013, pp. 802–811

2013
[7]

You cannot fix what you cannot find! an investiga- tion of fault localization bias in benchmarking automated program repair systems,

K. Liu, A. Koyuncu, T. F. Bissyand ´e, D. Kim, J. Klein, and Y . Le Traon, “You cannot fix what you cannot find! an investiga- tion of fault localization bias in benchmarking automated program repair systems,” in2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), 2019, pp. 102–113

2019
[8]

Avatar: Fixing semantic bugs with fix patterns of static analysis violations,

K. Liu, A. Koyuncu, D. Kim, and T. F. Bissyand ´e, “Avatar: Fixing semantic bugs with fix patterns of static analysis violations,”2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 1–12, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:201043038

2019
[9]

Sequencer: Sequence-to-sequence learning for end-to-end program repair,

Z. Chen, S. Kommrusch, M. Tufano, L.-N. Pouchet, D. Poshyvanyk, and M. Monperrus, “Sequencer: Sequence-to-sequence learning for end-to-end program repair,”IEEE Transactions on Software Engi- neering, vol. 47, no. 9, pp. 1943–1959, 2021

1943
[10]

Sorting and transforming program repair ingredients via deep learning code similarities,

M. White, M. Tufano, M. Mart ´ınez, M. Monperrus, and D. Poshy- vanyk, “Sorting and transforming program repair ingredients via deep learning code similarities,” in2019 IEEE 26th International Confer- ence on Software Analysis, Evolution and Reengineering (SANER), 2019, pp. 479–490

2019
[11]

Coconut: combining context-aware neural translation models using ensemble for program repair,

T. Lutellier, H. V . Pham, L. Pang, Y . Li, M. Wei, and L. Tan, “Coconut: combining context-aware neural translation models using ensemble for program repair,” ser. ISSTA 2020. New York, NY , USA: Association for Computing Machinery, 2020, p. 101–114. [Online]. Available: https://doi.org/10.1145/3395363.3397369

work page doi:10.1145/3395363.3397369 2020
[12]

Neural transfer learning for repairing security vulnerabilities in c code,

Z. Chen, S. Kommrusch, and M. Monperrus, “Neural transfer learning for repairing security vulnerabilities in c code,”IEEE Transactions on Software Engineering, vol. 49, no. 1, pp. 147–165, 2023

2023
[13]

Seqtrans: Automatic vulnerability fix via sequence to sequence learning,

J. Chi, Y . Qu, T. Liu, Q. Zheng, and H. Yin, “Seqtrans: Automatic vulnerability fix via sequence to sequence learning,”IEEE Transac- tions on Software Engineering, vol. 49, no. 2, pp. 564–585, 2023

2023
[14]

Vulrepair: a t5-based automated software vulnerability repair,

M. Fu, C. Tantithamthavorn, T. Le, V . Nguyen, and D. Phung, “Vulrepair: a t5-based automated software vulnerability repair,” ser. ESEC/FSE 2022. New York, NY , USA: Association for Computing Machinery, 2022, p. 935–947. [Online]. Available: https://doi.org/10.1145/3540250.3549098

work page doi:10.1145/3540250.3549098 2022
[15]

Out of sight, out of mind: Better automatic vulnerability repair by broadening input ranges and sources,

X. Zhou, K. Kim, B. Xu, D. Han, and D. Lo, “Out of sight, out of mind: Better automatic vulnerability repair by broadening input ranges and sources,” inProceedings of the IEEE/ACM 46th International Conference on Software Engineering, ser. ICSE ’24. New York, NY , USA: Association for Computing Machinery, 2024. [Online]. Available: https://doi.org/10.1145...

work page doi:10.1145/3597503.3639222 2024
[16]

Ex- amining zero-shot vulnerability repair with large language models,

H. Pearce, B. Tan, B. Ahmad, R. Karri, and B. Dolan-Gavitt, “Ex- amining zero-shot vulnerability repair with large language models,” in2023 IEEE Symposium on Security and Privacy (SP), 2023, pp. 2339–2356

2023
[17]

Rap-gen: Retrieval- augmented patch generation with codet5 for automatic program repair,

W. Wang, Y . Wang, S. Joty, and S. C. Hoi, “Rap-gen: Retrieval- augmented patch generation with codet5 for automatic program repair,” ser. ESEC/FSE 2023. New York, NY , USA: Association for Computing Machinery, 2023, p. 146–158. [Online]. Available: https://doi.org/10.1145/3611643.3616256

work page doi:10.1145/3611643.3616256 2023
[18]

Y . Kim, S. Shin, H. Kim, and J. Yoon,Logs in, patches out: auto- mated vulnerability repair via tree-of-thought LLM analysis. USA: USENIX Association, 2025

2025
[19]

Patchagent: a practical program repair agent mimicking human ex- pertise,

Z. Yu, Z. Guo, Y . Wu, J. Yu, M. Xu, D. Mu, Y . Chen, and X. Xing, “Patchagent: a practical program repair agent mimicking human ex- pertise,” inProceedings of the 34th USENIX Conference on Security Symposium, ser. SEC ’25. USA: USENIX Association, 2025

2025
[20]

Appatch: au- tomated adaptive prompting large language models for real-world software vulnerability patching,

Y . Nong, H. Yang, L. Cheng, H. Hu, and H. Cai, “Appatch: au- tomated adaptive prompting large language models for real-world software vulnerability patching,” inProceedings of the 34th USENIX Conference on Security Symposium, ser. SEC ’25. USA: USENIX Association, 2025

2025
[21]

Beyond tests: Program vulnerability repair via crash constraint extraction,

X. Gao, B. Wang, G. J. Duck, R. Ji, Y . Xiong, and A. Roychoudhury, “Beyond tests: Program vulnerability repair via crash constraint extraction,”ACM Trans. Softw. Eng. Methodol., vol. 30, no. 2, Feb
[22]

Available: https://doi.org/10.1145/3418461

[Online]. Available: https://doi.org/10.1145/3418461

work page doi:10.1145/3418461
[23]

Gemma 4,

Google DeepMind, “Gemma 4,” 2026. [Online]. Available: https: //deepmind.google/models/gemma/gemma-4/

2026
[24]

Nemotron-3-30b,

NVIDIA Corporation, “Nemotron-3-30b,” 2025. [Online]. Available: https://build.nvidia.com/nvidia/nemotron-3-nano-30b-a3b/modelcard

2025
[25]

CVE-2023-37910,

“CVE-2023-37910,” 2023. [Online]. Available: https://www.cve.org/ CVERecord?id=CVE-2023-37910

2023
[26]

XWiki Platform: An Open-Source Enterprise Wiki Framework

“XWiki Platform: An Open-Source Enterprise Wiki Framework.”
[27]

CWE-862: Missing Authorization

“CWE-862: Missing Authorization.” [Online]. Available: https: //cwe.mitre.org/data/definitions/862.html
[28]

CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software,

G. Bhandari, A. Naseer, and L. Moonen, “Cvefixes: automated collection of vulnerabilities and their fixes from open-source software,” inProceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering, ser. PROMISE 2021. New York, NY , USA: Association for Computing Machinery, 2021, p. 30–39. [Online]. Avail...

work page doi:10.1145/3475960.3475985 2021
[29]

Joern: A tool for parsing and analyzing source code,

J. Developers, “Joern: A tool for parsing and analyzing source code,” 2017. [Online]. Available: https://joern.io

2017
[30]

The probabilistic relevance frame- work: Bm25 and beyond,

S. Robertson and H. Zaragoza, “The probabilistic relevance frame- work: Bm25 and beyond,”Foundations and Trends in Information Retrieval, vol. 3, pp. 333–389, 09 2009

2009
[31]

Semgrep: Lightweight static analysis for many lan- guages,

Semgrep, Inc., “Semgrep: Lightweight static analysis for many lan- guages,” https://github.com/semgrep/semgrep
[32]

Codebleu: a method for automatic evaluation of code synthesis,

S. Ren, D. Guo, S. Lu, L. Zhou, S. Liu, D. Tang, N. Sundaresan, M. Zhou, A. Blanco, and S. Ma, “Codebleu: a method for automatic evaluation of code synthesis,” 2020. [Online]. Available: https://arxiv.org/abs/2009.10297

Pith/arXiv arXiv 2020
[33]

CWE-787: Out-of-bounds Write

“CWE-787: Out-of-bounds Write.” [Online]. Available: https: //cwe.mitre.org/data/definitions/787.html
[34]

CWE-125: Out-of-bounds Read

“CWE-125: Out-of-bounds Read.” [Online]. Available: https: //cwe.mitre.org/data/definitions/125.html
[35]

CWE-79: Improper Neutralization of Input During Web Page Generation (’Cross-site Scripting’)

“CWE-79: Improper Neutralization of Input During Web Page Generation (’Cross-site Scripting’).” [Online]. Available: https: //cwe.mitre.org/data/definitions/79.html
[36]

CWE-89: Improper Neutralization of Special Elements used in an SQL Command (’SQL Injection’)

“CWE-89: Improper Neutralization of Special Elements used in an SQL Command (’SQL Injection’).” [Online]. Available: https://cwe.mitre.org/data/definitions/89.html
[37]

CWE-94: Improper Control of Generation of Code (’Code Injection’)

“CWE-94: Improper Control of Generation of Code (’Code Injection’).” [Online]. Available: https://cwe.mitre.org/data/definitio ns/94.html
[38]

CWE-22: Improper Limitation of a Pathname to a Restricted Directory (’Path Traversal’)

“CWE-22: Improper Limitation of a Pathname to a Restricted Directory (’Path Traversal’).” [Online]. Available: https://cwe.mitre. org/data/definitions/22.html
[39]

CWE-502: Deserialization of Untrusted Data

“CWE-502: Deserialization of Untrusted Data.” [Online]. Available: https://cwe.mitre.org/data/definitions/502.html 12
[40]

CWE-352: Cross-Site Request Forgery (CSRF)

“CWE-352: Cross-Site Request Forgery (CSRF).” [Online]. Available: https://cwe.mitre.org/data/definitions/352.html
[41]

CWE-863: Incorrect Authorization

“CWE-863: Incorrect Authorization.” [Online]. Available: https: //cwe.mitre.org/data/definitions/863.html
[42]

Vul4j: A dataset of reproducible java vulnerabilities geared towards the study of program repair techniques,

Q.-C. Bui, R. Scandariato, and N. E. D. Ferreyra, “Vul4j: A dataset of reproducible java vulnerabilities geared towards the study of program repair techniques,” in2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR), 2022, pp. 464–468

2022
[43]

CWE-835: Loop with Unreachable Exit Condition (’Infinite Loop’)

“CWE-835: Loop with Unreachable Exit Condition (’Infinite Loop’).” [Online]. Available: https://cwe.mitre.org/data/definitions/835.html
[44]

CWE-828: Signal Handler with Functionality that is not Asynchronous-Safe

“ CWE-828: Signal Handler with Functionality that is not Asynchronous-Safe.” [Online]. Available: https://cwe.mitre.org/data /definitions/828.html
[45]

CWE-770: Allocation of Resources Without Limits or Throttling

“ CWE-770: Allocation of Resources Without Limits or Throttling.” [Online]. Available: https://cwe.mitre.org/data/definitions/770.html
[46]

CWE-476: NULL Pointer Dereference

“CWE-476: NULL Pointer Dereference.” [Online]. Available: https://cwe.mitre.org/data/definitions/476.html
[47]

CWE-416: Use After Free

“CWE-416: Use After Free.” [Online]. Available: https://cwe.mitre. org/data/definitions/416.html
[48]

CWE-281: Improper Preservation of Permissions

“CWE-281: Improper Preservation of Permissions.” [Online]. Available: https://cwe.mitre.org/data/definitions/281.html
[49]

CWE-96: Improper Neutralization of Directives in Statically Saved Code (’Static Code Injection’)

“CWE-96: Improper Neutralization of Directives in Statically Saved Code (’Static Code Injection’).” [Online]. Available: https: //cwe.mitre.org/data/definitions/96.html
[50]

Program vulnerability repair via inductive inference,

Y . Zhang, X. Gao, G. J. Duck, and A. Roychoudhury, “Program vulnerability repair via inductive inference,” inProceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2022. New York, NY , USA: Association for Computing Machinery, 2022, p. 691–702. [Online]. Available: https://doi.org/10.1145/3533767.3534387

work page doi:10.1145/3533767.3534387 2022
[51]

Varfix: balancing edit expressiveness and search effectiveness in automated program repair,

C.-P. Wong, P. Santiesteban, C. K ¨astner, and C. Le Goues, “Varfix: balancing edit expressiveness and search effectiveness in automated program repair,” inProceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ser. ESEC/FSE 2021. New York, NY , USA: Association for C...

work page doi:10.1145/3468264.3468600 2021
[52]

An analysis of the search spaces for generate and validate patch generation systems,

F. Long and M. Rinard, “An analysis of the search spaces for generate and validate patch generation systems,” inProceedings of the 38th International Conference on Software Engineering, ser. ICSE ’16. New York, NY , USA: Association for Computing Machinery, 2016, p. 702–713. [Online]. Available: https://doi.org/10 .1145/2884781.2884872

arXiv 2016
[53]

A survey of symbolic execution techniques,

R. Baldoni, E. Coppa, D. C. D’elia, C. Demetrescu, and I. Finocchi, “A survey of symbolic execution techniques,” vol. 51, no. 3, May
[54]

Available: https://doi.org/10.1145/3182657

[Online]. Available: https://doi.org/10.1145/3182657

work page doi:10.1145/3182657
[55]

relifix: automated repair of software regressions,

S. H. Tan and A. Roychoudhury, “relifix: automated repair of software regressions,” inProceedings of the 37th International Conference on Software Engineering - Volume 1, ser. ICSE ’15. IEEE Press, 2015, p. 471–482

2015
[56]

Automatic inference of code transforms for patch generation,

F. Long, P. Amidon, and M. Rinard, “Automatic inference of code transforms for patch generation,” inProceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, ser. ESEC/FSE 2017. New York, NY , USA: Association for Computing Machinery, 2017, p. 727–739. [Online]. Available: https://doi.org/10.1145/3106237.3106253

work page doi:10.1145/3106237.3106253 2017
[57]

Tbar: revisiting template-based automated program repair,

K. Liu, A. Koyuncu, D. Kim, and T. F. Bissyand ´e, “Tbar: revisiting template-based automated program repair,” inProceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2019. New York, NY , USA: Association for Computing Machinery, 2019, p. 31–42. [Online]. Available: https://doi.org/10.1145/3293882.3330577

work page doi:10.1145/3293882.3330577 2019
[58]

Dlfix: context-based code transformation learning for automated program repair,

Y . Li, S. Wang, and T. N. Nguyen, “Dlfix: context-based code transformation learning for automated program repair,” in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, ser. ICSE ’20. New York, NY , USA: Association for Computing Machinery, 2020, p. 602–614. [Online]. Available: https://doi.org/10.1145/3377811.3380345

work page doi:10.1145/3377811.3380345 2020
[59]

Review4repair: Code review aided automatic program repairing,

F. Huq, M. Hasan, M. M. A. Haque, S. Mahbub, A. Iqbal, and T. Ahmed, “Review4repair: Code review aided automatic program repairing,”Inf. Softw. Technol., vol. 143, no. C, Mar. 2022. [Online]. Available: https://doi.org/10.1016/j.infsof.2021.106765

work page doi:10.1016/j.infsof.2021.106765 2022
[60]

How effective are neural networks for fixing security vulnerabilities,

Y . Wu, N. Jiang, H. V . Pham, T. Lutellier, J. Davis, L. Tan, P. Babkin, and S. Shah, “How effective are neural networks for fixing security vulnerabilities,” inProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2023. New York, NY , USA: Association for Computing Machinery, 2023, p. 1282–1294. [Online...

work page doi:10.1145/3597926.3598135 2023
[61]

Retrieval-augmented generation for knowledge- intensive nlp tasks,

P. Lewis, E. Perez, A. Piktus, F. Petroni, V . Karpukhin, N. Goyal, H. Kuttler, M. Lewis, W.-t. Yih, T. Rocktaschel, S. Riedel, and D. Kiela, “Retrieval-augmented generation for knowledge- intensive nlp tasks,” inAdvances in Neural Information Processing Systems, vol. 33, 2020, pp. 9459–9474. [Online]. Available: https://proceedings.neurips.cc/paper/2020/...

2020
[62]

Retrieval-augmented generation for large language models: A survey,

Y . Gao, Y . Xiong, X. Gao, K. Jia, J. Pan, Y . Bi, Y . Dai, J. Sun, M. Wang, and H. Wang, “Retrieval-augmented generation for large language models: A survey,”arXiv preprint arXiv:2312.10997, 2023. [Online]. Available: https://arxiv.org/abs/2312.10997 Appendix This appendix provides background on Retrieval Aug- mented Generation, prompt templates and rep...

Pith/arXiv arXiv 2023
[63]

Retrieval Augmented Generation Retrieval Augmented Generation (RAG) incorporates external knowledge at inference time by combining the internal knowledge of an LLM with external sources such as vector databases, knowledge bases, and code reposito- ries [59], [60]. This is particularly beneficial for knowledge- intensive tasks where information changes dyn...

2015
[64]

Performance of RA VEN on C based CWEs The results in table 7 further demonstrate the effective- ness of RA VEN on C-based vulnerabilities across memory- safety and access-control categories. Formemory-based vul- nerabilities(CWE-125), RA VEN successfully repairs17out of20instances, achieving a repair success rate of85.00%, which shows its ability to mitig...
[65]

Select the single candidate patch whose fix strategy is the best semantic referencefor repairing the vulnerable code

Prompt Templates 14 Selector Agent Prompt Template: You are a < language > code vulnerability patch selection agent . Select the single candidate patch whose fix strategy is the best semantic referencefor repairing the vulnerable code . You are given : - a vulnerable code snippet - a CWE description - several candidate patches Your task is to choose the c...
[66]

Prioritize the underlying vulnerability cause over file names , function names , or CVE IDs
[67]

Prefer the candidate with the most similar mitigation strategy and security checks
[68]

Focus on the security logic introduced by the patch (forexample : bounds validation , length checks , null checks , state validation , safe iteration limits )
[69]

Do not rely on superficial textual overlap alone
[70]

If the CWE description and vulnerable code appear inconsistent , prioritize the actual vulnerable code pattern and the implied root cause in the code
[71]

If a candidate appears to already match logic present in the vulnerable code , treat it cautiously ; it may indicate dataset noise or a pre - patched snippet
[72]

Assume candidate patches may come from related or unrelated files ; file identity is not a deciding factor . Return exactly : Candidate x vulnerable_code : < vulnerable_code > cwe_description : < cwe_description > given_patches : Candidate 1 CVE : < candidate_1_cve_id > Filename : < candidate_1_filename > Patch diff : < candidate_1_patch_diff > --- Candid...
[73]

{ r ef e re n c e_ g u id a n ce }

Prioritize the concrete root cause visible in the target code . { r ef e re n c e_ g u id a n ce }
[74]

Apply the smallest targeted fix that prevents the unsafe behaviorwhilepreserving existing logic
[75]

Do not refactor , rename symbols , change formatting unnecessarily , or modify unrelated behavior
[76]

Do not introduce new helper functions , macros , types , or dependencies unless clearly implied by the provided code context
[77]

Reuse existing validation style , control flow , and error handling patterns when possible
[78]

Output rules : - Return only one```{ code_fence }```code block and no other text

Ensure the result is syntactically valid and consistent with the surrounding code . Output rules : - Return only one```{ code_fence }```code block and no other text . - Inside the code block , provide only the minimal fixed code snippet ( s )forthe target file . - Do not output a unified diff unless explicitly requested . { r e f e r e n c e _ o u t p u t...

[1] [1]

46 Vulnerability Statistics 2026: Key Trends in Discovery, Exploitation, and Risk,

“46 Vulnerability Statistics 2026: Key Trends in Discovery, Exploitation, and Risk,” 2026. [Online]. Available: https://security boulevard.com/2026/03/46-vulnerability-statistics-2026-key-trends-i n-discovery-exploitation-and-risk/

2026

[2] [2]

Context- aware patch generation for better automated program repair,

M. Wen, J. Chen, R. Wu, D. Hao, and S.-C. Cheung, “Context- aware patch generation for better automated program repair,” in Proceedings of the 40th International Conference on Software 11 Engineering, ser. ICSE ’18. New York, NY , USA: Association for Computing Machinery, 2018, p. 1–11. [Online]. Available: https://doi.org/10.1145/3180155.3180233

work page doi:10.1145/3180155.3180233 2018

[3] [3]

Better code search and reuse for better pro- gram repair,

Q. Xin and S. Reiss, “Better code search and reuse for better pro- gram repair,” in2019 IEEE/ACM International Workshop on Genetic Improvement (GI), 2019, pp. 10–17

2019

[4] [4]

Sosrepair: Expressive semantic search for real-world program re- pair,

A. Afzal, M. Motwani, K. T. Stolee, Y . Brun, and C. Le Goues, “Sosrepair: Expressive semantic search for real-world program re- pair,”IEEE Transactions on Software Engineering, vol. 47, no. 10, pp. 2162–2181, 2021

2021

[5] [5]

Automatic program repair using formal verification and expression templates,

T.-T. Nguyen, Q.-T. Ta, and W.-N. Chin, “Automatic program repair using formal verification and expression templates,” inVerification, Model Checking, and Abstract Interpretation, C. Enea and R. Piskac, Eds. Cham: Springer International Publishing, 2019, pp. 70–91

2019

[6] [6]

Automatic patch generation learned from human-written patches,

D. Kim, J. Nam, J. Song, and S. Kim, “Automatic patch generation learned from human-written patches,” in2013 35th International Conference on Software Engineering (ICSE), 2013, pp. 802–811

2013

[7] [7]

You cannot fix what you cannot find! an investiga- tion of fault localization bias in benchmarking automated program repair systems,

K. Liu, A. Koyuncu, T. F. Bissyand ´e, D. Kim, J. Klein, and Y . Le Traon, “You cannot fix what you cannot find! an investiga- tion of fault localization bias in benchmarking automated program repair systems,” in2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST), 2019, pp. 102–113

2019

[8] [8]

Avatar: Fixing semantic bugs with fix patterns of static analysis violations,

K. Liu, A. Koyuncu, D. Kim, and T. F. Bissyand ´e, “Avatar: Fixing semantic bugs with fix patterns of static analysis violations,”2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 1–12, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:201043038

2019

[9] [9]

Sequencer: Sequence-to-sequence learning for end-to-end program repair,

Z. Chen, S. Kommrusch, M. Tufano, L.-N. Pouchet, D. Poshyvanyk, and M. Monperrus, “Sequencer: Sequence-to-sequence learning for end-to-end program repair,”IEEE Transactions on Software Engi- neering, vol. 47, no. 9, pp. 1943–1959, 2021

1943

[10] [10]

Sorting and transforming program repair ingredients via deep learning code similarities,

M. White, M. Tufano, M. Mart ´ınez, M. Monperrus, and D. Poshy- vanyk, “Sorting and transforming program repair ingredients via deep learning code similarities,” in2019 IEEE 26th International Confer- ence on Software Analysis, Evolution and Reengineering (SANER), 2019, pp. 479–490

2019

[11] [11]

Coconut: combining context-aware neural translation models using ensemble for program repair,

T. Lutellier, H. V . Pham, L. Pang, Y . Li, M. Wei, and L. Tan, “Coconut: combining context-aware neural translation models using ensemble for program repair,” ser. ISSTA 2020. New York, NY , USA: Association for Computing Machinery, 2020, p. 101–114. [Online]. Available: https://doi.org/10.1145/3395363.3397369

work page doi:10.1145/3395363.3397369 2020

[12] [12]

Neural transfer learning for repairing security vulnerabilities in c code,

Z. Chen, S. Kommrusch, and M. Monperrus, “Neural transfer learning for repairing security vulnerabilities in c code,”IEEE Transactions on Software Engineering, vol. 49, no. 1, pp. 147–165, 2023

2023

[13] [13]

Seqtrans: Automatic vulnerability fix via sequence to sequence learning,

J. Chi, Y . Qu, T. Liu, Q. Zheng, and H. Yin, “Seqtrans: Automatic vulnerability fix via sequence to sequence learning,”IEEE Transac- tions on Software Engineering, vol. 49, no. 2, pp. 564–585, 2023

2023

[14] [14]

Vulrepair: a t5-based automated software vulnerability repair,

M. Fu, C. Tantithamthavorn, T. Le, V . Nguyen, and D. Phung, “Vulrepair: a t5-based automated software vulnerability repair,” ser. ESEC/FSE 2022. New York, NY , USA: Association for Computing Machinery, 2022, p. 935–947. [Online]. Available: https://doi.org/10.1145/3540250.3549098

work page doi:10.1145/3540250.3549098 2022

[15] [15]

Out of sight, out of mind: Better automatic vulnerability repair by broadening input ranges and sources,

X. Zhou, K. Kim, B. Xu, D. Han, and D. Lo, “Out of sight, out of mind: Better automatic vulnerability repair by broadening input ranges and sources,” inProceedings of the IEEE/ACM 46th International Conference on Software Engineering, ser. ICSE ’24. New York, NY , USA: Association for Computing Machinery, 2024. [Online]. Available: https://doi.org/10.1145...

work page doi:10.1145/3597503.3639222 2024

[16] [16]

Ex- amining zero-shot vulnerability repair with large language models,

H. Pearce, B. Tan, B. Ahmad, R. Karri, and B. Dolan-Gavitt, “Ex- amining zero-shot vulnerability repair with large language models,” in2023 IEEE Symposium on Security and Privacy (SP), 2023, pp. 2339–2356

2023

[17] [17]

Rap-gen: Retrieval- augmented patch generation with codet5 for automatic program repair,

W. Wang, Y . Wang, S. Joty, and S. C. Hoi, “Rap-gen: Retrieval- augmented patch generation with codet5 for automatic program repair,” ser. ESEC/FSE 2023. New York, NY , USA: Association for Computing Machinery, 2023, p. 146–158. [Online]. Available: https://doi.org/10.1145/3611643.3616256

work page doi:10.1145/3611643.3616256 2023

[18] [18]

Y . Kim, S. Shin, H. Kim, and J. Yoon,Logs in, patches out: auto- mated vulnerability repair via tree-of-thought LLM analysis. USA: USENIX Association, 2025

2025

[19] [19]

Patchagent: a practical program repair agent mimicking human ex- pertise,

Z. Yu, Z. Guo, Y . Wu, J. Yu, M. Xu, D. Mu, Y . Chen, and X. Xing, “Patchagent: a practical program repair agent mimicking human ex- pertise,” inProceedings of the 34th USENIX Conference on Security Symposium, ser. SEC ’25. USA: USENIX Association, 2025

2025

[20] [20]

Appatch: au- tomated adaptive prompting large language models for real-world software vulnerability patching,

Y . Nong, H. Yang, L. Cheng, H. Hu, and H. Cai, “Appatch: au- tomated adaptive prompting large language models for real-world software vulnerability patching,” inProceedings of the 34th USENIX Conference on Security Symposium, ser. SEC ’25. USA: USENIX Association, 2025

2025

[21] [21]

Beyond tests: Program vulnerability repair via crash constraint extraction,

X. Gao, B. Wang, G. J. Duck, R. Ji, Y . Xiong, and A. Roychoudhury, “Beyond tests: Program vulnerability repair via crash constraint extraction,”ACM Trans. Softw. Eng. Methodol., vol. 30, no. 2, Feb

[22] [22]

Available: https://doi.org/10.1145/3418461

[Online]. Available: https://doi.org/10.1145/3418461

work page doi:10.1145/3418461

[23] [23]

Gemma 4,

Google DeepMind, “Gemma 4,” 2026. [Online]. Available: https: //deepmind.google/models/gemma/gemma-4/

2026

[24] [24]

Nemotron-3-30b,

NVIDIA Corporation, “Nemotron-3-30b,” 2025. [Online]. Available: https://build.nvidia.com/nvidia/nemotron-3-nano-30b-a3b/modelcard

2025

[25] [25]

CVE-2023-37910,

“CVE-2023-37910,” 2023. [Online]. Available: https://www.cve.org/ CVERecord?id=CVE-2023-37910

2023

[26] [26]

XWiki Platform: An Open-Source Enterprise Wiki Framework

“XWiki Platform: An Open-Source Enterprise Wiki Framework.”

[27] [27]

CWE-862: Missing Authorization

“CWE-862: Missing Authorization.” [Online]. Available: https: //cwe.mitre.org/data/definitions/862.html

[28] [28]

CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software,

G. Bhandari, A. Naseer, and L. Moonen, “Cvefixes: automated collection of vulnerabilities and their fixes from open-source software,” inProceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering, ser. PROMISE 2021. New York, NY , USA: Association for Computing Machinery, 2021, p. 30–39. [Online]. Avail...

work page doi:10.1145/3475960.3475985 2021

[29] [29]

Joern: A tool for parsing and analyzing source code,

J. Developers, “Joern: A tool for parsing and analyzing source code,” 2017. [Online]. Available: https://joern.io

2017

[30] [30]

The probabilistic relevance frame- work: Bm25 and beyond,

S. Robertson and H. Zaragoza, “The probabilistic relevance frame- work: Bm25 and beyond,”Foundations and Trends in Information Retrieval, vol. 3, pp. 333–389, 09 2009

2009

[31] [31]

Semgrep: Lightweight static analysis for many lan- guages,

Semgrep, Inc., “Semgrep: Lightweight static analysis for many lan- guages,” https://github.com/semgrep/semgrep

[32] [32]

Codebleu: a method for automatic evaluation of code synthesis,

S. Ren, D. Guo, S. Lu, L. Zhou, S. Liu, D. Tang, N. Sundaresan, M. Zhou, A. Blanco, and S. Ma, “Codebleu: a method for automatic evaluation of code synthesis,” 2020. [Online]. Available: https://arxiv.org/abs/2009.10297

Pith/arXiv arXiv 2020

[33] [33]

CWE-787: Out-of-bounds Write

“CWE-787: Out-of-bounds Write.” [Online]. Available: https: //cwe.mitre.org/data/definitions/787.html

[34] [34]

CWE-125: Out-of-bounds Read

“CWE-125: Out-of-bounds Read.” [Online]. Available: https: //cwe.mitre.org/data/definitions/125.html

[35] [35]

CWE-79: Improper Neutralization of Input During Web Page Generation (’Cross-site Scripting’)

“CWE-79: Improper Neutralization of Input During Web Page Generation (’Cross-site Scripting’).” [Online]. Available: https: //cwe.mitre.org/data/definitions/79.html

[36] [36]

CWE-89: Improper Neutralization of Special Elements used in an SQL Command (’SQL Injection’)

“CWE-89: Improper Neutralization of Special Elements used in an SQL Command (’SQL Injection’).” [Online]. Available: https://cwe.mitre.org/data/definitions/89.html

[37] [37]

CWE-94: Improper Control of Generation of Code (’Code Injection’)

“CWE-94: Improper Control of Generation of Code (’Code Injection’).” [Online]. Available: https://cwe.mitre.org/data/definitio ns/94.html

[38] [38]

CWE-22: Improper Limitation of a Pathname to a Restricted Directory (’Path Traversal’)

“CWE-22: Improper Limitation of a Pathname to a Restricted Directory (’Path Traversal’).” [Online]. Available: https://cwe.mitre. org/data/definitions/22.html

[39] [39]

CWE-502: Deserialization of Untrusted Data

“CWE-502: Deserialization of Untrusted Data.” [Online]. Available: https://cwe.mitre.org/data/definitions/502.html 12

[40] [40]

CWE-352: Cross-Site Request Forgery (CSRF)

“CWE-352: Cross-Site Request Forgery (CSRF).” [Online]. Available: https://cwe.mitre.org/data/definitions/352.html

[41] [41]

CWE-863: Incorrect Authorization

“CWE-863: Incorrect Authorization.” [Online]. Available: https: //cwe.mitre.org/data/definitions/863.html

[42] [42]

Vul4j: A dataset of reproducible java vulnerabilities geared towards the study of program repair techniques,

Q.-C. Bui, R. Scandariato, and N. E. D. Ferreyra, “Vul4j: A dataset of reproducible java vulnerabilities geared towards the study of program repair techniques,” in2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR), 2022, pp. 464–468

2022

[43] [43]

CWE-835: Loop with Unreachable Exit Condition (’Infinite Loop’)

“CWE-835: Loop with Unreachable Exit Condition (’Infinite Loop’).” [Online]. Available: https://cwe.mitre.org/data/definitions/835.html

[44] [44]

CWE-828: Signal Handler with Functionality that is not Asynchronous-Safe

“ CWE-828: Signal Handler with Functionality that is not Asynchronous-Safe.” [Online]. Available: https://cwe.mitre.org/data /definitions/828.html

[45] [45]

CWE-770: Allocation of Resources Without Limits or Throttling

“ CWE-770: Allocation of Resources Without Limits or Throttling.” [Online]. Available: https://cwe.mitre.org/data/definitions/770.html

[46] [46]

CWE-476: NULL Pointer Dereference

“CWE-476: NULL Pointer Dereference.” [Online]. Available: https://cwe.mitre.org/data/definitions/476.html

[47] [47]

CWE-416: Use After Free

“CWE-416: Use After Free.” [Online]. Available: https://cwe.mitre. org/data/definitions/416.html

[48] [48]

CWE-281: Improper Preservation of Permissions

“CWE-281: Improper Preservation of Permissions.” [Online]. Available: https://cwe.mitre.org/data/definitions/281.html

[49] [49]

CWE-96: Improper Neutralization of Directives in Statically Saved Code (’Static Code Injection’)

“CWE-96: Improper Neutralization of Directives in Statically Saved Code (’Static Code Injection’).” [Online]. Available: https: //cwe.mitre.org/data/definitions/96.html

[50] [50]

Program vulnerability repair via inductive inference,

Y . Zhang, X. Gao, G. J. Duck, and A. Roychoudhury, “Program vulnerability repair via inductive inference,” inProceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2022. New York, NY , USA: Association for Computing Machinery, 2022, p. 691–702. [Online]. Available: https://doi.org/10.1145/3533767.3534387

work page doi:10.1145/3533767.3534387 2022

[51] [51]

Varfix: balancing edit expressiveness and search effectiveness in automated program repair,

C.-P. Wong, P. Santiesteban, C. K ¨astner, and C. Le Goues, “Varfix: balancing edit expressiveness and search effectiveness in automated program repair,” inProceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ser. ESEC/FSE 2021. New York, NY , USA: Association for C...

work page doi:10.1145/3468264.3468600 2021

[52] [52]

An analysis of the search spaces for generate and validate patch generation systems,

F. Long and M. Rinard, “An analysis of the search spaces for generate and validate patch generation systems,” inProceedings of the 38th International Conference on Software Engineering, ser. ICSE ’16. New York, NY , USA: Association for Computing Machinery, 2016, p. 702–713. [Online]. Available: https://doi.org/10 .1145/2884781.2884872

arXiv 2016

[53] [53]

A survey of symbolic execution techniques,

R. Baldoni, E. Coppa, D. C. D’elia, C. Demetrescu, and I. Finocchi, “A survey of symbolic execution techniques,” vol. 51, no. 3, May

[54] [54]

Available: https://doi.org/10.1145/3182657

[Online]. Available: https://doi.org/10.1145/3182657

work page doi:10.1145/3182657

[55] [55]

relifix: automated repair of software regressions,

S. H. Tan and A. Roychoudhury, “relifix: automated repair of software regressions,” inProceedings of the 37th International Conference on Software Engineering - Volume 1, ser. ICSE ’15. IEEE Press, 2015, p. 471–482

2015

[56] [56]

Automatic inference of code transforms for patch generation,

F. Long, P. Amidon, and M. Rinard, “Automatic inference of code transforms for patch generation,” inProceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, ser. ESEC/FSE 2017. New York, NY , USA: Association for Computing Machinery, 2017, p. 727–739. [Online]. Available: https://doi.org/10.1145/3106237.3106253

work page doi:10.1145/3106237.3106253 2017

[57] [57]

Tbar: revisiting template-based automated program repair,

K. Liu, A. Koyuncu, D. Kim, and T. F. Bissyand ´e, “Tbar: revisiting template-based automated program repair,” inProceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2019. New York, NY , USA: Association for Computing Machinery, 2019, p. 31–42. [Online]. Available: https://doi.org/10.1145/3293882.3330577

work page doi:10.1145/3293882.3330577 2019

[58] [58]

Dlfix: context-based code transformation learning for automated program repair,

Y . Li, S. Wang, and T. N. Nguyen, “Dlfix: context-based code transformation learning for automated program repair,” in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, ser. ICSE ’20. New York, NY , USA: Association for Computing Machinery, 2020, p. 602–614. [Online]. Available: https://doi.org/10.1145/3377811.3380345

work page doi:10.1145/3377811.3380345 2020

[59] [59]

Review4repair: Code review aided automatic program repairing,

F. Huq, M. Hasan, M. M. A. Haque, S. Mahbub, A. Iqbal, and T. Ahmed, “Review4repair: Code review aided automatic program repairing,”Inf. Softw. Technol., vol. 143, no. C, Mar. 2022. [Online]. Available: https://doi.org/10.1016/j.infsof.2021.106765

work page doi:10.1016/j.infsof.2021.106765 2022

[60] [60]

How effective are neural networks for fixing security vulnerabilities,

Y . Wu, N. Jiang, H. V . Pham, T. Lutellier, J. Davis, L. Tan, P. Babkin, and S. Shah, “How effective are neural networks for fixing security vulnerabilities,” inProceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2023. New York, NY , USA: Association for Computing Machinery, 2023, p. 1282–1294. [Online...

work page doi:10.1145/3597926.3598135 2023

[61] [61]

Retrieval-augmented generation for knowledge- intensive nlp tasks,

P. Lewis, E. Perez, A. Piktus, F. Petroni, V . Karpukhin, N. Goyal, H. Kuttler, M. Lewis, W.-t. Yih, T. Rocktaschel, S. Riedel, and D. Kiela, “Retrieval-augmented generation for knowledge- intensive nlp tasks,” inAdvances in Neural Information Processing Systems, vol. 33, 2020, pp. 9459–9474. [Online]. Available: https://proceedings.neurips.cc/paper/2020/...

2020

[62] [62]

Retrieval-augmented generation for large language models: A survey,

Y . Gao, Y . Xiong, X. Gao, K. Jia, J. Pan, Y . Bi, Y . Dai, J. Sun, M. Wang, and H. Wang, “Retrieval-augmented generation for large language models: A survey,”arXiv preprint arXiv:2312.10997, 2023. [Online]. Available: https://arxiv.org/abs/2312.10997 Appendix This appendix provides background on Retrieval Aug- mented Generation, prompt templates and rep...

Pith/arXiv arXiv 2023

[63] [63]

Retrieval Augmented Generation Retrieval Augmented Generation (RAG) incorporates external knowledge at inference time by combining the internal knowledge of an LLM with external sources such as vector databases, knowledge bases, and code reposito- ries [59], [60]. This is particularly beneficial for knowledge- intensive tasks where information changes dyn...

2015

[64] [64]

Performance of RA VEN on C based CWEs The results in table 7 further demonstrate the effective- ness of RA VEN on C-based vulnerabilities across memory- safety and access-control categories. Formemory-based vul- nerabilities(CWE-125), RA VEN successfully repairs17out of20instances, achieving a repair success rate of85.00%, which shows its ability to mitig...

[65] [65]

Select the single candidate patch whose fix strategy is the best semantic referencefor repairing the vulnerable code

Prompt Templates 14 Selector Agent Prompt Template: You are a < language > code vulnerability patch selection agent . Select the single candidate patch whose fix strategy is the best semantic referencefor repairing the vulnerable code . You are given : - a vulnerable code snippet - a CWE description - several candidate patches Your task is to choose the c...

[66] [66]

Prioritize the underlying vulnerability cause over file names , function names , or CVE IDs

[67] [67]

Prefer the candidate with the most similar mitigation strategy and security checks

[68] [68]

Focus on the security logic introduced by the patch (forexample : bounds validation , length checks , null checks , state validation , safe iteration limits )

[69] [69]

Do not rely on superficial textual overlap alone

[70] [70]

If the CWE description and vulnerable code appear inconsistent , prioritize the actual vulnerable code pattern and the implied root cause in the code

[71] [71]

If a candidate appears to already match logic present in the vulnerable code , treat it cautiously ; it may indicate dataset noise or a pre - patched snippet

[72] [72]

Assume candidate patches may come from related or unrelated files ; file identity is not a deciding factor . Return exactly : Candidate x vulnerable_code : < vulnerable_code > cwe_description : < cwe_description > given_patches : Candidate 1 CVE : < candidate_1_cve_id > Filename : < candidate_1_filename > Patch diff : < candidate_1_patch_diff > --- Candid...

[73] [73]

{ r ef e re n c e_ g u id a n ce }

Prioritize the concrete root cause visible in the target code . { r ef e re n c e_ g u id a n ce }

[74] [74]

Apply the smallest targeted fix that prevents the unsafe behaviorwhilepreserving existing logic

[75] [75]

Do not refactor , rename symbols , change formatting unnecessarily , or modify unrelated behavior

[76] [76]

Do not introduce new helper functions , macros , types , or dependencies unless clearly implied by the provided code context

[77] [77]

Reuse existing validation style , control flow , and error handling patterns when possible

[78] [78]

Output rules : - Return only one```{ code_fence }```code block and no other text

Ensure the result is syntactically valid and consistent with the surrounding code . Output rules : - Return only one```{ code_fence }```code block and no other text . - Inside the code block , provide only the minimal fixed code snippet ( s )forthe target file . - Do not output a unified diff unless explicitly requested . { r e f e r e n c e _ o u t p u t...