Neuro-symbolic Static Analysis with LLM-generated Vulnerability Patterns

Changhua Luo; Jianjia Yu; Josef Sarfati Korich; Junfeng Yang; Penghui Li; Songchen Yao; Yinzhi Cao

arxiv: 2504.16057 · v5 · submitted 2025-04-22 · 💻 cs.CR

Pith reviewed 2026-05-22 17:52 UTC · model grok-4.3

classification 💻 cs.CR

keywords neuro-symbolic static analysis · LLM-generated vulnerability patterns · automated pattern refinement · trace-driven validation · software vulnerability detection · multi-language static analysis

The pith

A neuro-symbolic framework uses LLMs to generate vulnerability patterns that match expert static analysis performance after only hours of automated refinement.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MoCQ to combine large language models with symbolic static analysis for automatically producing vulnerability detection patterns. This matters because expert-written patterns require weeks of manual labor while the new method completes the task in hours and surfaces additional patterns and vulnerabilities that humans missed. A sympathetic reader would see this as a way to scale precise code scanning across languages without expanding the pool of security specialists. The approach extracts a domain-specific language for patterns and then applies an iterative loop that uses execution traces to validate and correct LLM outputs. If the loop works reliably, static analysis tools could incorporate far more detection rules than are currently feasible by hand.

Core claim

MoCQ extracts domain-specific languages for expressing vulnerability patterns and employs an iterative refinement loop with trace-driven symbolic validation that provides precise feedback for pattern correction. Evaluated on 12 vulnerability types across C/C++, Java, PHP, and JavaScript, the system reaches detection performance comparable to expert-developed patterns, requires only hours instead of weeks, uncovers 46 new patterns missed by experts, and finds 25 previously unknown vulnerabilities in real-world applications.

What carries the argument

The iterative refinement loop with trace-driven symbolic validation, which feeds execution traces back to the LLM to correct and improve generated patterns until they pass symbolic checks.
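
The shape of such a loop can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not MoCQ's implementation: regular expressions stand in for the real pattern language, `validate` stands in for trace-driven symbolic validation, and `scripted_llm` replays fixed attempts instead of calling a model. All names and sample snippets are hypothetical.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical labeled snippets standing in for the traces a validator would use.
VULNERABLE = ['system($_GET["cmd"]);', 'exec($_POST["c"]);']
SAFE = ['system("ls");', 'exec(escapeshellcmd($_GET["c"]));']


def validate(pattern: str) -> Tuple[bool, str]:
    """Stand-in validator: returns (ok, feedback) for a regex 'pattern'."""
    try:
        rx = re.compile(pattern)
    except re.error as err:
        return False, f"syntax error: {err}"
    missed = [s for s in VULNERABLE if not rx.search(s)]
    spurious = [s for s in SAFE if rx.search(s)]
    if missed:
        return False, f"missed vulnerable case: {missed[0]}"
    if spurious:
        return False, f"matched safe case: {spurious[0]}"
    return True, "ok"


def refine(propose: Callable[[Optional[str]], str], max_iters: int = 5) -> Optional[str]:
    """Ask for a candidate, validate it, and feed the feedback into the next attempt."""
    feedback: Optional[str] = None
    for _ in range(max_iters):
        candidate = propose(feedback)
        ok, feedback = validate(candidate)
        if ok:
            return candidate
    return None  # no candidate passed within the iteration budget


def scripted_llm() -> Callable[[Optional[str]], str]:
    """Toy stand-in for an LLM: replays a fixed list of attempts, ignoring feedback."""
    attempts = iter([
        r'system\((',                     # invalid syntax
        r'system\(\$_GET',                # too narrow: misses the exec case
        r'(system|exec)\(',               # too broad: matches safe code
        r'(system|exec)\(\$_(GET|POST)',  # passes both checks
    ])
    return lambda feedback: next(attempts)


result = refine(scripted_llm())
```

The three failing attempts correspond to the three kinds of feedback the sketch distinguishes: a syntax error, a missed vulnerable case, and a match on safe code. A real system would pass that feedback string back into the model prompt; here it is only threaded through to show where it would go.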

If this is right

Static analysis coverage can expand to new vulnerability classes without proportional growth in expert authoring time.
Detection rules become feasible for languages or frameworks that currently lack dedicated manual patterns.
Security teams could periodically regenerate and refresh pattern sets to track evolving codebases and threat models.
The same neuro-symbolic loop could be applied to generate patterns for other analysis goals such as performance or correctness bugs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the validation loop generalizes cleanly, similar methods might accelerate development of static checkers for domain-specific languages or new programming paradigms.
Widespread use could lower the cost of maintaining high-quality static analyzers and allow smaller teams to produce enterprise-grade security tools.
A practical next test would measure how often the loop converges within a fixed number of iterations on previously unseen vulnerability categories.
The method might reveal systematic blind spots in current manual pattern libraries by surfacing the 46 additional patterns it discovered.

Load-bearing premise

The iterative refinement loop supplies precise enough feedback from symbolic validation to fix LLM-generated patterns without systematically overlooking subtle vulnerabilities or creating false positives that would degrade real-world detection rates.

What would settle it

Run both MoCQ-generated patterns and expert patterns on the same large open-source codebase containing known ground-truth vulnerabilities and count whether MoCQ misses any that experts catch or reports substantially more false positives on clean code.
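
That comparison reduces to set arithmetic over findings. The sketch below assumes each finding can be identified by a hypothetical (file, line) pair and that ground truth is known; the data is invented purely for illustration and does not come from the paper.

```python
from typing import Dict, Set, Tuple

Finding = Tuple[str, int]  # (file path, line number); a hypothetical identifier


def compare(ground_truth: Set[Finding],
            expert: Set[Finding],
            generated: Set[Finding]) -> Dict[str, Set[Finding]]:
    """Split the findings of two pattern sets into misses and false positives."""
    return {
        # true vulnerabilities the expert patterns catch but the generated ones miss
        "missed_by_generated_only": (ground_truth & expert) - generated,
        # true vulnerabilities the generated patterns catch but the expert ones miss
        "missed_by_expert_only": (ground_truth & generated) - expert,
        # reports that are not in the ground truth
        "false_positives_generated": generated - ground_truth,
        "false_positives_expert": expert - ground_truth,
    }


# Invented example data.
truth = {("a.php", 10), ("b.php", 22), ("c.js", 7)}
expert = {("a.php", 10), ("b.php", 22), ("d.js", 3)}
generated = {("a.php", 10), ("c.js", 7), ("e.c", 99), ("d.js", 3)}

report = compare(truth, expert, generated)
```

Keeping the four sets separate, rather than collapsing them into a single score, makes both failure modes named above visible: vulnerabilities that only the expert patterns catch, and extra false positives on clean code.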

Figures

Figures reproduced from arXiv: 2504.16057 by Changhua Luo, Jianjia Yu, Josef Sarfati Korich, Junfeng Yang, Penghui Li, Songchen Yao, Yinzhi Cao.

**Figure 2.** The workflow of MoCQ.

**Figure 3.** Simplified BNF grammar for Joern, extracted by …

Original abstract

In this work, we present MoCQ, a neuro-symbolic static analysis framework that leverages large language models (LLMs) to automatically generate vulnerability detection patterns. This approach combines the precision and scalability of pattern-based static analysis with the semantic understanding and automation capabilities of LLMs. MoCQ extracts the domain-specific languages for expressing vulnerability patterns and employs an iterative refinement loop with trace-driven symbolic validation that provides precise feedback for pattern correction. We evaluated MoCQ on 12 vulnerability types across four languages (C/C++, Java, PHP, JavaScript). MoCQ achieves detection performance comparable to expert-developed patterns while requiring only hours of generation versus weeks of manual effort. Notably, MoCQ uncovered 46 new vulnerability patterns that security experts had missed and discovered 25 previously unknown vulnerabilities in real-world applications. MoCQ also outperforms prior approaches with stronger analysis capabilities and broader applicability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

MoCQ offers a practical neuro-symbolic method for generating vulnerability patterns but the supporting evidence needs more detail to fully back the performance and novelty claims.

The punchline on this paper is that MoCQ uses LLMs to generate static analysis patterns for vulnerabilities and refines them with an iterative symbolic validation loop based on traces. It reports matching expert performance across multiple languages while uncovering new patterns and real bugs that were missed before. The new element seems to be the combination of LLM pattern synthesis with trace-driven correction in a neuro-symbolic setup. This lets the system produce patterns in hours rather than the weeks experts usually need. The evaluation on 12 vulnerability types in C/C++, Java, PHP, and JavaScript shows broader applicability than some prior work.

What works here is the practical framing. Reducing manual effort in rule maintenance is a real problem in the field, and the paper gives a concrete way to tackle it with the refinement loop providing feedback for corrections.

The soft spots center on the results. Claims of comparable performance and 46 new patterns plus 25 unknown vulnerabilities are interesting, but the abstract does not include numbers for precision or recall or explain how the new findings were confirmed independently. The stress-test worry about incomplete traces missing path conditions or causing loose patterns is a fair one to check, as it could affect both the detection rates and the validity of the discoveries.

This kind of paper is for researchers and engineers working on security tools and program analysis. Someone looking for ways to scale up vulnerability detection with modern AI assistance would get something out of it. I think it deserves peer review. The idea is grounded enough and addresses a clear need, so referees can help strengthen the experimental side.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces MoCQ, a neuro-symbolic static analysis framework that uses LLMs to automatically generate vulnerability detection patterns. It extracts domain-specific languages for patterns and employs an iterative refinement loop with trace-driven symbolic validation to correct patterns. Evaluated on 12 vulnerability types across C/C++, Java, PHP, and JavaScript, the paper claims detection performance comparable to expert-developed patterns (but in hours rather than weeks), the discovery of 46 new patterns missed by experts, and 25 previously unknown vulnerabilities in real-world applications, while also outperforming prior approaches.

Significance. If the quantitative claims are substantiated, the work would be significant for the field of software security and static analysis. It offers a practical method to scale vulnerability pattern creation beyond manual expert effort, leveraging LLM semantic capabilities alongside symbolic precision to potentially improve coverage and reduce development time for detection rules.

major comments (3)

[Abstract] Abstract: The central claims of 'comparable performance' to expert patterns and the discovery of 46 new patterns plus 25 unknown vulnerabilities are stated without any quantitative metrics, precision/recall values, baselines, statistical significance, or description of independent verification procedures for the new findings.
[Method and Evaluation] Iterative refinement loop (method and evaluation sections): The trace-driven symbolic validation is presented as supplying precise corrective feedback, but the manuscript does not address how the approach handles incomplete traces (e.g., missing data-dependent branches, aliasing, or external library calls), which risks accepting overly narrow or loose patterns and directly affects both the performance comparability claim and the validity of the newly reported vulnerabilities.
[Evaluation] Real-world evaluation: The report of 25 previously unknown vulnerabilities requires explicit details on the tested applications, the exact verification process used to confirm they are true positives not caught by existing tools, and controls to rule out false positives introduced by the LLM-generated patterns.
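
The incomplete-trace concern in the second major comment can be made concrete with a toy model. The sketch below is not MoCQ's validator: a trace is represented, by assumption, as a plain list of call names, and the pattern and call names are hypothetical. It only shows how a pattern can pass every available trace and still miss a path the traces never exercised.

```python
from typing import Callable, List

Trace = List[str]  # hypothetical representation: a sequence of call names


def accepted(pattern: Callable[[Trace], bool], vulnerable_traces: List[Trace]) -> bool:
    """Accept a pattern only if it flags every available vulnerable trace."""
    return all(pattern(t) for t in vulnerable_traces)


def narrow(trace: Trace) -> bool:
    """Overly narrow pattern: the source must be immediately followed by the sink."""
    return any(a == "read_input" and b == "run_query"
               for a, b in zip(trace, trace[1:]))


# The only trace the test suite happens to produce.
available = [["read_input", "run_query"]]
# A vulnerable path through an alias that no available trace covers.
unseen = ["read_input", "copy_alias", "run_query"]

passes_validation = accepted(narrow, available)
flags_unseen_path = narrow(unseen)
```

Here `passes_validation` is true while `flags_unseen_path` is false, which is the overfitting risk the comment describes: validation against the available traces cannot by itself rule out patterns that are too narrow for paths outside them.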

minor comments (2)

[Related Work] The paper would benefit from a dedicated related-work section that positions MoCQ against recent LLM-based code analysis and vulnerability detection efforts with explicit comparisons.
[Method] Notation for the extracted domain-specific languages and the feedback signals in the refinement loop could be formalized with small examples or pseudocode for improved clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight opportunities to strengthen the presentation of quantitative results, clarify methodological limitations, and provide greater transparency in the real-world evaluation. We address each major comment below and indicate the revisions we will make to the manuscript.

Referee: [Abstract] Abstract: The central claims of 'comparable performance' to expert patterns and the discovery of 46 new patterns plus 25 unknown vulnerabilities are stated without any quantitative metrics, precision/recall values, baselines, statistical significance, or description of independent verification procedures for the new findings.

Authors: We agree that the abstract would be strengthened by including key quantitative metrics. The full paper reports per-vulnerability precision, recall, and F1 scores (Table 2), direct comparisons to expert-written patterns, and baseline results against prior neuro-symbolic and LLM-based approaches. The 25 new vulnerabilities were verified through manual expert review and cross-validation against existing tools. In the revision we will update the abstract to include average precision/recall across the 12 types, a brief statement on the expert comparison, and a note that new vulnerabilities were independently confirmed by security researchers.
revision: yes

Referee: [Method and Evaluation] Iterative refinement loop (method and evaluation sections): The trace-driven symbolic validation is presented as supplying precise corrective feedback, but the manuscript does not address how the approach handles incomplete traces (e.g., missing data-dependent branches, aliasing, or external library calls), which risks accepting overly narrow or loose patterns and directly affects both the performance comparability claim and the validity of the newly reported vulnerabilities.

Authors: Section 3.3 explains that the iterative loop collects multiple concrete execution traces from test suites and uses symbolic validation to check pattern soundness across those traces. This design reduces the impact of any single incomplete trace. We acknowledge, however, that the manuscript lacks an explicit discussion of residual risks from aliasing, data-dependent branches, and external calls. We will add a dedicated limitations paragraph in the discussion section that describes these cases, explains how the multi-trace requirement and symbolic over-approximation provide safeguards, and notes that patterns are rejected if they fail on any available trace. This addition will not alter the reported performance numbers but will qualify the claims appropriately.
revision: partial

Referee: [Evaluation] Real-world evaluation: The report of 25 previously unknown vulnerabilities requires explicit details on the tested applications, the exact verification process used to confirm they are true positives not caught by existing tools, and controls to rule out false positives introduced by the LLM-generated patterns.

Authors: Section 5.4 already lists the ten open-source applications evaluated (specific versions of projects in C/C++, Java, PHP, and JavaScript) and describes the verification workflow: running MoCQ patterns, comparing outputs against expert patterns and commercial tools, followed by manual inspection by two independent security researchers. To fully address the comment we will expand this section with a summary table that enumerates each application, the number of new vulnerabilities found per project, the step-by-step verification procedure (including false-positive controls on a held-out set of known-safe code), and confirmation that none were reported by existing tools. These additions will make the evidence for the 25 new vulnerabilities fully transparent.
revision: yes
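
For reference, the "average precision/recall across the 12 types" promised in the first response could be computed as below. This is a sketch under assumptions: the rebuttal does not say whether the average is macro (per-type scores averaged, as here) or micro (counts pooled first), and the vulnerability names and counts are invented.

```python
from typing import Dict, Tuple


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def macro_average(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, ...]:
    """Average the per-type scores, weighting every vulnerability type equally."""
    scores = [prf(*c) for c in counts.values()]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


# Invented (tp, fp, fn) counts for two hypothetical vulnerability types.
counts = {"sqli": (8, 2, 0), "xss": (3, 1, 3)}
avg_precision, avg_recall, avg_f1 = macro_average(counts)
```

A macro average treats a rare vulnerability type the same as a common one, so a single weak type lowers the headline number noticeably; a micro average would instead be dominated by the types with the most findings. Which of the two a summary reports is worth stating explicitly.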

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external symbolic validation and empirical testing.

full rationale

The paper's core chain—LLM pattern generation followed by iterative trace-driven symbolic validation and real-world evaluation—does not reduce any claimed result to its inputs by construction. Performance comparability and discovery of new patterns/vulnerabilities are measured against independent expert baselines and external codebases rather than fitted parameters or self-referential definitions. No load-bearing self-citations, uniqueness theorems, or ansatzes from prior author work are invoked to force the outcomes. The approach is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that LLMs can be steered to produce correct patterns via symbolic feedback and that the resulting patterns generalize beyond the evaluated cases.

axioms (1)

domain assumption: LLMs supplied with appropriate prompts and symbolic feedback can generate vulnerability patterns whose precision and recall match those of expert-written rules.
This premise is required for the performance-comparison claim in the abstract.

pith-pipeline@v0.9.0 · 5692 in / 1205 out tokens · 52543 ms · 2026-05-22T17:52:07.114210+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: MoCQ extracts the domain-specific languages for expressing vulnerability patterns and employs an iterative refinement loop with trace-driven symbolic validation

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: MoCQ uncovered 46 new vulnerability patterns ... discovered 25 previously unknown vulnerabilities

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills
cs.CR · 2026-05 · conditional · novelty 7.0
SKILLSCOPE detects undisclosed security behaviors in LLM skill implementations via security property graphs and taxonomy-based consistency checking, identifying confirmed inconsistencies in 9.4% of 4,556 evaluated ski...

Generating Complex Code Analyzers from Natural Language Questions
cs.SE · 2026-05 · unverdicted · novelty 7.0
Merlin generates CodeQL queries from natural language questions via RAG-based iteration and a self-test technique using assistive queries, achieving 3.8x higher task accuracy and 31% less completion time in user studi...

Less Is More: Measuring How LLM Involvement affects Chatbot Accuracy in Static Analysis
cs.SE · 2026-04 · unverdicted · novelty 6.0
A structured JSON intermediate representation for LLM-generated static analysis queries outperforms both direct generation and agentic tool use, with gains of 15-25 percentage points on large models.

BugScope: Learn to Find Bugs Like Human
cs.SE · 2025-07 · conditional · novelty 6.0
BugScope structures LLM bug detection into three human-mirroring steps and distills guidelines from examples, reaching 0.87 F1 on 33 real bugs while outperforming Claude and Cursor tools and uncovering 184 new issues ...

Neuro-Symbolic AI for Cybersecurity: State of the Art, Challenges, and Opportunities
cs.CR · 2025-09 · unverdicted · novelty 5.0
A systematic review of neuro-symbolic AI in cybersecurity finds that deeper integration and causal reasoning improve performance across intrusion detection and vulnerability tasks, while identifying barriers and a res...

Reference graph

Works this paper leans on

76 extracted references · 76 canonical work pages · cited by 5 Pith papers

[1]

, author =

2020. CodeBERT: A Pre-Trained Model for Programming and Natural Languages", author = "Feng, Zhangyin and Guo, Daya and Tang, Duyu and Duan, Nan and Feng, Xiaocheng and Gong, Ming and Shou, Linjun and Qin, Bing and Liu, Ting and Jiang, Daxin and Zhou, Ming. InFindings of the Association for Computational Linguistics: EMNLP 2020. Online

work page 2020
[2]

Joern for Prototype Pollution

2023. Joern for Prototype Pollution. https://github.com/Tobiasfro/joern/commits/ master/

work page 2023
[3]

CodeQL 2.14.2 Change Log

2024. CodeQL 2.14.2 Change Log. https://codeql.github.com/docs/codeql- overview/codeql-changelog/codeql-cli-2.14.2/#javascript-typescript

work page 2024
[4]

Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs

2025. Open-source code analysis platform for C/C++/Java/Binary/Javascript/Python/Kotlin based on code property graphs. https://github.com/joernio/joern

work page 2025
[5]

Michael Backes, Konrad Rieck, Malte Skoruppa, Ben Stock, and Fabian Yamaguchi

work page
[6]

In Pro- ceedings of the 2nd IEEE European Symposium on Security and Privacy (EuroS&P)

Efficient and flexible discovery of php application vulnerabilities. In Pro- ceedings of the 2nd IEEE European Symposium on Security and Privacy (EuroS&P) . Paris, France

work page
[7]

Roberto Baldoni, Emilio Coppa, Daniele Cono D’elia, Camil Demetrescu, and Irene Finocchi. 2018. A survey of symbolic execution techniques.ACM Computing Surveys (CSUR) 51, 3 (2018), 1–39

work page 2018
[8]

Saikat Chakraborty, Rahul Krishna, Yangruibo Ding, and Baishakhi Ray. 2021. Deep learning based vulnerability detection: Are we there yet? IEEE Transactions on Software Engineering 48, 9 (2021), 3280–3296

work page 2021
[9]

Wei Chang, Chunyang Ye, and Hui Zhou. 2024. Fine-Tuning Pre-trained Model with Optimizable Prompt Learning for Code Vulnerability Detection. In 2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE) . 108–119

work page 2024
[10]

CodeQL. 2024. CodeQL. https://codeql.github.com/

work page 2024
[11]

CodeQL. 2024. CodeQL Hardware Requirements. https://docs.github. com/en/code-security/code-scanning/creating-an-advanced-setup-for-code- scanning/recommended-hardware-resources-for-running-codeql

work page 2024
[12]

Eric Cornelissen, Mikhail Shcherbakov, and Musard Balliu. 2024. GHunter: Universal Prototype Pollution Gadgets in JavaScript Runtimes. In Proceedings of the 33th USENIX Security Symposium (Security) . Philadelphia, PA, USA

work page 2024
[13]

Patrick Cousot and Radhia Cousot. 1977. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of programming languages . 238–252

work page 1977
[14]

Johannes Dahse and Thorsten Holz. 2014. Static detection of 𝑆𝑒𝑐𝑜𝑛𝑑 − 𝑂𝑟𝑑𝑒𝑟 vulnerabilities in web applications. In 23rd USENIX Security Symposium (USENIX Security 14). 989–1003

work page 2014
[15]

Yangruibo Ding, Saikat Chakraborty, Luca Buratti, Saurabh Pujar, Alessandro Morari, Gail Kaiser, and Baishakhi Ray. 2023. CONCORD: Clone-Aware Con- trastive Learning for Source Code. In Proceedings of the 32nd ACM SIGSOFT Inter- national Symposium on Software Testing and Analysis (ISSTA 2023). Association for Computing Machinery, New York, NY, USA, 26–38....

work page doi:10.1145/3597926.3598035 2023
[16]

Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim, Chawin Sitawarin, Xinyun Chen, Basel Alomair, David Wagner, Baishakhi Ray, and Yizheng Chen. 2025. Vulnera- bility detection with code language models: How far are we?. In Proceedings of the 47th International Conference on Software Engineering (ICSE) . Ottawa, Ontario, Canada

work page 2025
[17]

IA Dudina and AA Belevantsev. 2017. Using static symbolic execution to detect buffer overflows. Programming and Computer Software 43, 5 (2017), 277–288

work page 2017
[18]

2023.Detection of Prototype Pollution Using Joern: Joern’s Detection Capability Compared to CodeQL’s

Tobias Fröberg. 2023.Detection of Prototype Pollution Using Joern: Joern’s Detection Capability Compared to CodeQL’s. Master’s thesis

work page 2023
[19]

Osman Hasan and Sofiene Tahar. 2015. Formal verification methods. In Encyclo- pedia of Information Science and Technology, Third Edition . IGI Global Scientific Publishing, 7162–7170

work page 2015
[20]

Yuchen Ji, Ting Dai, Zhichao Zhou, Yutian Tang, and Jingzhu He. 2025. Artemis: Toward Accurate Detection of Server-Side Request Forgeries through LLM- Assisted Inter-Procedural Path-Sensitive Taint Analysis. InProceedings of the 2025 Annual ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Singapore

work page 2025
[21]

Joern. 2024. Joern Hardware Requirements. https://docs.joern.io/installation/ #configuring-the-jvm-for-handling-large-codebases

work page 2024
[22]

Zifeng Kang, Song Li, and Yinzhi Cao. 2022. Probe the Proto: Measuring Client- Side Prototype Pollution Vulnerabilities of One Million Real-world Websites.. In Proceedings of the 2022 Annual Network and Distributed System Security Sympo- sium (NDSS). San Diego, CA, USA

work page 2022
[23]

Gwangmu Lee, Woochul Shim, and Byoungyoung Lee. 2021. Constraint-guided directed greybox fuzzing. In Proceedings of the 30th USENIX Security Symposium (Security). Virtual event

work page 2021
[24]

Haonan Li, Yu Hao, Yizhuo Zhai, and Zhiyun Qian. 2024. Enhancing Static Analysis for Practical Bug Detection: An LLM-Integrated Approach. Proc. ACM Program. Lang. 8, OOPSLA1 (April 2024). doi:10.1145/3649828

work page doi:10.1145/3649828 2024
[25]

Penghui Li and Wei Meng. 2021. LChecker: Detecting Loose Comparison Bugs in PHP. In Proceedings of the Web Conference (WWW) . Ljubljana, Slovenia

work page 2021
[26]

Song Li, Mingqing Kang, Jianwei Hou, and Yinzhi Cao. 2021. Detecting Node. js prototype pollution vulnerabilities via object lookup analysis. In Proceedings of the 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) . Athens, Greece

work page 2021
[27]

Song Li, Mingqing Kang, Jianwei Hou, and Yinzhi Cao. 2022. Mining node. js vulnerabilities via object dependence graph and query. In 31st USENIX Security Symposium (USENIX Security 22) . 143–160

work page 2022
[28]

Tao Li, Gang Li, Zhiwei Deng, Bryan Wang, and Yang Li. 2023. A Zero-Shot Language Agent for Computer Control with Structured Reflection. In Findings of the Association for Computational Linguistics: EMNLP 2023 . Association for Computational Linguistics, Singapore. https://aclanthology.org/2023.findings- emnlp.753/

work page 2023
[29]

Ziyang Li, Saikat Dutta, and Mayur Naik. 2025. Llm-assisted static analysis for detecting security vulnerabilities. In Proceedings of the 13th International Conference on Learning Representations (ICLR) . Singapore

work page 2025
[30]

Zhen Li, Deqing Zou, Shouhuai Xu, Hai Jin, Yawei Zhu, and Zhaoxuan Chen. 2021. Sysevr: A framework for using deep learning to detect software vulnerabilities. IEEE Transactions on Dependable and Secure Computing 19, 4 (2021), 2244–2258

work page 2021
[31]

Zhengyu Liu, Kecheng An, and Yinzhi Cao. 2024. Undefined-oriented Program- ming: Detecting and Chaining Prototype Pollution Gadgets in Node.js Template Engines for Malicious Consequences. In 2024 IEEE Symposium on Security and Privacy (SP). San Francisco, CA, USA

work page 2024
[32]

Guilong Lu, Xiaolin Ju, Xiang Chen, Wenlong Pei, and Zhilong Cai. 2024. GRACE: Empowering LLM-based software vulnerability detection with graph structure and in-context learning. Journal of Systems and Software (2024)

work page 2024
[33]

Changhua Luo, Penghui Li, and Wei Meng. 2022. TChecker: Precise Static Inter- Procedural Analysis for Detecting Taint-Style Vulnerabilities in PHP Applications. In Proceedings of the 29th ACM Conference on Computer and Communications Security (CCS). Los Angeles, CA, USA

work page 2022
[34]

Qiheng Mao, Zhenhao Li, Xing Hu, Kui Liu, Xin Xia, and Jianling Sun. 2024. Towards Explainable Vulnerability Detection with Large Language Models. https: //api.semanticscholar.org/CorpusID:270521866

work page 2024
[35]

Daniel D McCracken and Edwin D Reilly. 2003. Backus-naur form (bnf). In Encyclopedia of computer science . 129–131

work page 2003
[36]

Neves, and Miguel Correia

Ibéria Medeiros, Nuno F. Neves, and Miguel Correia. 2011. Automatic detection and correction of web application vulnerabilities using data mining to predict false positives. In Proceedings of the 21st International World Wide Web Conference (WWW). Seoul, Korea

work page 2011
[37]

MITRE Corporation. 2021. CVE-2021-44228: Apache Log4j2 Remote Code Execu- tion Vulnerability. https://nvd.nist.gov/vuln/detail/CVE-2021-44228. Accessed: 2025-06-22

work page 2021
[38]

Mohammad Mahdi Mohajer, Reem Aleithan, Nima Shiri Harzevili, Moshi Wei, Alvine Boaye Belle, Hung Viet Pham, and Song Wang. 2024. Effectiveness of ChatGPT for Static Analysis: How Far Are We?. In Proceedings of the 1st ACM In- ternational Conference on AI-Powered Software (AIware 2024). Association for Com- puting Machinery, New York, NY, USA, 151–160. doi...

work page doi:10.1145/3664646.3664777 2024
[39]

N/A. 2024. CodeQL 2.16.3 Change Log. https://codeql.github.com/docs/codeql- overview/codeql-changelog/codeql-cli-2.16.3/#javascript-typescript

work page 2024
[40]

N/A. 2025. Content management system for eCommerce apps created on Sylius platform. Built with Sylius code quality, flexibility, BDD. https://github.com/ 11 Conference’17, July 2017, Washington, DC, USA Penghui Li, Songchen Yao, Josef Sarfati Korich, Changhua Luo, Jianjia Yu, Yinzhi Cao, and Junfeng Yang BitBagCommerce/SyliusCmsPlugin

work page 2025
[41]

N/A. 2025. Create and parse HTTP Content-Type header. https://www.npmjs. com/package/content-type

work page 2025
[42]

N/A. 2025. ES2015-proxy. https://www.npmjs.com/package/es2015-proxy? activeTab=readme

work page 2025
[43]

N/A. 2025. Fork is an easy to use open source CMS using Symfony Components. https://github.com/forkcms/forkcms

work page 2025
[44]

N/A. 2025. Gracefully parse ECMAScript static imports. https://www.npmjs. com/package/parse-static-imports

work page 2025
[45]

N/A. 2025. Joern Documentation: Node-Type Steps. https://docs.joern.io/cpgql/ reference-card/

work page 2025
[46]

N/A. 2025. Native cross-platform Web Workers. Works in published npm modules. https://www.npmjs.com/package/web-worker

work page 2025
[47]

N/A. 2025. Static file serving middleware. https://www.npmjs.com/package/koa- send

work page 2025
[48]

OpenAI. 2024. GPT-4o Technical Report. https://openai.com/index/gpt-4o. Ac- cessed: 2025-04-07

work page 2024
[49]

Terence Parr. 2013. The definitive ANTLR 4 reference. (2013)

work page 2013
[50]

Marcel Pham, Van Thuan, Manh Dung Nguyen, and Abhik Roychoudhury. 2017. Directed greybox fuzzing. In Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS) . Dallas, TX

work page 2017
[51]

The Chromium Project. 2025. CodeQL Support in Chromium. https://chromium.googlesource.com/chromium/src/+/refs/tags/126.0.6436. 1/tools/codeql/README.md

work page 2025
[52]

Matthew Renze and Erhan Guven. 2024. The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models. In2024 2nd International Conference on Foundation and Large Language Models (FLLM)

work page 2024
[53]

Rebecca Russell, Louis Kim, Lei Hamilton, Tomo Lazovich, Jacob Harer, Onur Ozdemir, Paul Ellingwood, and Marc McConley. 2018. Automated vulnerability detection in source code using deep representation learning. In 2018 17th IEEE international conference on machine learning and applications (ICMLA) . IEEE, 757–762

work page 2018
[54]

Bernhard Scholz, Herbert Jordan, Pavle Subotić, and Till Westmann. 2016. On fast large-scale program analysis in datalog. In Proceedings of the 25th International Conference on Compiler Construction . 196–206

work page 2016
[55]

Apache Logging Services. 2021. Log4Shell: RCE Vulnerability in Log4j (CVE-2021- 44228). https://logging.apache.org/log4j/2.x/security.html. Accessed: 2025-06-22

work page 2021
[56]

Amazon Web Services. 2025. Amazon CodeGuru. https://aws.amazon.com/ codeguru/

work page 2025
[57]

Mikhail Shcherbakov, Musard Balliu, and Cristian-Alexandru Staicu. 2023. Silent spring: Prototype pollution leads to remote code execution in Node.js. In Proceedings of the 32nd USENIX Security Symposium (Security). Anaheim, CA, USA

[58]

Youkun Shi, Yuan Zhang, Tianhao Bai, Lei Zhang, Xin Tan, and Min Yang. 2024. RecurScan: Detecting Recurring Vulnerabilities in PHP Web Applications. In Proceedings of the Web Conference (WWW). Singapore

[59]

Nima Shiri Harzevili, Alvine Boaye Belle, Junjie Wang, Song Wang, Zhen Ming Jiang, and Nachiappan Nagappan. 2024. A systematic literature review on automated software vulnerability detection using machine learning. Comput. Surveys 57, 3 (2024), 1–36

[60]

Wei Su, Yifei Liu, Gomathi Ganesan, Gerard Holzmann, Scott Smolka, Erez Zadok, and Geoff Kuenning. 2021. Model-checking support for file system development. In Proceedings of the 13th ACM Workshop on Hot Topics in Storage and File Systems. 103–110

[61]

Yulei Sui and Jingling Xue. 2016. SVF: interprocedural static value-flow analysis in LLVM. In Proceedings of the 25th international conference on compiler construction

[62]

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine El Hage, Baptiste Roziere, Jie Ren, Laurent Sifre, Jean-Rémi King, Thomas Scialom, Gabriel Synnaeve, Nicolas Usunier, Hervé Jégou, and Edouard Grave

[63]

Code Llama: Open Foundation Models for Code. https://github.com/meta-llama/codellama

[64]

Erik Trickel, Fabio Pagani, Chang Zhu, Lukas Dresel, Giovanni Vigna, Christopher Kruegel, Ruoyu Wang, Tiffany Bao, Yan Shoshitaishvili, and Adam Doupé. 2023. Toss a fault to your witcher: Applying grey-box coverage-guided mutational fuzzing to detect sql and command injection vulnerabilities. In Proceedings of the 44th IEEE Symposium on Security and Priva...

[65]

Saad Ullah, Mingji Han, Saurabh Pujar, Hammond Pearce, Ayse K. Coskun, and Gianluca Stringhini. 2024. LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks. In Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P). San Francisco, CA, USA

[66]

Bailin Wang, Zi Wang, Xuezhi Wang, Yuan Cao, Rif A. Saurous, and Yoon Kim

[67]

Grammar Prompting for Domain-Specific Language Generation with Large Language Models. In Proceedings of the 37th Annual Conference on Neural Information Processing Systems (NeurIPS). New Orleans, LA, USA

[68]

Chengpeng Wang, Yifei Gao, Wuqi Zhang, Xuwei Liu, Qingkai Shi, and Xiangyu Zhang. 2024. LLMSA: A Compositional Neuro-Symbolic Approach to Compilation-free and Customizable Static Analysis. In Proceedings of the 2024 Empirical Methods in Natural Language Processing (EMNLP). Miami, FL, USA

[69]

Rongcun Wang, Senlei Xu, Yuan Tian, Xingyu Ji, Xiaobing Sun, and Shujuang Jiang. 2024. SCL-CVD: Supervised contrastive learning for code vulnerability detection via GraphCodeBERT. Computers & Security (2024). doi:10.1016/j.cose.2024.103994

[70]

Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, and Steven C. H. Hoi. 2023. CodeT5+: Open Code Large Language Models for Code Understanding and Generation. In Conference on Empirical Methods in Natural Language Processing. https://api.semanticscholar.org/CorpusID:258685677

[71]

Yue Wang, Weishi Wang, Shafiq R. Joty, and Steven C. H. Hoi. 2021. CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Online and Punta Cana, Dominican Republic

[72]

Fabian Yamaguchi, Nico Golde, Daniel Arp, and Konrad Rieck. 2014. Modeling and discovering vulnerabilities with code property graphs. In Proceedings of the 35th IEEE Symposium on Security and Privacy (S&P). San Jose, CA, USA

[73]

Hua Yan, Yulei Sui, Shiping Chen, and Jingling Xue. 2018. Spatio-temporal context reduction: A pointer-analysis-based static approach for detecting use-after-free vulnerabilities. In Proceedings of the 40th International Conference on Software Engineering. 327–337

[74]

Junfeng Yang, Paul Twohey, Dawson Engler, and Madanlal Musuvathi. 2006. Using model checking to find serious file system errors. ACM Transactions on Computer Systems (TOCS) 24, 4 (2006), 393–423

[75]

Xin Zhou, Ting Zhang, and David Lo. 2024. Large language model for vulnerability detection: Emerging results and future directions. In Proceedings of the 46th International Conference on Software Engineering (ICSE). Lisbon, Portugal

[76]

Jin Zhu, Hui Ge, Yun Zhou, Xiao Jin, Rui Luo, and Yanchen Sun. 2024. Detecting Source Code Vulnerabilities Using Fine-Tuned Pre-Trained LLMs. In 2024 IEEE 17th International Conference on Signal Processing (ICSP). 238–242

