Neuro-symbolic Static Analysis with LLM-generated Vulnerability Patterns

· 2025 · cs.CR · arXiv 2504.16057

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

open full Pith review browse 5 citing papers arXiv PDF

abstract

In this work, we present MoCQ, a neuro-symbolic static analysis framework that leverages large language models (LLMs) to automatically generate vulnerability detection patterns. This approach combines the precision and scalability of pattern-based static analysis with the semantic understanding and automation capabilities of LLMs. MoCQ extracts the domain-specific languages for expressing vulnerability patterns and employs an iterative refinement loop with trace-driven symbolic validation that provides precise feedback for pattern correction. We evaluated MoCQ on 12 vulnerability types across four languages (C/C++, Java, PHP, JavaScript). MoCQ achieves detection performance comparable to expert-developed patterns while requiring only hours of generation versus weeks of manual effort. Notably, MoCQ uncovered 46 new vulnerability patterns that security experts had missed and discovered 25 previously unknown vulnerabilities in real-world applications. MoCQ also outperforms prior approaches with stronger analysis capabilities and broader applicability.

citation-role summary

background 2

citation-polarity summary

background 1 support 1

representative citing papers

Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills

cs.CR · 2026-05-13 · conditional · novelty 7.0

SKILLSCOPE detects undisclosed security behaviors in LLM skill implementations via security property graphs and taxonomy-based consistency checking, identifying confirmed inconsistencies in 9.4% of 4,556 evaluated skills with 84.8% precision and 96.5% recall against human review.

Generating Complex Code Analyzers from Natural Language Questions

cs.SE · 2026-05-10 · unverdicted · novelty 7.0

Merlin generates CodeQL queries from natural language questions via RAG-based iteration and a self-test technique using assistive queries, achieving 3.8x higher task accuracy and 31% less completion time in user studies while finding additional software issues.

Less Is More: Measuring How LLM Involvement affects Chatbot Accuracy in Static Analysis

cs.SE · 2026-04-23 · unverdicted · novelty 6.0

A structured JSON intermediate representation for LLM-generated static analysis queries outperforms both direct generation and agentic tool use, with gains of 15-25 percentage points on large models.

BugScope: Learn to Find Bugs Like Human

cs.SE · 2025-07-21 · conditional · novelty 6.0

BugScope structures LLM bug detection into three human-mirroring steps and distills guidelines from examples, reaching 0.87 F1 on 33 real bugs while outperforming Claude and Cursor tools and uncovering 184 new issues in production code.

Neuro-Symbolic AI for Cybersecurity: State of the Art, Challenges, and Opportunities

cs.CR · 2025-09-08 · unverdicted · novelty 5.0

A systematic review of neuro-symbolic AI in cybersecurity finds that deeper integration and causal reasoning improve performance across intrusion detection and vulnerability tasks, while identifying barriers and a research roadmap.

citing papers explorer

Showing 5 of 5 citing papers.

Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills cs.CR · 2026-05-13 · conditional · none · ref 28 · internal anchor
SKILLSCOPE detects undisclosed security behaviors in LLM skill implementations via security property graphs and taxonomy-based consistency checking, identifying confirmed inconsistencies in 9.4% of 4,556 evaluated skills with 84.8% precision and 96.5% recall against human review.
Generating Complex Code Analyzers from Natural Language Questions cs.SE · 2026-05-10 · unverdicted · none · ref 25 · internal anchor
Merlin generates CodeQL queries from natural language questions via RAG-based iteration and a self-test technique using assistive queries, achieving 3.8x higher task accuracy and 31% less completion time in user studies while finding additional software issues.
Less Is More: Measuring How LLM Involvement affects Chatbot Accuracy in Static Analysis cs.SE · 2026-04-23 · unverdicted · none · ref 12 · internal anchor
A structured JSON intermediate representation for LLM-generated static analysis queries outperforms both direct generation and agentic tool use, with gains of 15-25 percentage points on large models.
BugScope: Learn to Find Bugs Like Human cs.SE · 2025-07-21 · conditional · none · ref 22 · internal anchor
BugScope structures LLM bug detection into three human-mirroring steps and distills guidelines from examples, reaching 0.87 F1 on 33 real bugs while outperforming Claude and Cursor tools and uncovering 184 new issues in production code.
Neuro-Symbolic AI for Cybersecurity: State of the Art, Challenges, and Opportunities cs.CR · 2025-09-08 · unverdicted · none · ref 146 · internal anchor
A systematic review of neuro-symbolic AI in cybersecurity finds that deeper integration and causal reasoning improve performance across intrusion detection and vulnerability tasks, while identifying barriers and a research roadmap.

Neuro-symbolic Static Analysis with LLM-generated Vulnerability Patterns

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer