Neuro-symbolic Static Analysis with LLM-generated Vulnerability Patterns

5 Pith papers cite this work. Polarity classification is still indexing.

abstract

In this work, we present MoCQ, a neuro-symbolic static analysis framework that leverages large language models (LLMs) to automatically generate vulnerability detection patterns. This approach combines the precision and scalability of pattern-based static analysis with the semantic understanding and automation capabilities of LLMs. MoCQ extracts the domain-specific languages for expressing vulnerability patterns and employs an iterative refinement loop with trace-driven symbolic validation that provides precise feedback for pattern correction. We evaluated MoCQ on 12 vulnerability types across four languages (C/C++, Java, PHP, JavaScript). MoCQ achieves detection performance comparable to expert-developed patterns while requiring only hours of generation versus weeks of manual effort. Notably, MoCQ uncovered 46 new vulnerability patterns that security experts had missed and discovered 25 previously unknown vulnerabilities in real-world applications. MoCQ also outperforms prior approaches with stronger analysis capabilities and broader applicability.

citation summary

fields: cs.SE 3, cs.CR 2

years: 2026 3, 2025 2

roles: background 2

polarities: background 1, support 1

representative citing papers

Generating Complex Code Analyzers from Natural Language Questions

cs.SE · 2026-05-10 · unverdicted · novelty 7.0

Merlin generates CodeQL queries from natural language questions via RAG-based iteration and a self-test technique using assistive queries, achieving 3.8x higher task accuracy and 31% less completion time in user studies while finding additional software issues.

BugScope: Learn to Find Bugs Like Human

cs.SE · 2025-07-21 · conditional · novelty 6.0

BugScope structures LLM bug detection into three human-mirroring steps and distills guidelines from examples, reaching 0.87 F1 on 33 real bugs while outperforming Claude and Cursor tools and uncovering 184 new issues in production code.

citing papers explorer


  • Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills
    cs.CR · 2026-05-13 · conditional · none · ref 28 · internal anchor

    SKILLSCOPE detects undisclosed security behaviors in LLM skill implementations via security property graphs and taxonomy-based consistency checking, identifying confirmed inconsistencies in 9.4% of 4,556 evaluated skills with 84.8% precision and 96.5% recall against human review.

  • Generating Complex Code Analyzers from Natural Language Questions
    cs.SE · 2026-05-10 · unverdicted · none · ref 25 · internal anchor

    Merlin generates CodeQL queries from natural language questions via RAG-based iteration and a self-test technique using assistive queries, achieving 3.8x higher task accuracy and 31% less completion time in user studies while finding additional software issues.

  • Less Is More: Measuring How LLM Involvement affects Chatbot Accuracy in Static Analysis
    cs.SE · 2026-04-23 · unverdicted · none · ref 12 · internal anchor

    A structured JSON intermediate representation for LLM-generated static analysis queries outperforms both direct generation and agentic tool use, with gains of 15-25 percentage points on large models.

  • BugScope: Learn to Find Bugs Like Human
    cs.SE · 2025-07-21 · conditional · none · ref 22 · internal anchor

    BugScope structures LLM bug detection into three human-mirroring steps and distills guidelines from examples, reaching 0.87 F1 on 33 real bugs while outperforming Claude and Cursor tools and uncovering 184 new issues in production code.

  • Neuro-Symbolic AI for Cybersecurity: State of the Art, Challenges, and Opportunities
    cs.CR · 2025-09-08 · unverdicted · none · ref 146 · internal anchor

    A systematic review of neuro-symbolic AI in cybersecurity finds that deeper integration and causal reasoning improve performance across intrusion detection and vulnerability tasks, while identifying barriers and a research roadmap.