Merlin generates CodeQL queries from natural language questions via RAG-based iteration and a self-test technique using assistive queries, achieving 3.8x higher task accuracy and 31% less completion time in user studies while finding additional software issues.
Valgrind: a framework for heavyweight dynamic binary instrumentation
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.SE 3roles
background 1polarities
background 1representative citing papers
AnyPoC introduces a multi-agent system for generating and validating PoC tests from LLM bug reports, producing 1.3x more valid PoCs, rejecting 9.8x more false positives, and discovering 122 new bugs across 12 major projects.
Zorya introduces a concolic execution approach using Ghidra P-Code to detect vulnerabilities in Go programs and extend to other languages like C.
citing papers explorer
-
Generating Complex Code Analyzers from Natural Language Questions
Merlin generates CodeQL queries from natural language questions via RAG-based iteration and a self-test technique using assistive queries, achieving 3.8x higher task accuracy and 31% less completion time in user studies while finding additional software issues.
-
AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection
AnyPoC introduces a multi-agent system for generating and validating PoC tests from LLM bug reports, producing 1.3x more valid PoCs, rejecting 9.8x more false positives, and discovering 122 new bugs across 12 major projects.
-
Exposing Go's Hidden Bugs: A Novel Concolic Framework
Zorya introduces a concolic execution approach using Ghidra P-Code to detect vulnerabilities in Go programs and extend to other languages like C.