SWE-bench reveals that even top language models like Claude 2 resolve only 1.96% of 2,294 real-world GitHub issues, highlighting a gap in practical coding capabilities.
Universal fuzzing via large language models
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
NESA presents a neuro-symbolic framework that decomposes static analyses into policy-defined sub-problems solved by parsers and LLMs to enable compilation-free customizable analysis with reduced hallucinations.
LLMs achieve strong results on syntax parsing tasks but show limited and variable performance on dynamic reasoning, with a clear performance hierarchy across model scales.
FuzzAgent deploys specialized agents that collaborate on harness generation, execution, and crash triage to evolve fuzzing campaigns, delivering 45-191% more branch coverage than four baselines on 20 C/C++ libraries and surfacing 102 real bugs.
SDLLMFuzz combines LLM-based generation of syntactically valid inputs with a dynamic-static feedback loop from crash artifacts to improve bug discovery and time-to-bug on structured-input programs compared to traditional and LLM-assisted fuzzers.
citing papers explorer
-
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?
SWE-bench reveals that even top language models like Claude 2 resolve only 1.96% of 2,294 real-world GitHub issues, highlighting a gap in practical coding capabilities.
-
NESA: Relational Neuro-Symbolic Static Program Analysis
NESA presents a neuro-symbolic framework that decomposes static analyses into policy-defined sub-problems solved by parsers and LLMs to enable compilation-free customizable analysis with reduced hallucinations.
-
Exploring Code Analysis: Zero-Shot Insights on Syntax and Semantics with LLMs
LLMs achieve strong results on syntax parsing tasks but show limited and variable performance on dynamic reasoning, with a clear performance hierarchy across model scales.
-
FuzzAgent: Multi-Agent System for Evolutionary Library Fuzzing
FuzzAgent deploys specialized agents that collaborate on harness generation, execution, and crash triage to evolve fuzzing campaigns, delivering 45-191% more branch coverage than four baselines on 20 C/C++ libraries and surfacing 102 real bugs.
-
SDLLMFuzz: Dynamic-static LLM-assisted greybox fuzzing for structured input programs
SDLLMFuzz combines LLM-based generation of syntactically valid inputs with a dynamic-static feedback loop from crash artifacts to improve bug discovery and time-to-bug on structured-input programs compared to traditional and LLM-assisted fuzzers.