Lemur: Integrating Large Language Models in Automated Program Verification
read the original abstract
The demonstrated code-understanding capability of LLMs raises the question of whether they can be used for automated program verification, a task that demands high-level abstract reasoning about program properties that is challenging for verification tools. We propose a general methodology to combine the power of LLMs and automated reasoners for automated program verification. We formally describe this methodology as a set of transition rules and prove its soundness. We instantiate the calculus as a sound automated verification procedure and demonstrate practical improvements on a set of synthetic and competition benchmarks.
This paper has not been read by Pith yet.
Forward citations
Cited by 4 Pith papers
-
Satisfiability Solving with LLMs: A Matched-Pair Evaluation of Reasoning Capability
A matched-pair protocol and Accurate Differentiation Rate metric reveal that conventional LLM accuracy on SAT problems is often inflated by over-predicting satisfiability, while cross-representation agreement exceeds ...
-
Event-B Agent: Towards LLM Agent for Formal Model Synthesis and Repair
Event-B Agent is an LLM agent that synthesizes, refines, and repairs Event-B formal models from natural language requirements via iterative verification feedback loops.
-
Guiding Human Validation of LLM-Generated Code via Verifiable Literate Programming
VLP adds an NL documentation layer with trace-linked mismatch detection and derived formal checks to make human validation of LLM code feasible, lifting pass@1 from 28.7-73.2% to 65.4-93.5%.
-
Evaluating LLM-Generated ACSL Annotations for Formal Verification
Rule-based annotation generation for ACSL outperforms LLM-based methods in achieving successful formal verification of C programs.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.