Prometheus reverse-engineers BDD-style executable specifications from bug failures and uses an RQA validation loop to achieve a 93.97% correct-patch rate on 680 Defects4J defects, while rescuing 74.4% of the bugs missed by strong baseline agents.
Abstain and validate: A dual-LLM policy for reducing noise in agentic program repair. CoRR, abs/2510.03217
1 Pith paper cites this work.
Fields: cs.SE (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Project Prometheus: Bridging the Intent Gap in Agentic Program Repair via Reverse-Engineered Executable Specifications