DebugRepair improves LLM-based automated program repair by adding test semantic purification, simulated instrumentation, and debugging-driven conversational repair, fixing 224 Defects4J bugs with GPT-3.5 (26.2% above prior SOTA) and 295 with DeepSeek-V3.
5 Pith papers cite this work.
fields
cs.SE 5
representative citing papers
DynaFix iteratively feeds execution-level dynamic information such as variable states and control flows into LLM prompts to repair 186 bugs on Defects4J, a 10% gain over baselines, including 38 previously unrepaired cases.
GALA uses hierarchical graph alignment between UI screenshots and code structures to achieve state-of-the-art bug localization in multimodal automated program repair on SWE-bench.
Auto-Diagnose applies LLMs to summarize and diagnose root causes of integration test failures, reporting 90.14% accuracy on 71 manual cases and positive adoption after Google-wide rollout.
citing papers explorer
- DebugRepair: Enhancing LLM-Based Automated Program Repair via Self-Directed Debugging
DebugRepair improves LLM-based automated program repair by adding test semantic purification, simulated instrumentation, and debugging-driven conversational repair, fixing 224 Defects4J bugs with GPT-3.5 (26.2% above prior SOTA) and 295 with DeepSeek-V3.
- DynaFix: Iterative Automated Program Repair Driven by Execution-Level Dynamic Information
DynaFix iteratively feeds execution-level dynamic information such as variable states and control flows into LLM prompts to repair 186 bugs on Defects4J, a 10% gain over baselines, including 38 previously unrepaired cases.
- GALA: Multimodal Graph Alignment for Bug Localization in Automated Program Repair
GALA uses hierarchical graph alignment between UI screenshots and code structures to achieve state-of-the-art bug localization in multimodal automated program repair on SWE-bench.
- LLM-Based Automated Diagnosis Of Integration Test Failures At Google
Auto-Diagnose applies LLMs to summarize and diagnose root causes of integration test failures, reporting 90.14% accuracy on 71 manual cases and positive adoption after Google-wide rollout.
- HELO-APR: Enhancing Low-Resource Program Repair through Cross-Lingual Knowledge Transfer