SGAgent: Suggestion-Guided LLM-Based Multi-Agent Framework for Repository-Level Software Repair
read the original abstract
Large Language Models (LLMs) have enabled intelligent agents that autonomously interact with environments and invoke external tools. Recently, agent-based software repair has drawn wide attention, as repair agents can localize bugs, generate patches, and achieve state-of-the-art performance on repository-level benchmarks (e.g., SWE-Bench). However, existing approaches usually adopt a localize-then-fix paradigm, jumping directly from "where the bug is" to "how to fix it", leaving a fundamental reasoning gap. To this end, we propose SGAgent, a Suggestion-Guided multi-Agent framework for repository-level software repair, which follows a localize-suggest-fix paradigm. SGAgent introduces a suggestion phase to strengthen the transition from localization to repair: the suggester starts from the buggy locations, incrementally retrieves relevant context until it fully understands the bug, and provides actionable repair suggestions. We further construct a Knowledge Graph (KG) from the target repository and develop a KG-based toolkit to strengthen SGAgent's global contextual awareness and repository-level reasoning. Three specialized sub-agents (i.e., localizer, suggester, and fixer) collaborate to achieve automated end-to-end software repair. We evaluate SGAgent on SWE-Bench-Lite. SGAgent with Claude-3.5 achieves 51.3% repair accuracy, 81.2% file-level, and 52.4% function-level localization accuracy at an average cost of $1.48 per instance, outperforming all baselines using the same base model. SGAgent also generalizes well across base LLMs, reaching a 60.7% resolution rate with Claude-4. When extended to vulnerability repair, it achieves 48.0% on VUL4J and VJBench, demonstrating strong generalization across tasks and programming languages.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
Code as Agent Harness
A survey that organizes existing work on LLM-based agents around code as the central harness, structured in three layers of interfaces, mechanisms, and multi-agent scaling, with applications across domains and listed ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.