SGAgent: Suggestion-Guided LLM-Based Multi-Agent Framework for Repository-Level Software Repair

Chengyu Gao; Chunrong Fang; Liang Xiao; Quanjun Zhang; Ye Shang; Yu Han; Zhenyu Chen

arxiv: 2602.23647 · v2 · pith:7SDCABSFnew · submitted 2026-02-27 · 💻 cs.SE

SGAgent: Suggestion-Guided LLM-Based Multi-Agent Framework for Repository-Level Software Repair

Quanjun Zhang , Chengyu Gao , Yu Han , Ye Shang , Chunrong Fang , Zhenyu Chen , Liang Xiao This is my paper

classification 💻 cs.SE

keywords repairsgagentrepository-levelsoftwareaccuracyachieveachievesacross

0 comments

read the original abstract

Large Language Models (LLMs) have enabled intelligent agents that autonomously interact with environments and invoke external tools. Recently, agent-based software repair has drawn wide attention, as repair agents can localize bugs, generate patches, and achieve state-of-the-art performance on repository-level benchmarks (e.g., SWE-Bench). However, existing approaches usually adopt a localize-then-fix paradigm, jumping directly from "where the bug is" to "how to fix it", leaving a fundamental reasoning gap. To this end, we propose SGAgent, a Suggestion-Guided multi-Agent framework for repository-level software repair, which follows a localize-suggest-fix paradigm. SGAgent introduces a suggestion phase to strengthen the transition from localization to repair: the suggester starts from the buggy locations, incrementally retrieves relevant context until it fully understands the bug, and provides actionable repair suggestions. We further construct a Knowledge Graph (KG) from the target repository and develop a KG-based toolkit to strengthen SGAgent's global contextual awareness and repository-level reasoning. Three specialized sub-agents (i.e., localizer, suggester, and fixer) collaborate to achieve automated end-to-end software repair. We evaluate SGAgent on SWE-Bench-Lite. SGAgent with Claude-3.5 achieves 51.3% repair accuracy, 81.2% file-level, and 52.4% function-level localization accuracy at an average cost of $1.48 per instance, outperforming all baselines using the same base model. SGAgent also generalizes well across base LLMs, reaching a 60.7% resolution rate with Claude-4. When extended to vulnerability repair, it achieves 48.0% on VUL4J and VJBench, demonstrating strong generalization across tasks and programming languages.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Code as Agent Harness
cs.CL 2026-05 accept novelty 5.0

A survey that organizes existing work on LLM-based agents around code as the central harness, structured in three layers of interfaces, mechanisms, and multi-agent scaling, with applications across domains and listed ...