SWE-Replay: Efficient Test-Time Scaling for Software Engineering Agents
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
citing papers explorer
- SWE-Doctor: Guiding Software Engineering Agents with Runtime Diagnosis from Multi-Faceted Bug Reproduction Tests
  SWE-Doctor generates multi-faceted bug reproduction tests (BRTs), derives runtime diagnoses from their executions, and uses those diagnoses to guide patch generation, raising average resolution rates to 75.7% on SWE-bench Verified and 59.4% on SWE-bench Pro.
- EvoRepair: Enhancing Vulnerability Repair Agents Through Experience-Based Self-Evolution
  EvoRepair is the first experience-based self-evolving agent framework for automated vulnerability repair, reporting 90.46% overall success on the PATCHEVAL and SEC-bench benchmarks.
- FastContext: Training Efficient Repository Explorer for Coding Agents
  FastContext adds a dedicated exploration subagent with specialized models trained on reference trajectories and task rewards, cutting token consumption by up to 60% and lifting resolution rates by up to 5.5% on SWE-bench variants.
- From Question Answering to Task Completion: A Survey on Agent System and Harness Design
  A survey that frames LLM agents as model-plus-harness systems, decomposes harness responsibilities, maps them to tasks, and highlights open challenges in evaluation, safety, and co-evolution.