arXiv preprint arXiv:2503.21710 · 2025

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

Is Agentic AI Ready for Real-World Hardware Engineering? A Deep Dive with Phoenix-bench

cs.AR · 2026-05-13 · unverdicted · novelty 7.0

Phoenix-bench shows agentic AI systems lose 37-58% resolved rate when moving from SWE-bench Verified to hardware tasks because bugs spread across parallel modules via signal flow, with testbench feedback lifting performance by 42-45% while file-level oracles add only 1.4%.

ARISE: A Repository-level Graph Representation and Toolset for Agentic Fault Localization and Program Repair

cs.SE · 2026-05-04 · unverdicted · novelty 7.0

ARISE adds a data-flow-augmented repository graph and three-tier tool API to LLM agents, raising Function Recall@1 by 17 points, Line Recall@1 by 15 points, and Pass@1 repair rate to 22% on SWE-bench Lite.

AgenticSZZ: Temporal Knowledge Graph-Guided Agentic Bug-Inducing Commit Identification

cs.SE · 2026-02-03 · conditional · novelty 7.0

AgenticSZZ reframes bug-inducing commit identification as temporal knowledge graph search navigated by an LLM agent, reporting F1 scores of 0.47-0.79 and up to 34% improvement over prior SZZ methods on three datasets.

Beyond Textual Repository Exploration: Dual-Modal Structural Reasoning for Agentic Issue Resolution

cs.SE · 2026-07-02 · unverdicted · novelty 6.0

DUALVIEW is a dual-modal framework using Module Coupling, Function Call, Class Hierarchy, and Program Dependence graphs to enable persistent structural reasoning for agentic issue resolution, reporting gains on SWE-bench Pro and Verified.

A Single Patch Is Not Enough: Deterministic Fusion of Repair Candidates

cs.SE · 2026-07-02 · unverdicted · novelty 6.0

PatchFusion uses deterministic atomic evidence fusion on candidate patches to outperform ranking, test-filtering, and LLM-judge selectors on SWE-bench and Defects4J pools.

RepoRescue: An Empirical Study of LLM Agents on Whole-Repository Compatibility Rescue

cs.SE · 2026-07-01 · unverdicted · novelty 6.0

RepoRescue creates a benchmark of 315 repositories and shows LLM agents rescue up to 41.5% with runtime enforcement and 62.7% when combining systems, with hardest cases requiring cross-file changes.

Beyond Fixed Tests: Repository-Level Issue Resolution as Coevolution of Code and Behavioral Constraints

cs.SE · 2026-04-06 · unverdicted · novelty 6.0

Agent-CoEvo is a multi-agent LLM framework that coevolves code patches and test patches to resolve repository-level issues, outperforming fixed-test baselines on SWE-bench Lite and SWT-bench Lite.

ContextSniper: AntTrail's Token-Efficient Code Memory for Repository-Level Program Repair

cs.AI · 2026-07-02 · unverdicted · novelty 4.0

ContextSniper reduces token use by 38.9-51.5% in repository-level program repair agents on SWE-bench Lite, with resolution-rate drops of 2 percentage points.

Dynamic analysis enhances issue resolution

cs.SE · 2026-03-23

citing papers explorer

Showing 7 of 7 citing papers after filters.

Is Agentic AI Ready for Real-World Hardware Engineering? A Deep Dive with Phoenix-bench · cs.AR · 2026-05-13 · unverdicted · none · ref 41
ARISE: A Repository-level Graph Representation and Toolset for Agentic Fault Localization and Program Repair · cs.SE · 2026-05-04 · unverdicted · none · ref 40
Beyond Textual Repository Exploration: Dual-Modal Structural Reasoning for Agentic Issue Resolution · cs.SE · 2026-07-02 · unverdicted · none · ref 43
A Single Patch Is Not Enough: Deterministic Fusion of Repair Candidates · cs.SE · 2026-07-02 · unverdicted · none · ref 52
RepoRescue: An Empirical Study of LLM Agents on Whole-Repository Compatibility Rescue · cs.SE · 2026-07-01 · unverdicted · none · ref 29
Beyond Fixed Tests: Repository-Level Issue Resolution as Coevolution of Code and Behavioral Constraints · cs.SE · 2026-04-06 · unverdicted · none · ref 51
ContextSniper: AntTrail's Token-Efficient Code Memory for Repository-Level Program Repair · cs.AI · 2026-07-02 · unverdicted · none · ref 12
