SWE-Shepherd trains a lightweight process reward model (PRM) on SWE-Bench trajectories to score intermediate actions and guide code agents, showing gains in efficiency and action quality on SWE-Bench Verified.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SE (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
SWE-Shepherd: Advancing PRMs for Reinforcing Code Agents