LLMs fail most often during strategy formulation and logic synthesis when fixing GitHub issues, but succeed relatively well at localizing faults, according to a taxonomy derived from 243 manual failure cases.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SE (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Characterizing the Failure Modes of LLMs in Resolving Real-World GitHub Issues