SWE-bench-java: A GitHub issue resolving benchmark for Java. CoRR, abs/2408.14354, 2024
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
dataset 1
citation-polarity summary
fields
cs.SE 2
verdicts
UNVERDICTED 2
roles
dataset 1
polarities
use dataset 1
representative citing papers
A literature survey that collects and categorizes 124 papers on LLM-based agents for software engineering from SE and agent perspectives.
citing papers explorer
- Yet Even Less Is Even Better For Agentic, Reasoning, and Coding LLMs
STITCH trains superior agentic coding and reasoning LLMs by using fewer high-quality trajectories filtered to keep only critical decision tokens, delivering up to 63% relative gains on SWE-bench Verified.
- Large Language Model-Based Agents for Software Engineering: A Survey
A literature survey that collects and categorizes 124 papers on LLM-based agents for software engineering from SE and agent perspectives.