SWE-agent: Agent-computer interfaces enable automated software engineering

John Yang, Carlos E Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik R Narasimhan, Ofir Press · 2024

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

DeepWeb-Bench: A Deep Research Benchmark Demanding Massive Cross-Source Evidence and Long-Horizon Derivation

cs.AI · 2026-05-20 · unverdicted · novelty 6.0

DeepWeb-Bench is a benchmark requiring massive cross-source evidence collection and long-horizon derivation, with evaluations on nine frontier models showing derivation and calibration as primary failure modes.

Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning

cs.AI · 2026-05-14 · unverdicted · novelty 5.0

LaMR decomposes code context pruning into two rubrics using dedicated CRFs, a mixture-of-experts gate, and AST-derived labels to filter noise and often match or beat full-context baselines on coding benchmarks.

citing papers explorer

DeepWeb-Bench: A Deep Research Benchmark Demanding Massive Cross-Source Evidence and Long-Horizon Derivation
cs.AI · 2026-05-20 · unverdicted · none · ref 49
Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning
cs.AI · 2026-05-14 · unverdicted · none · ref 1
