Title resolution pending

Toufique Ahmed, Premkumar Devanbu, Christoph Treude, Michael Pradel

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Edit, But Verify: An Empirical Audit of Instructed Code-Editing Benchmarks

cs.SE · 2026-04-06 · conditional · novelty 8.0

The two main benchmarks for LLM instructed code editing over-represent Python, miss common real-world domains and edit types, and have test coverage issues that limit what they measure.

Reproduction Test Generation for Java SWE Issues

cs.SE · 2026-05-05 · unverdicted · novelty 6.0 · 2 refs

Introduces the first benchmark for Java reproduction test generation from repository issues and adapts a prior Python tool to produce high performance on it.

citing papers explorer

Showing 2 of 2 citing papers.

Edit, But Verify: An Empirical Audit of Instructed Code-Editing Benchmarks

cs.SE · 2026-04-06 · conditional · none · ref 1

The two main benchmarks for LLM instructed code editing over-represent Python, miss common real-world domains and edit types, and have test coverage issues that limit what they measure.

Reproduction Test Generation for Java SWE Issues

cs.SE · 2026-05-05 · unverdicted · none · ref 1 · 2 links

Introduces the first benchmark for Java reproduction test generation from repository issues and adapts a prior Python tool to produce high performance on it.
