Title resolution pending

The prefix MUST be AT LEAST 50-60 lines of code; this is an absolute requirement

1 Pith paper cites this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

DevBench: A Realistic, Developer-Informed Benchmark for Code Generation Models

cs.LG · 2026-01-17 · unverdicted · novelty 6.0

DevBench is a telemetry-driven benchmark with 1,800 instances across six languages and six task categories that evaluates LLMs on realistic code completion and finds the strongest model at only 43.5% Pass@1.

citing papers explorer

Showing 1 of 1 citing paper.

DevBench: A Realistic, Developer-Informed Benchmark for Code Generation Models

cs.LG · 2026-01-17 · unverdicted · none · ref 77
