Pavol Bielik, Veselin Raychev, and Martin T

9 Under review · 2016

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

ContractEval: A Benchmark for Evaluating Contract-Satisfying Assertions in Code Generation

cs.AI · 2025-10-14 · unverdicted · novelty 7.0

ContractEval benchmark on 364 tasks shows code LLMs achieve 75-82% functional pass@1 but 0% contract satisfaction under standard prompting, rising only to 23-41% with explicit contracts.

citing papers explorer

Showing 1 of 1 citing paper.

ContractEval: A Benchmark for Evaluating Contract-Satisfying Assertions in Code Generation

cs.AI · 2025-10-14 · unverdicted · none · ref 2
