MathlibPR turns real Mathlib PR histories into a supervised benchmark showing that LLMs and agents fail to distinguish merge-ready contributions from build-passing but non-mergeable ones.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LO (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
MathlibPR: Pull Request Merge-Readiness Benchmark for Formal Mathematical Libraries