Reproducibility in NLP: What have we learned from the checklist? In Findings of the Association for Computational Linguistics: ACL 2023, pages 12789–12811, 2023.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- ReproRepo: Scaling Reproducibility Audits with GitHub Repository Issues
ReproRepo uses GitHub issues as natural supervision to benchmark LLM agents on detecting reproducibility blockers across 1,149 ML papers, with the top agent finding related issues for roughly 90% of cases.