SciCoQA benchmark reveals that even the strongest LLMs detect fewer than half of real discrepancies between scientific papers and their code.
In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers:
SciCoQA: Quality Assurance for Scientific Paper–Code Alignment