arxiv: 2605.00674 · 2 revisions
Beyond Benchmarks: MathArena as an Evaluation Platform for Mathematics with LLMs