Measuring mathematical problem solving with the MATH dataset

2 Pith papers cite this work. Polarity classification is still indexing.

Fields: cs.AI (1), cs.CL (1)

Years: 2025 (2)

Verdicts: unverdicted (2)

representative citing papers

MathArena: Evaluating LLMs on Uncontaminated Math Competitions

cs.AI · 2025-05-29 · unverdicted · novelty 7.0

MathArena evaluates over 50 LLMs on 162 fresh competition problems across seven contests, detects contamination in AIME 2024, and reports top models scoring below 40 percent on IMO 2025 proof tasks.
