Short proofs in combinatorics, probability and number theory II
4 Pith papers cite this work.
abstract
We give a quintet of proofs resulting from questions posed by Erdős. These questions concern ordinary lines in planar point sets, sequences with uniformly small exponential sums, $K_4$-free $4$-critical graphs with few chords in any cycle, a counterexample to a "fewnomial" version of the Erdős–Turán discrepancy bound, and a finiteness theorem for integers $n$ such that $n-a k^2$ is prime for all $k\leq \sqrt{n/a}$ coprime to $n$ (for fixed $a\in\mathbb Z_+$). Each proof is due to an internal model at OpenAI.
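The last condition in the abstract can be checked by brute force for small $n$. The sketch below is illustrative only and is not taken from the paper: the function names `is_prime`, `has_property`, and `search` are hypothetical, and it assumes $k$ ranges over positive integers with $a k^2 \leq n$, which is a literal reading of $k \leq \sqrt{n/a}$.

```python
from math import gcd, isqrt


def is_prime(m: int) -> bool:
    """Trial-division primality test; adequate for small inputs only."""
    if m < 2:
        return False
    for d in range(2, isqrt(m) + 1):
        if m % d == 0:
            return False
    return True


def has_property(n: int, a: int = 1) -> bool:
    """True if n - a*k^2 is prime for every k >= 1 with a*k^2 <= n and gcd(k, n) == 1.

    Assumption: k runs over positive integers (the abstract does not state this).
    """
    k = 1
    while a * k * k <= n:
        if gcd(k, n) == 1 and not is_prime(n - a * k * k):
            return False
        k += 1
    return True


def search(limit: int, a: int = 1) -> list[int]:
    """List all n in [a, limit] with the property.

    The range starts at a because the condition is vacuous for n < a.
    """
    return [n for n in range(a, limit + 1) if has_property(n, a)]


if __name__ == "__main__":
    print(search(30))
```

A finite search like this can only list small examples. It gives no evidence for the finiteness statement itself, which is what the paper proves.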
citation-role summary
citation-polarity summary
years: 2026 (4)
roles: background (1)
polarities: background (1)
citing papers explorer
- Advancing Mathematics Research with AI-Driven Formal Proof Search
  LLM-based agents in Lean solved 9 of 353 open Erdős problems and proved 44 of 492 OEIS conjectures at a few hundred dollars each.
- AI co-mathematician: Accelerating mathematicians with agentic AI
  An interactive AI workbench for mathematicians achieves 48% on FrontierMath Tier 4 and helped solve open problems in early tests.