CAM-Bench is a new Lean 4 theorem-proving benchmark of 1,000 problems in computational and applied mathematics, built from textbook exercises using a dependency-recovery pipeline to reconstruct local context.
Process-driven autoformalization in lean 4.arXiv preprint arXiv:2406.01940, 2024
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 5representative citing papers
Paraphrase sensitivity in Lean 4 autoformalization is dominated by code-generation failures that differ between undergraduate and Olympiad datasets across multiple models.
LLMs display clear performance stratification on formal language tasks aligned with Chomsky hierarchy complexity levels, limited by severe efficiency barriers rather than absolute capability.
A minimal agentic system achieves competitive performance in automated theorem proving with a simpler design and lower cost than state-of-the-art methods.
AI for math combines task-specific architectures and general foundation models to support research and advance AI reasoning capabilities.
citing papers explorer
-
CAM-Bench: A Benchmark for Computational and Applied Mathematics in Lean
CAM-Bench is a new Lean 4 theorem-proving benchmark of 1,000 problems in computational and applied mathematics, built from textbook exercises using a dependency-recovery pipeline to reconstruct local context.
-
Characterizing Paraphrase-Induced Failures in Lean 4 Autoformalization
Paraphrase sensitivity in Lean 4 autoformalization is dominated by code-generation failures that differ between undergraduate and Olympiad datasets across multiple models.
-
Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy
LLMs display clear performance stratification on formal language tasks aligned with Chomsky hierarchy complexity levels, limited by severe efficiency barriers rather than absolute capability.
-
A Minimal Agent for Automated Theorem Proving
A minimal agentic system achieves competitive performance in automated theorem proving with a simpler design and lower cost than state-of-the-art methods.
-
AI for Mathematics: Progress, Challenges, and Prospects
AI for math combines task-specific architectures and general foundation models to support research and advance AI reasoning capabilities.