Process-driven autoformalization in Lean 4. arXiv preprint arXiv:2406.01940, 2024

CoRR, abs/2406 · 2024 · arXiv 2406.01940

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline: 1 · dataset: 1

citation-polarity summary

background: 1 · baseline: 1

representative citing papers

CAM-Bench: A Benchmark for Computational and Applied Mathematics in Lean

cs.AI · 2026-05-17 · accept · novelty 7.0

CAM-Bench is a new Lean 4 theorem-proving benchmark of 1,000 problems in computational and applied mathematics, built from textbook exercises using a dependency-recovery pipeline to reconstruct local context.

Characterizing Paraphrase-Induced Failures in Lean 4 Autoformalization

cs.LG · 2026-04-25 · unverdicted · novelty 7.0 · 2 refs

Paraphrase sensitivity in Lean 4 autoformalization is dominated by code-generation failures that differ between undergraduate and Olympiad datasets across multiple models.

Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy

cs.CL · 2026-04-03 · unverdicted · novelty 7.0

LLMs display clear performance stratification on formal language tasks aligned with Chomsky hierarchy complexity levels, limited by severe efficiency barriers rather than absolute capability.

A Minimal Agent for Automated Theorem Proving

cs.AI · 2026-02-27 · unverdicted · novelty 6.0

A minimal agentic system achieves competitive performance in automated theorem proving with a simpler design and lower cost than state-of-the-art methods.

AI for Mathematics: Progress, Challenges, and Prospects

math.HO · 2026-01-19 · unverdicted · novelty 4.0

AI for math combines task-specific architectures and general foundation models to support research and advance AI reasoning capabilities.

citing papers explorer

CAM-Bench: A Benchmark for Computational and Applied Mathematics in Lean
cs.AI · 2026-05-17 · accept · none · ref 23

Characterizing Paraphrase-Induced Failures in Lean 4 Autoformalization
cs.LG · 2026-04-25 · unverdicted · none · ref 6 · 2 links

Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy
cs.CL · 2026-04-03 · unverdicted · none · ref 42

A Minimal Agent for Automated Theorem Proving
cs.AI · 2026-02-27 · unverdicted · none · ref 12

AI for Mathematics: Progress, Challenges, and Prospects
math.HO · 2026-01-19 · unverdicted · none · ref 104
