Herald: A natural language annotated lean 4 dataset

· 2025 · arXiv 2410.10878

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

extension 1

citation-polarity summary

extend 1

representative citing papers

MathAtlas: A Benchmark for Autoformalization in the Wild

cs.AI · 2026-05-13 · accept · novelty 8.0

MathAtlas is the first large-scale benchmark for autoformalizing graduate mathematics, where even strong models reach only 9.8% correctness on theorem statements and drop to 2.6% on the hardest dependency-deep subset.

LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving

cs.IR · 2026-05-13 · conditional · novelty 7.0 · 2 refs

LeanSearch v2 recovers 46.1% of ground-truth premise groups for research-level Lean 4 theorems within 10 candidates and raises fixed-loop proof success to 20%.

SFT-GRPO Data Overlap as a Post-Training Hyperparameter for Autoformalization

cs.LG · 2026-04-15 · unverdicted · novelty 6.0

Disjoint SFT and GRPO data for autoformalization yields up to 10.4pp semantic accuracy gains over full overlap, which renders the GRPO stage redundant.

Automated Conjecture Resolution with Formal Verification

cs.LG · 2026-04-04 · unverdicted · novelty 6.0

An AI framework combining informal reasoning and formal verification resolves an open commutative algebra problem and produces a Lean 4-checked proof with minimal human input.

citing papers explorer

Showing 4 of 4 citing papers.

MathAtlas: A Benchmark for Autoformalization in the Wild cs.AI · 2026-05-13 · accept · none · ref 12
MathAtlas is the first large-scale benchmark for autoformalizing graduate mathematics, where even strong models reach only 9.8% correctness on theorem statements and drop to 2.6% on the hardest dependency-deep subset.
LeanSearch v2: Global Premise Retrieval for Lean 4 Theorem Proving cs.IR · 2026-05-13 · conditional · none · ref 7 · 2 links
LeanSearch v2 recovers 46.1% of ground-truth premise groups for research-level Lean 4 theorems within 10 candidates and raises fixed-loop proof success to 20%.
SFT-GRPO Data Overlap as a Post-Training Hyperparameter for Autoformalization cs.LG · 2026-04-15 · unverdicted · none · ref 5
Disjoint SFT and GRPO data for autoformalization yields up to 10.4pp semantic accuracy gains over full overlap, which renders the GRPO stage redundant.
Automated Conjecture Resolution with Formal Verification cs.LG · 2026-04-04 · unverdicted · none · ref 20
An AI framework combining informal reasoning and formal verification resolves an open commutative algebra problem and produces a Lean 4-checked proof with minimal human input.

Herald: A natural language annotated lean 4 dataset

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer