arXiv preprint arXiv:2202.01344

Formal mathematics statement curriculum learning · 2022 · arXiv 2202.01344

13 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

Diffusion-Proof: Recipe for Formal Theorem Proving Beyond Auto-Regressive Generation

cs.LG · 2026-06-17 · unverdicted · novelty 8.0

Diffusion-Proof trains diffusion LLMs (dLLM-Prover-7B for whole proofs and dLLM-Corrector-7B for local correction) that outperform AR baselines by 1.61% on ProofNet-Test and 6.14% on MiniF2F-Test while solving one IMO problem unsolved by DeepSeek-Prover-V2-7B.

LAMP: Lean-based Agentic framework with MCP and Proof Repair

cs.LO · 2026-06-27 · conditional · novelty 7.0

LAMP achieves 96.7% success generating verified Lean proofs for 90 Combinatorics on Words theorems by coordinating Planner, Builder, and Verifier agents with a CoW ontology accessed through Model Context Protocol.

Geometric Measurements of the Axiom of Choice in Neural Proof Embeddings

cs.LG · 2026-06-26 · unverdicted · novelty 7.0

Proofs depending on the axiom of choice show a geometric signature in neural embeddings of tactic sequences that weakens with dependency-graph distance and correlates with prover failure rates.

Formalizing Mathematics at Scale

cs.AI · 2026-05-28 · accept · novelty 7.0

A multi-agent framework called AutoformBot autoformalized 26 textbooks spanning analysis, algebra, topology, combinatorics and probability into a verified Lean 4 library of 45k declarations, demonstrating scalable formalization of graduate math.

Automating Formal Verification with Agent-Guided Tree Search

cs.LO · 2026-05-26 · unverdicted · novelty 6.0

Agent-directed tree search improves LLM performance on Lean formal verification tasks, with context-based orchestration solving more intermediate specs at lower token cost than baseline agents.

OProver: A Unified Framework for Agentic Formal Theorem Proving

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

OProver-32B achieves top Pass@32 scores on MiniF2F, ProverBench, and PutnamBench by combining continued pretraining with iterative agentic proving, retrieval, SFT on repairs, and RL on unresolved cases using a 6.86M-proof dataset.

Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.

Rethinking Supervision Granularity: Segment-Level Learning for LLM-Based Theorem Proving

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

Segment-level supervision extracts coherent proof segments to train policy models that achieve 61-66% success on miniF2F, outperforming step-level and whole-proof methods while also improving existing provers.

ProofSketcher: Hybrid LLM + Lightweight Proof Checker for Reliable Math/Logic Reasoning

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

A hybrid pipeline lets an LLM write high-level proof sketches in a compact DSL that a lightweight kernel then expands into explicit, checkable obligations for reliable math and logic reasoning.

Measuring Representation Robustness in Large Language Models for Geometry

cs.CL · 2026-04-03 · unverdicted · novelty 6.0

LLMs display accuracy gaps of up to 14 percentage points on the same geometry problems solely due to representation choice, with vector forms consistently weakest and a convert-then-solve prompt helping only high-capacity models.

Llemma: An Open Language Model For Mathematics

cs.CL · 2023-10-16 · unverdicted · novelty 6.0

Continued pretraining of Code Llama on Proof-Pile-2 yields Llemma, an open math-specialized LLM that beats known open base models on MATH and supports tool use plus formal proving out of the box.

PaLM-E: An Embodied Multimodal Language Model

cs.LG · 2023-03-06 · conditional · novelty 6.0

PaLM-E is a single 562B-parameter multimodal model that performs embodied reasoning tasks like robotic manipulation planning and visual question answering by interleaving vision, state, and text inputs with positive transfer from joint training on language and robotics data.

Automating Formal Verification with Reinforcement Learning and Recursive Inference

cs.LG · 2026-05-29 · unverdicted · novelty 5.0

RLVR training raises verified Dafny pass rates from 9.7% to 31.1% on a filtered benchmark while a Lean proof scaffold lifts success from 46.2% to 69.2% on a pilot set and solves 7 of 42 prior unsolved tasks.

citing papers explorer

Showing 10 of 10 citing papers after filters.

Diffusion-Proof: Recipe for Formal Theorem Proving Beyond Auto-Regressive Generation

cs.LG · 2026-06-17 · unverdicted · none · ref 3

Geometric Measurements of the Axiom of Choice in Neural Proof Embeddings

cs.LG · 2026-06-26 · unverdicted · none · ref 8

Automating Formal Verification with Agent-Guided Tree Search

cs.LO · 2026-05-26 · unverdicted · none · ref 20

OProver: A Unified Framework for Agentic Formal Theorem Proving

cs.CL · 2026-05-17 · unverdicted · none · ref 129

Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces

cs.LG · 2026-05-12 · unverdicted · none · ref 17

Rethinking Supervision Granularity: Segment-Level Learning for LLM-Based Theorem Proving

cs.AI · 2026-05-12 · unverdicted · none · ref 16

ProofSketcher: Hybrid LLM + Lightweight Proof Checker for Reliable Math/Logic Reasoning

cs.AI · 2026-04-07 · unverdicted · none · ref 28

Measuring Representation Robustness in Large Language Models for Geometry

cs.CL · 2026-04-03 · unverdicted · none · ref 24

Llemma: An Open Language Model For Mathematics

cs.CL · 2023-10-16 · unverdicted · none · ref 172

Automating Formal Verification with Reinforcement Learning and Recursive Inference

cs.LG · 2026-05-29 · unverdicted · none · ref 49
