Draft, sketch, and prove: Guiding formal theorem provers with informal proofs.arXiv preprint arXiv:2210.12283, 2022

Albert Q Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample · 2022 · arXiv 2210.12283

20 Pith papers cite this work. Polarity classification is still indexing.

20 Pith papers citing it

read on arXiv browse 20 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

A Machine-Verified Proof of a Quantum-Optimization Conjecture

quant-ph · 2026-06-29 · accept · novelty 8.0

A Lean 4 machine-verified proof establishes that depth-p QAOA on the ring of disagrees attains approximation ratio (2p+1)/(2p+2) exactly.

LAMP: Lean-based Agentic framework with MCP and Proof Repair

cs.LO · 2026-06-27 · conditional · novelty 7.0

LAMP achieves 96.7% success generating verified Lean proofs for 90 Combinatorics on Words theorems by coordinating Planner, Builder, and Verifier agents with a CoW ontology accessed through Model Context Protocol.

Inductive Deductive Synthesis: Enabling AI to Generate Formally Verified Systems

cs.AI · 2026-05-22 · unverdicted · novelty 7.0

IDS is an agentic LLM system that incrementally synthesizes both implementation and proof for distributed key-value stores, succeeding on all 7 specs where prior agents succeeded on only 2.

Advancing Mathematics Research with AI-Driven Formal Proof Search

cs.AI · 2026-05-21 · conditional · novelty 7.0 · 2 refs

An LLM-based agent with Lean verification autonomously solved multiple open Erdős problems and OEIS conjectures in the first large-scale test.

Fidelity Probes for Specification--Code Alignment

cs.LG · 2026-05-17 · unverdicted · novelty 7.0

Fidelity probes from code raise specification fidelity from 0.63 to 0.94 on a 12k-line COBOL benchmark over eight iterations, with convergence predicted by a two-state Markov fixed point from four iterations of rate data.

Neurosymbolic Auditing of Natural-Language Software Requirements

cs.SE · 2026-05-13 · unverdicted · novelty 7.0

VERIMED translates natural-language requirements to formal logic via LLMs, detects ambiguity from stochastic formalization differences, and audits for inconsistency and safety violations using SMT queries.

Automatic Textbook Formalization

cs.AI · 2026-04-03 · accept · novelty 7.0

Multi-agent AI system formalizes entire 500-page graduate algebraic combinatorics textbook into Lean, creating 130K lines of code in one week at human-expert cost.

Provably Secure Agent Guardrail

cs.AI · 2026-05-28 · unverdicted · novelty 6.0

Introduces ePCA framework using neural-symbolic isolation to force agents to formalize intentions as logical constraints, claiming zero attack success and false positive rates in tested scenarios.

ImProver 2: Iteratively Self-Improving LMs for Neurosymbolic Proof Optimization

cs.AI · 2026-05-21 · unverdicted · novelty 6.0

ImProver 2 combines a data-efficient expert-iteration pipeline with a neurosymbolic scaffold to train a 7B model that outperforms larger models in Lean 4 proof optimization across structural metrics.

OProver: A Unified Framework for Agentic Formal Theorem Proving

cs.CL · 2026-05-17 · unverdicted · novelty 6.0

OProver-32B achieves top Pass@32 scores on MiniF2F, ProverBench, and PutnamBench by combining continued pretraining with iterative agentic proving, retrieval, SFT on repairs, and RL on unresolved cases using a 6.86M-proof dataset.

Viverra: Text-to-Code with Guarantees

cs.SE · 2026-05-14 · unverdicted · novelty 6.0

Viverra generates C code from text descriptions together with assertions that are verified by model checkers, and a user study with over 400 participants shows the verified assertions improve code comprehension.

Rethinking Supervision Granularity: Segment-Level Learning for LLM-Based Theorem Proving

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

Segment-level supervision extracts coherent proof segments to train policy models that achieve 61-66% success on miniF2F, outperforming step-level and whole-proof methods while also improving existing provers.

A Minimal Agent for Automated Theorem Proving

cs.AI · 2026-02-27 · unverdicted · novelty 6.0

A minimal agentic system achieves competitive performance in automated theorem proving with a simpler design and lower cost than state-of-the-art methods.

VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning

cs.CL · 2026-01-27 · unverdicted · novelty 6.0

VERGE decomposes LLM outputs into atomic claims, autoformalizes them to first-order logic, verifies with SMT solvers and consensus, and refines via minimal correction subsets, yielding 18.7% average uplift on reasoning benchmarks.

Aristotle: IMO-level Automated Theorem Proving

cs.AI · 2025-10-01 · unverdicted · novelty 6.0

Aristotle reaches gold-medal-equivalent performance on 2025 IMO problems via integrated Lean proof search, informal lemma formalization, and a dedicated geometry solver.

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

cs.CL · 2024-02-05 · unverdicted · novelty 6.0

DeepSeekMath 7B reaches 51.7% on MATH via continued pretraining on curated web math data and Group Relative Policy Optimization.

ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models

cs.CL · 2023-05-23 · conditional · novelty 6.0

ReWOO decouples reasoning from tool observations in augmented language models, delivering 5x token efficiency and 4% higher accuracy on multi-step reasoning benchmarks like HotpotQA.

Automating Formal Verification with Reinforcement Learning and Recursive Inference

cs.LG · 2026-05-29 · unverdicted · novelty 5.0

RLVR training raises verified Dafny pass rates from 9.7% to 31.1% on a filtered benchmark while a Lean proof scaffold lifts success from 46.2% to 69.2% on a pilot set and solves 7 of 42 prior unsolved tasks.

On Reasoning-Centric LLM-based Automated Theorem Proving

cs.SE · 2026-04-21 · unverdicted · novelty 5.0

ReCent-Prover achieves a 22.58% relative improvement over prior state-of-the-art in proved theorems on the CoqStoq benchmark by using reasoning-centric techniques under a fixed LLM invocation budget.

Rethinking Wireless Communications through Formal Mathematical AI Reasoning

eess.SP · 2026-04-28 · unverdicted · novelty 4.0

Proposes a three-layer framework using formal AI reasoning for verification, derivation, and discovery in wireless communications theory.

citing papers explorer

Showing 1 of 1 citing paper after filters.

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models cs.CL · 2024-02-05 · unverdicted · none · ref 20
DeepSeekMath 7B reaches 51.7% on MATH via continued pretraining on curated web math data and Group Relative Policy Optimization.

Draft, sketch, and prove: Guiding formal theorem provers with informal proofs.arXiv preprint arXiv:2210.12283, 2022

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer