pith. sign in

arxiv: 2505.18492 · v5 · pith:NZ3PTLDJnew · submitted 2025-05-24 · 💻 cs.AI

Formally Solving Answer-Construction Problems in Lean

classification 💻 cs.AI
keywords answerformalllmsproofsproblemsanswersconstructionlean
0
0 comments X
read the original abstract

Mathematical competition problems fall into two broad types: theorem proving, which asks for a proof of a given statement, and answer construction, which requires constructing a property-satifying object with proofs. With recent advances in large language models (LLMs), formal theorem-proving techniques have made substantial progress on theorem-proving problems, yet formal answer construction remains less studied. This exposes a mismatch between current LLM model families: general LLMs are strong at informal conjecturing but are expensive and unreliable at formal proof generation, whereas prover LLMs are cheap and optimized for formal proofs but weak at mathematical reasoning for proposing candidate answers. Moreover, Lean proof checking alone does not enforce that a constructed witness is a canonical answer: circular or non-closed-form witnesses can eliminate the existential quantifier while failing to constitute an admissible contest answer. To close this gap, we introduce \textit{Enumerate-Conjecture-Prove} (ECP), a neuro-symbolic framework in Lean for end-to-end answer construction with formal proofs. ECP leverages tool-assisted general LLMs to enumerate evidence and construct candidate answers, and invokes prover LLMs to produce machine-checked proofs. On PutnamBench's and autoformalized MathArena's answer-construction problems, ECP formally solves 17/346 and 18/75 instances with admissible answers and proofs, respectively, which outperform LLM baselines at aligned inference budgets. Our code is available at https://github.com/sunjia72/ecp-lpar.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.