From LLM-Generated Conjectures to Lean Formalizations: Automated Polynomial Inequality Proving via Sum-of-Squares Certificates
Pith reviewed 2026-05-19 14:36 UTC · model grok-4.3
The pith
LLM-suggested approximate sum-of-squares decompositions can be refined symbolically into exact Lean-verified proofs for polynomial inequalities with up to 10 variables.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
NSPI supplies an end-to-end pipeline in which an LLM generates a candidate approximate polynomial sum-of-squares decomposition; symbolic refinement converts the candidate into an exact SOS representation that proves the target inequality; the proof is then certified inside Lean, yielding a fully machine-checked formalization. Experiments show the pipeline succeeds on challenging benchmark inequalities involving polynomials with as many as 10 variables.
What carries the argument
The neuro-symbolic refinement loop that converts an LLM-proposed approximate sum-of-squares decomposition into an exact algebraic certificate via symbolic computation.
If this is right
- Purely symbolic solvers can be reserved for the final tightening step rather than the entire search.
- Machine-checked Lean proofs become a routine byproduct of the inequality-proving process.
- The method extends the reach of automated proving beyond the small-variable regime typical of earlier LLM-only attempts.
Where Pith is reading between the lines
- Similar hybrid pipelines might be tested on other algebraic tasks such as proving identities or solving systems of polynomial equations.
- If refinement cost remains modest, the same pattern could reduce the need for hand-crafted heuristics in formal mathematics libraries.
Load-bearing premise
LLM-generated approximate SOS decompositions are close enough to exact identities that symbolic refinement succeeds without prohibitive cost or repeated failure on the target polynomials.
What would settle it
A benchmark set of 8-to-10-variable polynomial inequalities on which the symbolic refiner fails to produce an exact SOS decomposition for a majority of LLM proposals would falsify the claimed scalability.
Figures
read the original abstract
Automated proving of polynomial inequalities is a fundamental challenge in automated mathematical reasoning, where rich algebraic structure and a rapidly growing certificate search space hinder scalability. Purely symbolic approaches provide strong guarantees but often scale poorly as the number of variables or the degree increases, due to expensive algebraic manipulations and rapidly growing intermediate expressions. In parallel, LLM-guided methods have made notable progress, particularly on competition-style inequalities with a small number of variables. To address the remaining scalability challenges, we propose NSPI, a neuro-symbolic framework that combines the complementary strengths of LLMs and symbolic computation for polynomial-inequality proving. Concretely, an LLM proposes a conjecture in the form of an approximate polynomial Sum-Of-Squares (SOS) decomposition; we refine it via symbolic computation to obtain an exact polynomial SOS representation, which directly proves the target inequality, and we further certify the proof in Lean, yielding an end-to-end pipeline from heuristic discovery to machine-checked proof. Experiments on challenging benchmarks involving polynomials with up to 10 variables demonstrate the effectiveness and scalability of the proposed method.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes NSPI, a neuro-symbolic pipeline for automated proving of polynomial inequalities. An LLM generates an approximate Sum-of-Squares (SOS) decomposition as a conjecture; this is refined by symbolic computation into an exact SOS certificate that proves the inequality; the certificate is then formalized and checked in Lean. The authors claim this yields an end-to-end workflow from heuristic discovery to machine-checked proof and that experiments on benchmarks with polynomials of up to 10 variables demonstrate effectiveness and scalability.
Significance. If the empirical claims are substantiated, the work would be significant for automated mathematical reasoning: it offers a concrete way to leverage LLM heuristics while retaining the strong guarantees of SOS certificates and Lean formalization. The integration of approximate LLM output with exact symbolic refinement and external verification is a clear strength, and the focus on higher-variable cases addresses a recognized scalability bottleneck in purely symbolic inequality proving.
major comments (2)
- [Experiments section] Abstract and Experiments section: the claim that experiments 'demonstrate the effectiveness and scalability' of NSPI on polynomials with up to 10 variables is load-bearing for the central contribution, yet the manuscript supplies no quantitative success rates for the symbolic refinement step, no failure-mode analysis, no baseline comparisons, and no runtime scaling data as the number of variables or degree increases. Without these metrics it is impossible to assess whether LLM-generated approximations reliably lie inside the basin of attraction of the refinement procedure.
- [Method section] Method section (pipeline description): the correctness of the end-to-end claim rests on the unexamined assumption that LLM approximate SOS decompositions are sufficiently close to exact rational SOS certificates for symbolic refinement (e.g., Newton iteration or exact SDP over rationals) to succeed at acceptable cost. No analysis or bounds are given on the approximation error or on how this error grows with dimension, leaving the scalability assertion for 10-variable instances unsupported.
minor comments (1)
- [Method and Lean sections] The notation for the approximate versus exact SOS decompositions could be made more consistent across the method description and the Lean formalization section to improve readability.
Simulated Author's Rebuttal
Thank you for the constructive feedback on our manuscript. We address each major comment below, indicating planned revisions where appropriate to strengthen the presentation of our experimental results and methodological assumptions.
read point-by-point responses
-
Referee: [Experiments section] Abstract and Experiments section: the claim that experiments 'demonstrate the effectiveness and scalability' of NSPI on polynomials with up to 10 variables is load-bearing for the central contribution, yet the manuscript supplies no quantitative success rates for the symbolic refinement step, no failure-mode analysis, no baseline comparisons, and no runtime scaling data as the number of variables or degree increases. Without these metrics it is impossible to assess whether LLM-generated approximations reliably lie inside the basin of attraction of the refinement procedure.
Authors: We acknowledge that the Experiments section primarily reports successful end-to-end proofs on the selected benchmarks with up to 10 variables but lacks the detailed quantitative breakdown requested. To address this directly, the revised manuscript will expand the Experiments section with: (i) success rates specifically for the symbolic refinement step (e.g., percentage of LLM conjectures that converge to exact rational SOS certificates), (ii) a failure-mode analysis categorizing cases where refinement fails (such as when the initial approximation error exceeds the basin of attraction), (iii) baseline comparisons against purely symbolic SOS solvers, and (iv) runtime scaling data and plots as the number of variables and degree increase. These additions will provide concrete evidence for the reliability of the LLM-generated approximations and the overall scalability claims. revision: yes
-
Referee: [Method section] Method section (pipeline description): the correctness of the end-to-end claim rests on the unexamined assumption that LLM approximate SOS decompositions are sufficiently close to exact rational SOS certificates for symbolic refinement (e.g., Newton iteration or exact SDP over rationals) to succeed at acceptable cost. No analysis or bounds are given on the approximation error or on how this error grows with dimension, leaving the scalability assertion for 10-variable instances unsupported.
Authors: The Method section describes the neuro-symbolic pipeline in which the LLM provides an approximate SOS conjecture that is subsequently refined symbolically. We agree that no explicit error bounds or growth analysis with dimension are provided. In revision we will add an empirical analysis subsection that reports observed approximation errors from the LLM outputs on the benchmark set and demonstrates the convergence radius of the refinement procedures (Newton iteration and rational SDP) used. This will substantiate that the approximations lie within the basin of attraction for the instances considered. However, deriving general theoretical bounds on error growth with dimension would require new analysis of LLM behavior in algebraic domains and lies beyond the scope of the current empirical study; our 10-variable results remain supported by the observed practical success of the refinement step. revision: partial
- Deriving theoretical bounds on LLM approximation error and its scaling with dimension, which would necessitate substantial new theoretical research outside the present empirical contribution.
Circularity Check
NSPI is a procedural pipeline with external verification; no derivation reduces to inputs by construction
full rationale
The paper presents NSPI as a sequence of distinct stages: LLM heuristic generation of an approximate SOS decomposition, followed by symbolic refinement to an exact SOS certificate, followed by Lean formalization. Each stage is described as a computational procedure whose output is checked by independent tools (symbolic solvers and the Lean theorem prover). No equation or central claim is shown to be equivalent to its own inputs, no parameter is fitted on a subset and then relabeled as a prediction, and no load-bearing premise rests solely on a self-citation whose content is unverified outside the present work. The reported effectiveness on up to 10-variable benchmarks is offered as empirical evidence rather than a mathematical identity derived from the method itself. Consequently the derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math A polynomial that admits an exact sum-of-squares decomposition is non-negative and thereby proves the target inequality.
invented entities (1)
-
NSPI framework
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
an LLM proposes a conjecture in the form of an approximate polynomial Sum-Of-Squares (SOS) decomposition; we refine it via symbolic computation to obtain an exact polynomial SOS representation, which directly proves the target inequality, and we further certify the proof in Lean
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.leanLogicNat.equivNat unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Experiments on challenging benchmarks involving polynomials with up to 10 variables demonstrate the effectiveness and scalability of the proposed method.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Springer, 1993. 9 From LLM Conjectures to Lean Formalizations: Automated Inequality Proving via Sum-of-Squares Certificates Heule, M. J. H., Kullmann, O., and Marek, V . W. Solving and verifying the boolean pythagorean triples problem via cube-and-conquer. In Creignou, N. and Le Berre, D. (eds.),Theory and Applications of Satisfiability Testing – SAT 2016...
-
[2]
URL https://www.sciencedirect.com/ science/article/pii/S0304397508006452
doi: https://doi.org/10.1016/j.tcs.2008.09.025. URL https://www.sciencedirect.com/ science/article/pii/S0304397508006452. Symbolic-Numerical Computations. Peyrl, H. and Parrilo, P. A. Computing sum of squares decompositions with rational coefficients.Theor. Com- put. Sci., 409:269–281, 2008b. URL https://api. semanticscholar.org/CorpusID:2370460. Polu, S....
-
[3]
The polynomialf(x)admits a rational SOS decomposition, that is, there exist polynomialsf i(x)∈Q[x]such that f(x) = rX i=1 fi(x)2
-
[4]
There exists a symmetric positive semidefinite Gram matrixG∈S m ∩Q m×m such that f(x) =v(x) ⊤Gv(x), G⪰0. For the numerical solution GN obtained after the Gauss–Newton refinement, when the backward errorθ is sufficiently small, one can recover from GN an exact rational PSD matrix eG, which satisfies the polynomial identity exactly, thereby yielding a certi...
work page 2013
-
[5]
Analyze the input polynomial carefully, focusing on the coefficients and the combinations of variables involved
-
[6]
Infer possible linear or polynomial terms inside each square in the sum-of-squares expression
-
[7]
Keep the output compact and well-structured
Construct a sum of square terms without expanding the squares. Keep the output compact and well-structured
-
[8]
Minor numerical deviations are acceptable
Aim for the expanded form of your SOS expression to closely approximate the coefficients in the original polynomial. Minor numerical deviations are acceptable
-
[9]
If multiple valid SOS decompositions exist, prefer one that is simple, symmetric, and easy to interpret
-
[10]
Include variables and constants inside parentheses when appropriate to match constant terms in the input polynomial. Output format: Please provide your response in the following structure: <SOS Expression>: <sum of squares expression> Where<sum of squares expression>is a sum-of-squares term, expressed compactly. For example: (x1 + 1)ˆ2 + (2 *x1 + 3 *x2)ˆ2...
-
[11]
The output must be a sum-of-squares expression
Do not simply rewrite the input polynomial or output expanded terms. The output must be a sum-of-squares expression
-
[12]
Constants inside square terms can be fractional or decimal, as needed, to best approximate the original polynomial
-
[13]
Avoid expanding the squares in the output; always keep terms inside parentheses squared
-
[14]
Symmetry and simplicity are preferred if multiple answers fit
Aim for clear and interpretable variable groupings. Symmetry and simplicity are preferred if multiple answers fit
-
[15]
Your SOS expression should fully explain the input polynomial’s structure and coefficients to the best achievable extent. Example: Input polynomial: x1ˆ2 + 2 *x1 + 1 Output: (SOS Expression): (x1 + 1)ˆ2 Explanation: The polynomial is a perfect square trinomial; the SOS reconstruction is exact. Input polynomial: 5*x1ˆ2 + 12 *x1*x2 + 6 *x1 + 9 *x2ˆ2 + 9 Out...
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.