pith. sign in

arxiv: 2607.00815 · v1 · pith:PQ4SCWMWnew · submitted 2026-07-01 · 💻 cs.LO · cs.AI

LRAT-Catcher: Importing SAT Solver Certificates into Lean4 by Reflection

Pith reviewed 2026-07-02 04:47 UTC · model grok-4.3

classification 💻 cs.LO cs.AI
keywords SAT solversLRAT certificatesLean4reflectionformal verificationSchur numberRamsey numbercube-and-conquer
0
0 comments X

The pith

LRAT-Catcher imports SAT solver certificates into Lean4 theorems by running the verified checker via reflection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents LRAT-Catcher as a tool that accepts a DIMACS formula and its LRAT certificate and produces a corresponding Lean4 theorem. It achieves this by invoking the formally verified LRAT checker already present in Lean core, but executing it as compiled native code through reflection rather than constructing an explicit proof term. This design also supports combining multiple cube-and-conquer runs inside Lean by merging per-cube refutations with a separate cover-completeness LRAT proof. The resulting theorems are connected back to the original combinatorial statements through verified encodings. Evaluation on the Schur number S(4)=44 and the Ramsey number R(4,4)=18 shows the method succeeds where direct proof-term construction exhausts memory.

Core claim

LRAT-Catcher is a standalone tool that imports a DIMACS formula together with an LRAT certificate into Lean 4 as a theorem by running the formally verified LRAT checker from Lean core as compiled native code via reflection. This scales to instances where Mathlib's explicit proof-term import exhausts memory. LRAT-Catcher also composes cube-and-conquer solving runs entirely inside Lean by combining per-cube refutations with a cover-completeness certificate that is itself an LRAT proof, then connects the CNF-level results to the original combinatorial problems through verified encodings.

What carries the argument

Reflection that executes the compiled LRAT checker from Lean core as native code to verify certificates and generate theorems without building explicit proof terms.

If this is right

  • Combinatorial results whose SAT encodings exceed the memory limits of explicit proof-term import become provable as Lean theorems.
  • Cube-and-conquer decompositions can be verified inside Lean by merging per-cube LRAT proofs with one cover-completeness certificate.
  • CNF-level unsatisfiability statements can be lifted to statements about the original combinatorial objects via verified encodings.
  • External SAT solver output can be checked inside the theorem prover without relying on separate external tools at verification time.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same reflection approach could be applied to other certificate formats if Lean core supplies a verified checker for them.
  • Larger Ramsey or Schur numbers might become reachable inside Lean once bigger LRAT certificates are generated externally.
  • The method reduces dependence on external checkers by moving the verification step inside the prover while still using native execution speed.

Load-bearing premise

Lean 4's reflection mechanism executes the LRAT checker soundly and without introducing verification errors or gaps when importing certificates.

What would settle it

An LRAT certificate for an unsatisfiable instance that LRAT-Catcher turns into a Lean theorem that is later shown false by an independent check, or a run on the Schur or Ramsey instances that still exhausts memory.

Figures

Figures reproduced from arXiv: 2607.00815 by Stefan Szeider.

Figure 1
Figure 1. Figure 1: Peak memory against certificate size, across the pigeonhole and Schur benchmark [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
read the original abstract

SAT solvers settle combinatorial problems beyond the reach of interactive theorem provers and produce LRAT certificates for independent verification. We present LRAT-Catcher, a standalone, general-purpose tool that imports a DIMACS formula together with an LRAT certificate into Lean 4 as a theorem. LRAT-Catcher runs the formally verified LRAT checker from Lean core as compiled native code via reflection. This scales to instances where Mathlib's explicit proof-term import exhausts memory. LRAT-Catcher also composes cube-and-conquer solving runs entirely inside Lean. Per-cube refutations are combined with a cover-completeness certificate, itself an LRAT proof, into a single unsatisfiability theorem. Verified encodings connect CNF-level results to the original combinatorial problems. We evaluate the tool against Mathlib's proof-term import and the external checker cake_lpr on establishing the Schur number S(4) = 44 and the Ramsey number R(4,4) = 18 as Lean theorems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper presents LRAT-Catcher, a standalone tool that imports a DIMACS formula and LRAT certificate into Lean 4 as a theorem by reflecting the formally verified LRAT checker from Lean core as compiled native code. It claims this scales to instances where Mathlib's explicit proof-term import exhausts memory, and that it can compose cube-and-conquer solving runs entirely inside Lean by combining per-cube refutations with a cover-completeness LRAT certificate into a single unsatisfiability theorem, with verified encodings connecting to the original combinatorial problems. Evaluation is reported on establishing Schur number S(4)=44 and Ramsey number R(4,4)=18 as Lean theorems, compared against Mathlib import and the external cake_lpr checker.

Significance. If the reflection mechanism preserves the verified semantics of the LRAT checker, the work provides a practical, memory-efficient bridge between SAT solvers and interactive theorem proving in Lean, leveraging an existing verified component rather than reimplementing it. The ability to perform cube-and-conquer composition inside Lean and the concrete evaluations on S(4) and R(4,4) demonstrate applicability to hard combinatorial results with machine-checked connections to the original statements.

major comments (2)
  1. [Abstract] Abstract: the central claim that LRAT-Catcher 'runs the formally verified LRAT checker from Lean core as compiled native code via reflection' without introducing verification errors or soundness gaps is load-bearing for the scaling results on S(4)=44 and R(4,4)=18, yet the manuscript provides no explicit argument, auxiliary lemma, or discussion of how the reflection layer preserves the checker's verified semantics for arbitrary LRAT proofs.
  2. [Abstract] Abstract (evaluation paragraph): the reported success on two concrete instances is presented without implementation details, error analysis, or discussion of potential reflection soundness issues, leaving the practical scaling claim only moderately supported.
minor comments (1)
  1. [Abstract] The abstract would benefit from a brief statement of the precise Lean 4 reflection primitives used (e.g., which meta-programming or native-code invocation mechanism).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We agree that the abstract and evaluation section would benefit from additional discussion of the reflection mechanism's soundness properties. We will revise the manuscript to address both points explicitly.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that LRAT-Catcher 'runs the formally verified LRAT checker from Lean core as compiled native code via reflection' without introducing verification errors or soundness gaps is load-bearing for the scaling results on S(4)=44 and R(4,4)=18, yet the manuscript provides no explicit argument, auxiliary lemma, or discussion of how the reflection layer preserves the checker's verified semantics for arbitrary LRAT proofs.

    Authors: We acknowledge that the abstract does not contain an explicit argument or auxiliary lemma addressing preservation of the LRAT checker's verified semantics under reflection. The implementation uses Lean's native reflection to execute the already-verified checker definition from Lean core; however, we agree this requires explicit justification in the paper. We will add a new subsection (likely in Section 3 or 4) that (1) recalls the relevant Lean reflection primitives, (2) states the informal argument that the reflected execution is semantically equivalent to the verified definition because the checker is extracted from the same definition used in the kernel, and (3) notes that any remaining trust assumptions are identical to those already present in Lean core's LRAT checker. An auxiliary lemma formalizing the equivalence will be stated (even if its proof is external to the main development). revision: yes

  2. Referee: [Abstract] Abstract (evaluation paragraph): the reported success on two concrete instances is presented without implementation details, error analysis, or discussion of potential reflection soundness issues, leaving the practical scaling claim only moderately supported.

    Authors: We agree that the evaluation paragraph in the abstract (and the corresponding section in the body) lacks sufficient implementation detail and explicit treatment of reflection-related soundness considerations. We will expand both the abstract and the evaluation section to include: (a) a brief description of the reflection call site and the size of the LRAT proofs handled, (b) the concrete resource measurements (memory and time) that demonstrate the improvement over Mathlib's proof-term import, and (c) a short paragraph discussing why the reflection layer does not introduce new soundness gaps beyond those already accepted for the core checker. These additions will be supported by the new subsection mentioned in the first response. revision: yes

Circularity Check

0 steps flagged

No circularity in derivation chain

full rationale

The paper's central mechanism imports LRAT certificates by invoking the pre-existing formally verified LRAT checker from Lean core via reflection, then evaluates scaling on concrete combinatorial theorems (S(4)=44, R(4,4)=18) and cube-and-conquer composition. No step reduces by definition to its own output, renames a fitted input as a prediction, or relies on a load-bearing self-citation chain whose justification is internal to the paper. The soundness premise is anchored in Lean's external reflection facility and core library verification, which are independent of the present work and not derived within it.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the soundness of Lean's reflection mechanism and the pre-verified status of the LRAT checker in Lean core; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption Lean's reflection mechanism preserves the soundness of the executed LRAT checker
    The paper depends on reflection correctly running the checker without introducing errors, as stated in the description of how LRAT-Catcher operates.

pith-pipeline@v0.9.1-grok · 5697 in / 1211 out tokens · 23992 ms · 2026-07-02T04:47:44.685594+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

12 extracted references · 8 canonical work pages

  1. [1]

    Keizer, Léon Frénot, Ab- dalrhman Mohamed, Léo Stefanesco, Harun Khan, Joshua Clune, Clark W

    Henrik Böving, Siddharth Bhat, Luisa Cicolini, Alex C. Keizer, Léon Frénot, Ab- dalrhman Mohamed, Léo Stefanesco, Harun Khan, Joshua Clune, Clark W. Barrett, and Tobias Grosser. Interactive bitvector reasoning using verified bit-blasting.Proc. ACM Program. Lang., 9(OOPSLA2):3259–3285, 2025. doi: 10.1145/3763167. URL https://doi.org/10.1145/3763167

  2. [2]

    Luís Cruz-Filipe, Marijn J. H. Heule, Warren A. Hunt Jr., Matt Kaufmann, and Pe- ter Schneider-Kamp. Efficient certified RAT verification. In Leonardo de Moura, editor,Automated Deduction - CADE 26 - 26th International Conference on Auto- mated Deduction, Gothenburg, Sweden, August 6-11, 2017, Proceedings, volume 10395 ofLecture Notes in Computer Science,...

  3. [3]

    James Gallicchio, Cayden Codel, Jeremy Avigad, and Marijn J. H. Heule. An end-to- end verification of Keller’s conjecture. In17th International Conference on Interactive Theorem Proving (ITP 2026), LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum für Infor- matik, 2026. To appear

  4. [4]

    Cube and con- quer: Guiding CDCL SAT solvers by lookaheads

    Marijn Heule, Oliver Kullmann, Siert Wieringa, and Armin Biere. Cube and con- quer: Guiding CDCL SAT solvers by lookaheads. In Kerstin Eder, João Lourenço, and Onn Shehory, editors,Hardware and Software: Verification and Testing - 7th International Haifa Verification Conference, HVC 2011, Haifa, Israel, December 6- 8, 2011, Revised Selected Papers, volume...

  5. [5]

    Marijn J. H. Heule. Schur number five. In Sheila A. McIlraith and Kilian Q. Wein- berger, editors,Proceedings of the Thirty-Second AAAI Conference on Artificial Intel- ligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI- 18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orle...

  6. [6]

    Formally verified graph generation with SAT modulo symmetries and Lean

    Markus Kirchweger, Pablo Manrique, and Stefan Szeider. Formally verified graph generation with SAT modulo symmetries and Lean. InInternational Joint Confer- ence on Automated Reasoning (IJCAR 2026), part of the Federated Logic Conference (FLoC 2026), Lisbon, Portugal, July 26–29, 2026

  7. [7]

    Fast and verified UNSAT certificate checking

    Peter Lammich. Fast and verified UNSAT certificate checking. In Christoph Benzmüller, Marijn J. H. Heule, and Renate A. Schmidt, editors,Automated Rea- soning - 12th International Joint Conference, IJCAR 2024, Nancy, France, July 3-6, 2024, Proceedings, Part I, volume 14739 ofLecture Notes in Computer Sci- ence, pages 439–457. Springer, 2024. doi: 10.1007...

  8. [8]

    Trusted scalable SAT solving with on-the-fly LRAT checking

    Dominik Schreiber. Trusted scalable SAT solving with on-the-fly LRAT checking. In Supratik Chakraborty and Jie-Hong Roland Jiang, editors,27th International Con- ference on Theory and Applications of Satisfiability Testing, SAT 2024, Pune, In- dia, August 21-24, 2024, volume 305 ofLIPIcs, pages 25:1–25:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik...

  9. [9]

    Codel, Mario Carneiro, and Marijn J

    Bernardo Subercaseaux, Wojciech Nawrocki, James Gallicchio, Cayden R. Codel, Mario Carneiro, and Marijn J. H. Heule. Formal verification of the empty hexagon number. In Yves Bertot, Temur Kutsia, and Michael Norrish, editors,15th Inter- national Conference on Interactive Theorem Proving, ITP 2024, Tbilisi, Georgia, September 9-14, 2024, volume 309 ofLIPIc...

  10. [10]

    PBLean: VeriPB proof certificates for Lean 4

    Stefan Szeider. PBLean: VeriPB proof certificates for Lean 4. In17th International Workshop on Pragmatics of SAT (PoS 2026), a workshop of the 29th International Conference on Theory and Applications of Satisfiability Testing (SAT 2026) and the Federated Logic Conference (FLoC 2026), Lisbon, Portugal, July 19, 2026.https: //github.com/leansolving/pblean. 10

  11. [11]

    Yong Kiam Tan, Marijn J. H. Heule, and Magnus O. Myreen. cake_lpr: Verified propa- gationredundancycheckingincakeml. InJanFrisoGrooteandKimGuldstrandLarsen, editors,Tools and Algorithms for the Construction and Analysis of Systems - 27th Inter- national Conference, TACAS 2021, Held as Part of the European Joint Conferences on Theory and Practice of Softwa...

  12. [12]

    Part of Mathlib4, the Lean 4 mathematical library, 2024.https://github

    The Mathlib Community.Mathlib.Tactic.Sat.FromLRAT: importing LRAT proofs into lean. Part of Mathlib4, the Lean 4 mathematical library, 2024.https://github. com/leanprover-community/mathlib4. 11