pith. sign in

arxiv: 2605.16169 · v1 · pith:SJ5FT4V2new · submitted 2026-05-15 · 💻 cs.LO · cs.MS· physics.chem-ph

LeanBET: Formally-verified surface area calculations in Lean

Pith reviewed 2026-05-19 18:36 UTC · model grok-4.3

classification 💻 cs.LO cs.MSphysics.chem-ph
keywords formal verificationLean theorem proverBET surface areaadsorption isothermslinear regressiontheorem provingscientific computing
0
0 comments X

The pith

A fully executable BET surface area pipeline has been formally verified in Lean with proofs over the reals and floating-point execution that matches a reference implementation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents an executable and formally verified implementation of the BET method for estimating surface areas from adsorption isotherms using the Lean 4 theorem prover. It formalizes the full BETSI-style workflow including window enumeration, monotonicity checks, knee selection, and linear regression, carrying out computations in floating-point while developing corresponding correctness proofs over the real numbers via a shared polymorphic implementation. On the proof side the regression coefficients are shown to agree with their specification-level definitions and to minimize the least-squares error, the window enumeration is proved sound and complete, and the admissibility checks and knee-based selection are shown to satisfy their formal specifications. The algebraic derivation of the BET linearized expression is connected directly to the executable pipeline. Evaluation against the BETSI reference on benchmark isotherms shows agreement to machine precision for 18 of 19 cases with only a 0.03 percent deviation on one dataset.

Core claim

The central claim is that a complete BET Surface Identification workflow can be implemented in Lean such that the same code supports both floating-point execution and real-number correctness proofs, with the regression coefficients agreeing with their specification-level definitions and minimizing the least-squares error under the stated assumptions, the window enumeration being sound and complete, and the admissibility checks and knee-based selection satisfying their formal specifications.

What carries the argument

shared polymorphic implementation that supports both floating-point arithmetic for execution and real-number proofs for verification

If this is right

  • The regression coefficients returned by the algorithm agree with their specification-level definitions and minimize the least-squares error under the stated assumptions.
  • The window enumeration is sound and complete.
  • The admissibility checks and knee-based selection satisfy their formal specifications.
  • The implementation agrees with the BETSI reference to machine precision on benchmark isotherms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This verification approach could be applied to other standard methods for analyzing adsorption data beyond the BET model.
  • Materials characterization workflows might incorporate such verified components to increase confidence in reported surface areas.
  • The polymorphic style offers a template for other scientific computing tasks that require both executable results and formal guarantees.

Load-bearing premise

The floating-point implementation is assumed to be a faithful enough approximation of the real-number proofs for the numerical results to be trusted in practice.

What would settle it

An adsorption isotherm on which the Lean implementation produces a surface area differing from the BETSI reference by more than 0.1 percent, or a mathematical counterexample to the claim that the returned regression coefficients minimize the least-squares error.

Figures

Figures reproduced from arXiv: 2605.16169 by Colin T. Jones, Ejike D. Ugwuanyi, John Velkey, Tyler R. Josephson.

Figure 1
Figure 1. Figure 1: Overview of the formally verified BET analysis pipeline. The executable algorithm (top row) transforms adsorption data through BET linearization, windowed linear regression, and BETSI-style filter￾ing to compute surface area and error metrics. Each computational step is paired with a corresponding formal proof (bottom blocks) establishing algebraic correctness of the BET linearization, least-squares op￾tim… view at source ↗
Figure 2
Figure 2. Figure 2: Polymorphic functions link proofs (over idealized Real numbers) with execution (over floating [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗
read the original abstract

The Brunauer--Emmett--Teller (BET) method is a standard tool for estimating surface areas from adsorption isotherms, yet practical implementations involve multiple algorithmic steps whose correctness is rarely made explicit. In this work, we present a fully executable and formally verified BET analysis pipeline implemented in the Lean~4 theorem prover. Our formalization covers the complete BET Surface Identification (BETSI)-style workflow, including window enumeration, monotonicity checks, knee selection, and linear regression. We carry out computations in floating-point arithmetic and develop the corresponding correctness proofs over the real numbers, using a shared polymorphic implementation that supports both. On the proof side, we show that the regression coefficients returned by the algorithm agree with their specification-level definitions and minimize the least-squares error under the stated assumptions. We also formalize the algebraic derivation of the BET linearized expression and connect that result directly to the executable analysis pipeline. We further prove that the window enumeration is sound and complete, and that the admissibility checks and knee-based selection satisfy their formal specifications. We evaluate the implementation against the BETSI reference method on benchmark adsorption isotherms. Compared to BETSI, LeanBET agrees to machine precision for 18 of the 19 isotherms, with only a 0.03\% deviation for the UiO-66 dataset. This demonstrates that a scientific computing workflow can be built in Lean, yielding both formal verification guarantees and numerical agreement with an established Python reference implementation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents LeanBET, a fully executable and formally verified BET analysis pipeline in Lean 4. It formalizes the complete BETSI-style workflow, including window enumeration, monotonicity checks, knee selection, linear regression, and the algebraic derivation of the BET linearized expression. Proofs over the reals show that regression coefficients agree with specifications and minimize least-squares error, that window enumeration is sound and complete, and that admissibility checks and knee selection satisfy their specs. Computations are performed in floating-point using a polymorphic implementation, with numerical evaluation showing agreement to machine precision with the BETSI Python reference on 18 of 19 isotherms, and a small 0.03% deviation on UiO-66.

Significance. If the results hold, this work is significant for demonstrating the application of interactive theorem proving to a real-world scientific computing task in physical chemistry. The machine-checked proofs for key components of the BET method, the polymorphic real/float implementation, and the direct linkage to the algebraic derivation are notable strengths. It provides both formal guarantees and practical validation against an established reference, highlighting the potential for verified scientific workflows.

major comments (2)
  1. [Implementation and proofs section] Implementation and proofs section: The correctness proofs (regression coefficients minimizing least-squares error, window soundness/completeness, admissibility/knee checks) are developed over the reals for the polymorphic code. However, the executable pipeline performs computations in floating-point arithmetic, and no formal semantics for Lean floats (IEEE 754 rounding) or transfer theorem from the real-number proofs to float execution is supplied. This assumption is load-bearing for the central claim of a formally verified surface-area calculation pipeline, even though empirical agreement to machine precision is shown on 18/19 isotherms.
  2. [Evaluation section] Evaluation section: The manuscript reports agreement with BETSI on 18 of 19 isotherms to machine precision (0.03% deviation on UiO-66) but provides no floating-point error analysis or bounds that would rigorously connect the real-number proofs to the observed numerical results. This leaves the practical trustworthiness of the verified pipeline dependent on an unproven approximation.
minor comments (2)
  1. [Abstract] The abstract states agreement on '18 of the 19 isotherms' without identifying the deviating dataset; adding this detail would improve readability.
  2. [Notation and definitions] Notation for knee-based selection and admissibility checks could benefit from an additional inline example or expanded definition to aid readers unfamiliar with BETSI.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive assessment of the work's significance and for the detailed, constructive comments. We address each major comment point by point below.

read point-by-point responses
  1. Referee: [Implementation and proofs section] The correctness proofs (regression coefficients minimizing least-squares error, window soundness/completeness, admissibility/knee checks) are developed over the reals for the polymorphic code. However, the executable pipeline performs computations in floating-point arithmetic, and no formal semantics for Lean floats (IEEE 754 rounding) or transfer theorem from the real-number proofs to float execution is supplied. This assumption is load-bearing for the central claim of a formally verified surface-area calculation pipeline, even though empirical agreement to machine precision is shown on 18/19 isotherms.

    Authors: We agree that the formal proofs are developed over the reals while the executable pipeline uses floating-point arithmetic. Lean's Float type follows IEEE 754, but a complete formal semantics for rounding modes together with a transfer theorem lifting the real-number results to the floating-point implementation would require a substantial separate development. In the revised manuscript we have added an explicit discussion clarifying that the machine-checked guarantees apply to the real-number specification and that the floating-point execution is validated by direct empirical comparison with BETSI. We have also noted the polymorphic design as the mechanism that keeps the two layers aligned where possible. revision: partial

  2. Referee: [Evaluation section] The manuscript reports agreement with BETSI on 18 of 19 isotherms to machine precision (0.03% deviation on UiO-66) but provides no floating-point error analysis or bounds that would rigorously connect the real-number proofs to the observed numerical results. This leaves the practical trustworthiness of the verified pipeline dependent on an unproven approximation.

    Authors: We acknowledge that a rigorous a-priori floating-point error analysis or derived bounds for the full pipeline is absent from the original submission. Such bounds are difficult to obtain because window selection and knee detection introduce discrete decisions that interact with rounding. In the revised version we have expanded the evaluation section with a short discussion of the sources of the single observed discrepancy (UiO-66) and of the implementation differences between Lean and the Python reference. The empirical agreement to machine precision on the remaining 18 isotherms remains the primary practical evidence we offer. revision: partial

Circularity Check

0 steps flagged

No significant circularity; verification grounded in external specs and independent reference

full rationale

The paper's central results consist of Lean proofs that the polymorphic implementation's regression coefficients match specification-level least-squares definitions, that window enumeration is sound and complete, and that admissibility/knee checks satisfy their specs. These are connected to the standard algebraic derivation of the BET linearized form and validated by direct numerical agreement (to machine precision on 18/19 cases) against the independent BETSI Python reference. The floating-point execution is covered by an explicit assumption of faithful approximation rather than any self-referential definition, fitted parameter renamed as prediction, or load-bearing self-citation chain. No step reduces by construction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The verification rests on the standard axioms of real analysis for the regression and monotonicity proofs, plus the assumption that the external BETSI reference implementation is correct. No new physical constants or ad-hoc entities are introduced.

axioms (2)
  • standard math Standard axioms of the real numbers and the algebraic properties of linear regression
    Invoked when proving that the returned coefficients minimize least-squares error and agree with the specification.
  • domain assumption Correctness of the reference BETSI Python implementation
    Used as the external benchmark for numerical agreement on the 19 isotherms.

pith-pipeline@v0.9.0 · 5806 in / 1489 out tokens · 46466 ms · 2026-05-19T18:36:32.727924+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages · 1 internal anchor

  1. [1]

    Stephen Brunauer, P. H. Emmett, and Edward Teller. Adsorption of Gases in Multimolecular Layers. Journal of the American Chemical Society, 60(2):309–319, February 1938. URL:https://pubs.a cs.org/doi/abs/10.1021/ja01269a023,doi:10.1021/ja01269a023

  2. [2]

    Van Erp and Johan A

    Titus S. Van Erp and Johan A. Martens. A standardization for BET fitting of adsorption isotherms. Microporous and Mesoporous Materials, 145(1-3):188–193, November 2011. URL:https://link inghub.elsevier.com/retrieve/pii/S1387181111002435,doi:10.1016/j.micromeso.20 11.05.022

  3. [3]

    Johannes W. M. Osterrieth, James Rampersad, David Madden, Nakul Rampal, Luka Skoric, Bethany Connolly, Mark D. Allendorf, Vitalie Stavila, Jonathan L. Snider, Rob Ameloot, João Marreiros, Conchi Ania, Diana Azevedo, Enrique Vilarrasa-Garcia, Bianca F. Santos, Xian-He Bu, Ze Chang, Hana Bunzen, Neil R. Champness, Sarah L. Griffin, Banglin Chen, Rui-Biao Li...

  4. [4]

    Macdonald, Vladimir Martis, and Ivan P

    Filip Ambroz, Thomas J. Macdonald, Vladimir Martis, and Ivan P. Parkin. Evaluation of the BET Theory for the Characterization of Meso and Microporous MOFs.Small Methods, 2(11):1800173, November 2018. URL:https://onlinelibrary.wiley.com/doi/10.1002/smtd.201800173, doi:10.1002/smtd.201800173

  5. [5]

    D.D. Do, H.D. Do, and D. Nicholson. A computer appraisal of BET theory, BET surface area and the calculation of surface excess for gas adsorption on a graphite surface.Chemical Engineering Science, 65(10):3331–3340, May 2010. URL:https://linkinghub.elsevier.com/retrieve/pii/S00 09250910000965,doi:10.1016/j.ces.2010.02.023

  6. [6]

    Rouquerol, D

    J. Rouquerol, D. Avnir, C. W. Fairbridge, D. H. Everett, J. M. Haynes, N. Pernicone, J. D. F. Ramsay, K. S. W. Sing, and K. K. Unger. Recommendations for the characterization of porous solids (Technical Report).Pure and Applied Chemistry, 66(8):1739–1758, January 1994. Publisher: De Gruyter. URL: https://www.degruyterbrill.com/document/doi/10.1351/pac1994...

  7. [7]

    In: Computer Vision – ECCV 2018

    Leonardo De Moura and Sebastian Ullrich. The Lean 4 Theorem Prover and Programming Language. In André Platzer and Geoff Sutcliffe, editors,Automated Deduction – CADE 28, volume 12699, pages 625–635. Springer International Publishing, Cham, 2021. Series Title: Lecture Notes in Computer Science. URL:https://link.springer.com/10.1007/978- 3- 030- 79876- 5_37...

  8. [8]

    THE ADSORPTION OF GASES ON PLANE SURFACES OF GLASS, MICA AND PLATINUM.Journal of the American Chemical Society, 40(9):1361–1403, September 1918

    Irving Langmuir. THE ADSORPTION OF GASES ON PLANE SURFACES OF GLASS, MICA AND PLATINUM.Journal of the American Chemical Society, 40(9):1361–1403, September 1918. URL:https://pubs.acs.org/doi/abs/10.1021/ja02242a004,doi:10.1021/ja02242a004

  9. [9]

    Bobbin, Samiha Sharlin, Parivash Feyzishendi, An Hong Dang, Catherine M

    Maxwell P. Bobbin, Samiha Sharlin, Parivash Feyzishendi, An Hong Dang, Catherine M. Wraback, and Tyler R. Josephson. Formalizing chemical physics using the Lean theorem prover.Digi- tal Discovery, 3(2):264–280, 2024. URL:https://xlink.rsc.org/?DOI=D3DD00077J, doi:10.1039/D3DD00077J

  10. [10]

    Microsoft, 2023

    David Thrane Christiansen.Functional Programming in Lean. Microsoft, 2023. URL:https: //leanprover.github.io/functional_programming_in_lean/

  11. [11]

    lecopivo/SciLean, April 2025

    Tomáš Skˇrivan. lecopivo/SciLean, April 2025. original-date: 2021-09-27T21:50:10Z. URL:https: //github.com/lecopivo/SciLean

  12. [12]

    TorchLean: Formalizing Neural Networks in Lean, 2026

    Robert Joseph George, Jennifer Cruden, Xiangru Zhong, Huan Zhang, and Anima Anandkumar. TorchLean: Formalizing Neural Networks in Lean, 2026. Version Number: 1. URL:https: //arxiv.org/abs/2602.22631,doi:10.48550/ARXIV.2602.22631

  13. [13]

    Daniel Selsam, Percy Liang, and David L. Dill. Developing Bug-Free Machine Learning Systems With Formal Mathematics. InProceedings of the 34th International Conference on Machine Learning, pages 3047–3056. PMLR, July 2017. ISSN: 2640-3498. URL:https://proceedings.mlr.pres s/v70/selsam17a.html

  14. [14]

    Exploring the capabilities of the Lean interactive theorem prover

    Adrián Doña Mateo. Exploring the capabilities of the Lean interactive theorem prover. Technical report, Computer Science and Mathematics School of Informatics University of Edinburgh, 2023. URL:https://webhomes.maths.ed.ac.uk/~adona/files/honours_project.pdf

  15. [15]

    Programming in Lean

    Jeremy Avigad and Simon Hudon. Programming in Lean. Online tutorial manuscript, Carnegie Mellon University, 2019. URL:https://avigad.github.io/programming_in_lean/programming_in_ lean.pdf

  16. [16]

    Flocq: A Unified Library for Proving Floating-Point Al- gorithms in Coq

    Sylvie Boldo and Guillaume Melquiond. Flocq: A Unified Library for Proving Floating-Point Al- gorithms in Coq. In2011 IEEE 20th Symposium on Computer Arithmetic, pages 243–252, Tuebin- gen, Germany, July 2011. IEEE. URL:http://ieeexplore.ieee.org/document/5992132/, doi:10.1109/ARITH.2011.40. 20

  17. [17]

    Certifying floating-point implementations using Gappa

    Florent De Dinechin, Christoph Quirin Lauter, and Guillaume Melquiond. Certifying floating-point implementations using Gappa, 2008. Version Number: 1. URL:https://arxiv.org/abs/0801.0 523,doi:10.48550/ARXIV.0801.0523

  18. [18]

    Flean: Floating Point Numbers in Lean, 2025

    Joseph McKinsey. Flean: Floating Point Numbers in Lean, 2025. URL:https://josephmckinsey .com/flean.html

  19. [19]

    Floating-point arithmetic in the Coq system.Information and Computation, 216:14–23, July 2012

    Guillaume Melquiond. Floating-point arithmetic in the Coq system.Information and Computation, 216:14–23, July 2012. URL:https://linkinghub.elsevier.com/retrieve/pii/S0890540112 000739,doi:10.1016/j.ic.2011.09.005

  20. [20]

    LeanCert: Verified Numerical Computation in Lean 4, 2026

    Alejandro Radisic. LeanCert: Verified Numerical Computation in Lean 4, 2026. URL:https: //github.com/alerad/leancert

  21. [21]

    Ugwuanyi, Colin T

    Ejike D. Ugwuanyi, Colin T. Jones, John Velkey, and Tyler R. Josephson. Benchmarking energy calculations using formal proofs.Molecular Physics, 123(21-22):e2539421, November 2025. URL: https://www.tandfonline.com/doi/full/10.1080/00268976.2025.2539421,doi:10.108 0/00268976.2025.2539421