LeanBET: Formally-verified surface area calculations in Lean
Pith reviewed 2026-05-19 18:36 UTC · model grok-4.3
The pith
A fully executable BET surface area pipeline has been formally verified in Lean with proofs over the reals and floating-point execution that matches a reference implementation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a complete BET Surface Identification workflow can be implemented in Lean such that the same code supports both floating-point execution and real-number correctness proofs, with the regression coefficients agreeing with their specification-level definitions and minimizing the least-squares error under the stated assumptions, the window enumeration being sound and complete, and the admissibility checks and knee-based selection satisfying their formal specifications.
What carries the argument
shared polymorphic implementation that supports both floating-point arithmetic for execution and real-number proofs for verification
If this is right
- The regression coefficients returned by the algorithm agree with their specification-level definitions and minimize the least-squares error under the stated assumptions.
- The window enumeration is sound and complete.
- The admissibility checks and knee-based selection satisfy their formal specifications.
- The implementation agrees with the BETSI reference to machine precision on benchmark isotherms.
Where Pith is reading between the lines
- This verification approach could be applied to other standard methods for analyzing adsorption data beyond the BET model.
- Materials characterization workflows might incorporate such verified components to increase confidence in reported surface areas.
- The polymorphic style offers a template for other scientific computing tasks that require both executable results and formal guarantees.
Load-bearing premise
The floating-point implementation is assumed to be a faithful enough approximation of the real-number proofs for the numerical results to be trusted in practice.
What would settle it
An adsorption isotherm on which the Lean implementation produces a surface area differing from the BETSI reference by more than 0.1 percent, or a mathematical counterexample to the claim that the returned regression coefficients minimize the least-squares error.
Figures
read the original abstract
The Brunauer--Emmett--Teller (BET) method is a standard tool for estimating surface areas from adsorption isotherms, yet practical implementations involve multiple algorithmic steps whose correctness is rarely made explicit. In this work, we present a fully executable and formally verified BET analysis pipeline implemented in the Lean~4 theorem prover. Our formalization covers the complete BET Surface Identification (BETSI)-style workflow, including window enumeration, monotonicity checks, knee selection, and linear regression. We carry out computations in floating-point arithmetic and develop the corresponding correctness proofs over the real numbers, using a shared polymorphic implementation that supports both. On the proof side, we show that the regression coefficients returned by the algorithm agree with their specification-level definitions and minimize the least-squares error under the stated assumptions. We also formalize the algebraic derivation of the BET linearized expression and connect that result directly to the executable analysis pipeline. We further prove that the window enumeration is sound and complete, and that the admissibility checks and knee-based selection satisfy their formal specifications. We evaluate the implementation against the BETSI reference method on benchmark adsorption isotherms. Compared to BETSI, LeanBET agrees to machine precision for 18 of the 19 isotherms, with only a 0.03\% deviation for the UiO-66 dataset. This demonstrates that a scientific computing workflow can be built in Lean, yielding both formal verification guarantees and numerical agreement with an established Python reference implementation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents LeanBET, a fully executable and formally verified BET analysis pipeline in Lean 4. It formalizes the complete BETSI-style workflow, including window enumeration, monotonicity checks, knee selection, linear regression, and the algebraic derivation of the BET linearized expression. Proofs over the reals show that regression coefficients agree with specifications and minimize least-squares error, that window enumeration is sound and complete, and that admissibility checks and knee selection satisfy their specs. Computations are performed in floating-point using a polymorphic implementation, with numerical evaluation showing agreement to machine precision with the BETSI Python reference on 18 of 19 isotherms, and a small 0.03% deviation on UiO-66.
Significance. If the results hold, this work is significant for demonstrating the application of interactive theorem proving to a real-world scientific computing task in physical chemistry. The machine-checked proofs for key components of the BET method, the polymorphic real/float implementation, and the direct linkage to the algebraic derivation are notable strengths. It provides both formal guarantees and practical validation against an established reference, highlighting the potential for verified scientific workflows.
major comments (2)
- [Implementation and proofs section] Implementation and proofs section: The correctness proofs (regression coefficients minimizing least-squares error, window soundness/completeness, admissibility/knee checks) are developed over the reals for the polymorphic code. However, the executable pipeline performs computations in floating-point arithmetic, and no formal semantics for Lean floats (IEEE 754 rounding) or transfer theorem from the real-number proofs to float execution is supplied. This assumption is load-bearing for the central claim of a formally verified surface-area calculation pipeline, even though empirical agreement to machine precision is shown on 18/19 isotherms.
- [Evaluation section] Evaluation section: The manuscript reports agreement with BETSI on 18 of 19 isotherms to machine precision (0.03% deviation on UiO-66) but provides no floating-point error analysis or bounds that would rigorously connect the real-number proofs to the observed numerical results. This leaves the practical trustworthiness of the verified pipeline dependent on an unproven approximation.
minor comments (2)
- [Abstract] The abstract states agreement on '18 of the 19 isotherms' without identifying the deviating dataset; adding this detail would improve readability.
- [Notation and definitions] Notation for knee-based selection and admissibility checks could benefit from an additional inline example or expanded definition to aid readers unfamiliar with BETSI.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the work's significance and for the detailed, constructive comments. We address each major comment point by point below.
read point-by-point responses
-
Referee: [Implementation and proofs section] The correctness proofs (regression coefficients minimizing least-squares error, window soundness/completeness, admissibility/knee checks) are developed over the reals for the polymorphic code. However, the executable pipeline performs computations in floating-point arithmetic, and no formal semantics for Lean floats (IEEE 754 rounding) or transfer theorem from the real-number proofs to float execution is supplied. This assumption is load-bearing for the central claim of a formally verified surface-area calculation pipeline, even though empirical agreement to machine precision is shown on 18/19 isotherms.
Authors: We agree that the formal proofs are developed over the reals while the executable pipeline uses floating-point arithmetic. Lean's Float type follows IEEE 754, but a complete formal semantics for rounding modes together with a transfer theorem lifting the real-number results to the floating-point implementation would require a substantial separate development. In the revised manuscript we have added an explicit discussion clarifying that the machine-checked guarantees apply to the real-number specification and that the floating-point execution is validated by direct empirical comparison with BETSI. We have also noted the polymorphic design as the mechanism that keeps the two layers aligned where possible. revision: partial
-
Referee: [Evaluation section] The manuscript reports agreement with BETSI on 18 of 19 isotherms to machine precision (0.03% deviation on UiO-66) but provides no floating-point error analysis or bounds that would rigorously connect the real-number proofs to the observed numerical results. This leaves the practical trustworthiness of the verified pipeline dependent on an unproven approximation.
Authors: We acknowledge that a rigorous a-priori floating-point error analysis or derived bounds for the full pipeline is absent from the original submission. Such bounds are difficult to obtain because window selection and knee detection introduce discrete decisions that interact with rounding. In the revised version we have expanded the evaluation section with a short discussion of the sources of the single observed discrepancy (UiO-66) and of the implementation differences between Lean and the Python reference. The empirical agreement to machine precision on the remaining 18 isotherms remains the primary practical evidence we offer. revision: partial
Circularity Check
No significant circularity; verification grounded in external specs and independent reference
full rationale
The paper's central results consist of Lean proofs that the polymorphic implementation's regression coefficients match specification-level least-squares definitions, that window enumeration is sound and complete, and that admissibility/knee checks satisfy their specs. These are connected to the standard algebraic derivation of the BET linearized form and validated by direct numerical agreement (to machine precision on 18/19 cases) against the independent BETSI Python reference. The floating-point execution is covered by an explicit assumption of faithful approximation rather than any self-referential definition, fitted parameter renamed as prediction, or load-bearing self-citation chain. No step reduces by construction to its own inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Standard axioms of the real numbers and the algebraic properties of linear regression
- domain assumption Correctness of the reference BETSI Python implementation
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.leanLogicNat ≃ Nat unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
polymorphic implementation that supports both [reals and floats]
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Stephen Brunauer, P. H. Emmett, and Edward Teller. Adsorption of Gases in Multimolecular Layers. Journal of the American Chemical Society, 60(2):309–319, February 1938. URL:https://pubs.a cs.org/doi/abs/10.1021/ja01269a023,doi:10.1021/ja01269a023
-
[2]
Titus S. Van Erp and Johan A. Martens. A standardization for BET fitting of adsorption isotherms. Microporous and Mesoporous Materials, 145(1-3):188–193, November 2011. URL:https://link inghub.elsevier.com/retrieve/pii/S1387181111002435,doi:10.1016/j.micromeso.20 11.05.022
-
[3]
Johannes W. M. Osterrieth, James Rampersad, David Madden, Nakul Rampal, Luka Skoric, Bethany Connolly, Mark D. Allendorf, Vitalie Stavila, Jonathan L. Snider, Rob Ameloot, João Marreiros, Conchi Ania, Diana Azevedo, Enrique Vilarrasa-Garcia, Bianca F. Santos, Xian-He Bu, Ze Chang, Hana Bunzen, Neil R. Champness, Sarah L. Griffin, Banglin Chen, Rui-Biao Li...
-
[4]
Macdonald, Vladimir Martis, and Ivan P
Filip Ambroz, Thomas J. Macdonald, Vladimir Martis, and Ivan P. Parkin. Evaluation of the BET Theory for the Characterization of Meso and Microporous MOFs.Small Methods, 2(11):1800173, November 2018. URL:https://onlinelibrary.wiley.com/doi/10.1002/smtd.201800173, doi:10.1002/smtd.201800173
-
[5]
D.D. Do, H.D. Do, and D. Nicholson. A computer appraisal of BET theory, BET surface area and the calculation of surface excess for gas adsorption on a graphite surface.Chemical Engineering Science, 65(10):3331–3340, May 2010. URL:https://linkinghub.elsevier.com/retrieve/pii/S00 09250910000965,doi:10.1016/j.ces.2010.02.023
-
[6]
J. Rouquerol, D. Avnir, C. W. Fairbridge, D. H. Everett, J. M. Haynes, N. Pernicone, J. D. F. Ramsay, K. S. W. Sing, and K. K. Unger. Recommendations for the characterization of porous solids (Technical Report).Pure and Applied Chemistry, 66(8):1739–1758, January 1994. Publisher: De Gruyter. URL: https://www.degruyterbrill.com/document/doi/10.1351/pac1994...
-
[7]
In: Computer Vision – ECCV 2018
Leonardo De Moura and Sebastian Ullrich. The Lean 4 Theorem Prover and Programming Language. In André Platzer and Geoff Sutcliffe, editors,Automated Deduction – CADE 28, volume 12699, pages 625–635. Springer International Publishing, Cham, 2021. Series Title: Lecture Notes in Computer Science. URL:https://link.springer.com/10.1007/978- 3- 030- 79876- 5_37...
-
[8]
Irving Langmuir. THE ADSORPTION OF GASES ON PLANE SURFACES OF GLASS, MICA AND PLATINUM.Journal of the American Chemical Society, 40(9):1361–1403, September 1918. URL:https://pubs.acs.org/doi/abs/10.1021/ja02242a004,doi:10.1021/ja02242a004
-
[9]
Bobbin, Samiha Sharlin, Parivash Feyzishendi, An Hong Dang, Catherine M
Maxwell P. Bobbin, Samiha Sharlin, Parivash Feyzishendi, An Hong Dang, Catherine M. Wraback, and Tyler R. Josephson. Formalizing chemical physics using the Lean theorem prover.Digi- tal Discovery, 3(2):264–280, 2024. URL:https://xlink.rsc.org/?DOI=D3DD00077J, doi:10.1039/D3DD00077J
-
[10]
David Thrane Christiansen.Functional Programming in Lean. Microsoft, 2023. URL:https: //leanprover.github.io/functional_programming_in_lean/
work page 2023
-
[11]
Tomáš Skˇrivan. lecopivo/SciLean, April 2025. original-date: 2021-09-27T21:50:10Z. URL:https: //github.com/lecopivo/SciLean
work page 2025
-
[12]
TorchLean: Formalizing Neural Networks in Lean, 2026
Robert Joseph George, Jennifer Cruden, Xiangru Zhong, Huan Zhang, and Anima Anandkumar. TorchLean: Formalizing Neural Networks in Lean, 2026. Version Number: 1. URL:https: //arxiv.org/abs/2602.22631,doi:10.48550/ARXIV.2602.22631
-
[13]
Daniel Selsam, Percy Liang, and David L. Dill. Developing Bug-Free Machine Learning Systems With Formal Mathematics. InProceedings of the 34th International Conference on Machine Learning, pages 3047–3056. PMLR, July 2017. ISSN: 2640-3498. URL:https://proceedings.mlr.pres s/v70/selsam17a.html
work page 2017
-
[14]
Exploring the capabilities of the Lean interactive theorem prover
Adrián Doña Mateo. Exploring the capabilities of the Lean interactive theorem prover. Technical report, Computer Science and Mathematics School of Informatics University of Edinburgh, 2023. URL:https://webhomes.maths.ed.ac.uk/~adona/files/honours_project.pdf
work page 2023
-
[15]
Jeremy Avigad and Simon Hudon. Programming in Lean. Online tutorial manuscript, Carnegie Mellon University, 2019. URL:https://avigad.github.io/programming_in_lean/programming_in_ lean.pdf
work page 2019
-
[16]
Flocq: A Unified Library for Proving Floating-Point Al- gorithms in Coq
Sylvie Boldo and Guillaume Melquiond. Flocq: A Unified Library for Proving Floating-Point Al- gorithms in Coq. In2011 IEEE 20th Symposium on Computer Arithmetic, pages 243–252, Tuebin- gen, Germany, July 2011. IEEE. URL:http://ieeexplore.ieee.org/document/5992132/, doi:10.1109/ARITH.2011.40. 20
-
[17]
Certifying floating-point implementations using Gappa
Florent De Dinechin, Christoph Quirin Lauter, and Guillaume Melquiond. Certifying floating-point implementations using Gappa, 2008. Version Number: 1. URL:https://arxiv.org/abs/0801.0 523,doi:10.48550/ARXIV.0801.0523
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.0801.0523 2008
-
[18]
Flean: Floating Point Numbers in Lean, 2025
Joseph McKinsey. Flean: Floating Point Numbers in Lean, 2025. URL:https://josephmckinsey .com/flean.html
work page 2025
-
[19]
Floating-point arithmetic in the Coq system.Information and Computation, 216:14–23, July 2012
Guillaume Melquiond. Floating-point arithmetic in the Coq system.Information and Computation, 216:14–23, July 2012. URL:https://linkinghub.elsevier.com/retrieve/pii/S0890540112 000739,doi:10.1016/j.ic.2011.09.005
-
[20]
LeanCert: Verified Numerical Computation in Lean 4, 2026
Alejandro Radisic. LeanCert: Verified Numerical Computation in Lean 4, 2026. URL:https: //github.com/alerad/leancert
work page 2026
-
[21]
Ejike D. Ugwuanyi, Colin T. Jones, John Velkey, and Tyler R. Josephson. Benchmarking energy calculations using formal proofs.Molecular Physics, 123(21-22):e2539421, November 2025. URL: https://www.tandfonline.com/doi/full/10.1080/00268976.2025.2539421,doi:10.108 0/00268976.2025.2539421
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.