pith. sign in

arxiv: 2605.16407 · v1 · pith:HVHB2WFJnew · submitted 2026-05-13 · 💻 cs.LO · cs.CL· cs.CR· cs.PL

Proof-Carrying Certificates for LLM Pipelines: A Trust-Boundary Architecture

Pith reviewed 2026-05-20 21:26 UTC · model grok-4.3

classification 💻 cs.LO cs.CLcs.CRcs.PL
keywords proof-carrying certificatesLLM pipelinestrust boundariesLean 4 verificationbilattice groundingHoare-style actionscompositional stabilityassurance cards
0
0 comments X

The pith

LLM pipelines can be made auditable for high-stakes use by certifying only their deterministic structured parts with Lean 4 proofs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes a trust-boundary architecture that verifies the deterministic computations around large language models rather than the models themselves. Certificates are valid when they pass a Lean 4 kernel type check and a sorry-free audit against the small trusted axiom set consisting of propext, Classical.choice, and Quot.sound, while all other assumptions are explicitly declared and tiered. The framework supplies three certificate families together with two operators that together produce a per-call Universal Assurance Card. If the approach holds, regulated applications such as clinical decision support and agentic systems with irreversible actions could proceed with explicit, auditable guarantees instead of opaque model trust.

Core claim

The central claim is that isolating the deterministic structured computations surrounding an LLM allows three local certificate families—conflict-aware bilattice grounding with an emission-gate soundness lemma, embedding sensitivity and paraphrase stability, and Hoare-style agent action—plus the operators Maximal Certifiable Residue and Compositional Stability theorem to deliver pipeline-wide assurance. Validity reduces to a Lean 4 kernel type-check together with a transitive axiom audit against the fixed trusted set, all other assumptions being partitioned by tier and declared. A compiled reference artifact covers all 22 certificate types with 17 axiom-free and the remainder depending only,

What carries the argument

The trust-boundary architecture that partitions assumptions by tier and reduces certificate validity to Lean 4 kernel type-checking plus a sorry-free transitive audit against the trusted axiom set.

If this is right

  • High-stakes deployments such as clinical decision support and regulated finance can use per-call Universal Assurance Cards.
  • The Compositional Stability theorem supplies a closed-form pipeline-wide perturbation budget from per-layer gains and margins.
  • Abstentions convert to a maximum-weight certifiable residue with an audit log of dropped claims.
  • The three families were tested on adversarially perturbed HotpotQA and prompt-injection scenarios.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar tiered-assumption partitions could be applied in other proof assistants to handle mixed human and machine oracles.
  • If the isolation succeeds, the same certificate families might extend to non-LLM agent pipelines that contain deterministic control flow.
  • The approach suggests a route to regulatory acceptance that treats the LLM as an untrusted oracle while still producing auditable outputs.

Load-bearing premise

The deterministic structured computations surrounding an LLM can be cleanly isolated from the model and verified independently using the three proposed certificate families without the isolation itself introducing new failure modes in the overall pipeline.

What would settle it

An adversarial prompt injection that succeeds against the Hoare-style agent action certificate inside the filesystem sandbox while still passing the kernel audit and axiom check would show the isolation does not hold.

Figures

Figures reproduced from arXiv: 2605.16407 by George Koomullil.

Figure 1
Figure 1. Figure 1: The hybrid architecture, inherited from [ [PITH_FULL_IMAGE:figures/full_fig_p010_1.png] view at source ↗
read the original abstract

We present a framework for verifying the deterministic structured computations surrounding a large language model rather than the model itself, extending a Lean 4 trust-boundary architecture to the generic interfaces of modern LLM pipelines. Certificate validity is a Lean 4 kernel type-check plus a sorry-free transitive axiom audit against the trusted set {propext, Classical.choice, Quot.sound}; other assumptions are declared and partitioned by tier (mathematical placeholders, cryptographic assumptions, ML/human oracles). The technical contribution comprises three local certificate families and two operators. The families are conflict-aware bilattice grounding (with an emission-gate soundness lemma), embedding sensitivity and paraphrase stability, and Hoare-style agent action. The operators are a Maximal Certifiable Residue, which turns abstention into the maximum-weight certifiable residue with audit-logged dropped claims, and a Compositional Stability theorem, which yields a closed-form pipeline-wide perturbation budget from per-layer gains and margins. The three families plus a Universal Assurance Card consolidator form the per-call deliverable for high-stakes deployments: patent and legal retrieval, regulated finance, clinical decision support, and agentic systems with irreversible side effects. A compiled Lean 4 reference artifact (Lean v4.30.0-rc2, Mathlib) covers all 22 certificate types, with 17 of 46 kernel-audited declarations axiom-free, the rest depending only on the trusted set and declared assumptions, and zero uses of sorryAx or Lean.ofReduceBool. The three families are empirically tested through four registered pilots: bilattice grounding on adversarially perturbed HotpotQA, embedding sensitivity in short- and long-form settings, and Hoare-style agent action on a filesystem sandbox with adversarial prompt injection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a trust-boundary architecture for verifying deterministic structured computations surrounding LLMs in pipelines, rather than the models themselves. It introduces three certificate families (conflict-aware bilattice grounding with emission-gate soundness lemma, embedding sensitivity and paraphrase stability, Hoare-style agent actions) and two operators (Maximal Certifiable Residue and Compositional Stability theorem). Certificate validity reduces to Lean 4 kernel type-checking plus a sorry-free transitive audit against the minimal trusted axiom set {propext, Classical.choice, Quot.sound}, with other assumptions partitioned by tier. A Lean 4 artifact covers 22 certificate types (17 of 46 declarations axiom-free), and the approach is tested in four registered pilots including adversarially perturbed HotpotQA and filesystem sandbox agent actions.

Significance. If the isolation and boundary claims hold, the work offers a concrete route to adding machine-checked assurance to high-stakes LLM deployments (patent retrieval, regulated finance, clinical support, irreversible agentic actions) without attempting to verify the LLM core. Strengths include the sorry-free Lean 4 artifact with explicit trusted-set audit, the closed-form perturbation budget derived from per-layer gains, and the tiered assumption partitioning that keeps cryptographic and oracle assumptions separate from the kernel.

major comments (2)
  1. Abstract and section on Hoare-style agent action family: the claim that deterministic computations can be extracted and certified independently rests on the assumption that LLM outputs enter the deterministic layer only through explicitly declared oracles whose effects are fully captured by the emission-gate soundness lemma and perturbation budget. The skeptic concern that parsing, state-update, or prompt-injection interfaces may introduce implicit dependencies not covered by the three families is load-bearing; without an explicit argument or lemma showing that the transitive audit extends across these boundaries, the end-to-end pipeline guarantee does not follow from the isolated fragments.
  2. Section describing the four registered pilots: the empirical tests (bilattice grounding on HotpotQA, embedding sensitivity, Hoare-style actions in sandbox) must demonstrate that no new failure modes arise at the LLM-deterministic interface. The current description leaves open whether the pilots exercise the boundary conditions presupposed by the Compositional Stability theorem; if they do not, the pilots do not yet support the central claim that the certificate families suffice for the full pipeline.
minor comments (2)
  1. The abstract states both '22 certificate types' and '17 of 46 kernel-audited declarations'; a brief clarifying sentence or table reconciling these two counts would prevent reader confusion about the artifact's scope.
  2. A summary table listing the assumption tiers, their dependencies on the trusted set, and which certificate families rely on each tier would improve readability of the assurance architecture.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on the trust-boundary architecture. The comments correctly identify that the end-to-end guarantee depends on explicit interface handling and that the pilot descriptions should more clearly demonstrate coverage of boundary conditions. We address both points below and will incorporate clarifications and an additional bridging argument in the revised manuscript.

read point-by-point responses
  1. Referee: Abstract and section on Hoare-style agent action family: the claim that deterministic computations can be extracted and certified independently rests on the assumption that LLM outputs enter the deterministic layer only through explicitly declared oracles whose effects are fully captured by the emission-gate soundness lemma and perturbation budget. The skeptic concern that parsing, state-update, or prompt-injection interfaces may introduce implicit dependencies not covered by the three families is load-bearing; without an explicit argument or lemma showing that the transitive audit extends across these boundaries, the end-to-end pipeline guarantee does not follow from the isolated fragments.

    Authors: We agree that the end-to-end claim requires an explicit bridging argument to rule out implicit dependencies at parsing, state-update, and prompt-injection points. The emission-gate soundness lemma and oracle declarations are intended to capture these interfaces, with assumptions partitioned by tier. In the revised manuscript we will insert a new lemma (placed after the Hoare-style family) that composes the transitive audit across the declared boundaries by invoking the Compositional Stability theorem and the Maximal Certifiable Residue operator. This addition will make the skeptic concern directly addressable without altering the core architecture. revision: yes

  2. Referee: Section describing the four registered pilots: the empirical tests (bilattice grounding on HotpotQA, embedding sensitivity, Hoare-style actions in sandbox) must demonstrate that no new failure modes arise at the LLM-deterministic interface. The current description leaves open whether the pilots exercise the boundary conditions presupposed by the Compositional Stability theorem; if they do not, the pilots do not yet support the central claim that the certificate families suffice for the full pipeline.

    Authors: The pilots were explicitly constructed to probe the interfaces: the adversarially perturbed HotpotQA case exercises embedding sensitivity and bilattice grounding under perturbation budgets, while the filesystem sandbox pilot includes adversarial prompt injection to test the Hoare-style action boundary. No new failure modes were observed at these points. To remove any ambiguity, the revised pilot section will add a short mapping table that links each pilot to the specific boundary conditions and the assumptions of the Compositional Stability theorem, together with a statement confirming that the observed certificates remained valid under the declared oracles. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation anchored in external Lean kernel and fixed trusted axioms

full rationale

The paper defines certificate validity explicitly as a Lean 4 kernel type-check plus a sorry-free transitive axiom audit against the fixed external trusted set {propext, Classical.choice, Quot.sound}, with 17 of 46 declarations axiom-free and the remainder depending only on declared assumptions partitioned by tier. The three certificate families (bilattice grounding with emission-gate soundness lemma, embedding sensitivity, Hoare-style actions) and the Compositional Stability theorem are implemented as Lean artifacts checked against Mathlib and the kernel rather than derived from fitted parameters, self-referential definitions, or load-bearing self-citations. No equations or theorems reduce by construction to their own inputs; the isolation of deterministic computations is presented as an engineering claim verified through the formalization and pilots, not as a mathematical identity that presupposes its conclusion. This is the normal case of a self-contained formal development against an external proof assistant.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the Lean 4 kernel plus a small trusted axiom set and tiered declared assumptions; no numerical free parameters or newly postulated physical entities are introduced.

axioms (1)
  • standard math propext, Classical.choice, Quot.sound form the trusted set for axiom audit
    Explicitly listed as the base against which all other assumptions are audited.

pith-pipeline@v0.9.0 · 5848 in / 1344 out tokens · 67222 ms · 2026-05-20T21:26:42.046354+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

64 extracted references · 64 canonical work pages · 8 internal anchors

  1. [1]

    Safereinforcementlearningviashielding

    Mohammed Alshiekh, Roderick Bloem, Rüdiger Ehlers, Bettina Könighofer, Scott Niekum, andUfukTopcu. Safereinforcementlearningviashielding. InProceedingsof the AAAI Conference on Artificial Intelligence, pages 2669–2678. AAAI Press, 2018

  2. [2]

    Angelopoulos and Stephen Bates

    Anastasios N. Angelopoulos and Stephen Bates. Conformal prediction: A gentle introduction.Foundations and Trends in Machine Learning, 16(4):494–591, 2023

  3. [3]

    Formal certification ofcode-basedcryptographicproofs

    Gilles Barthe, Benjamin Grégoire, and Santiago Zanella Béguelin. Formal certification ofcode-basedcryptographicproofs. InProceedingsofthe36thAnnualACMSIGPLAN- SIGACT Symposium on Principles of Programming Languages (POPL), pages 90–101. ACM, 2009

  4. [4]

    Nuel D. Belnap. A useful four-valued logic. In J. Michael Dunn and George Epstein, editors,Modern Uses of Multiple-Valued Logic, pages 5–37. Reidel, Dordrecht, 1977

  5. [5]

    Bohnet, V

    Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, et al. Attributed question answer- ing: Evaluation and modeling for attributed large language models.arXiv preprint arXiv:2212.08037, 2022

  6. [6]

    OxfordUniversityPress,Oxford,1stedition, 2013

    Stéphane Boucheron, Gábor Lugosi, and Pascal Massart.Concentration Inequalities: ANonasymptoticTheoryofIndependence. OxfordUniversityPress,Oxford,1stedition, 2013

  7. [7]

    Bowman, Gabor Angeli, Christopher Potts, and Christopher D

    Samuel R. Bowman, Gabor Angeli, Christopher Potts, and Christopher D. Manning. A large annotated corpus for learning natural language inference. InProceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 632–642. Association for Computational Linguistics, 2015

  8. [8]

    Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, et al

    Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, et al. Language models are few-shot learners. InAdvances in Neural Information Processing Systems 33 (NeurIPS 2020), volume 33, pages 1877–1901. Curran Associates, Inc., 2020

  9. [9]

    Lean4Lean: VerifyingatypecheckerforLean, inLean.arXivpreprint arXiv:2403.14064, 2024

    MarioCarneiro. Lean4Lean: VerifyingatypecheckerforLean, inLean.arXivpreprint arXiv:2403.14064, 2024

  10. [10]

    Daggitt, Omri Isac, Guy Katz, Verena Rieser, and Oliver Lemon

    MarcoCasadio,TanviDinkar,EkaterinaKomendantskaya,LucaArnaboldi,MatthewL. Daggitt, Omri Isac, Guy Katz, Verena Rieser, and Oliver Lemon. NLP verification: To- wardsageneralmethodologyforcertifyingrobustness.arXivpreprintarXiv:2403.10144, 2024. 70

  11. [11]

    Cohen, Elan Rosenfeld, and J

    Jeremy M. Cohen, Elan Rosenfeld, and J. Zico Kolter. Certified adversarial robustness via randomized smoothing. InProceedings of the 36th International Conference on Machine Learning (ICML), pages 1310–1320. PMLR, 2019

  12. [12]

    Mavor-Parker, Aengus Lynch, Stefan Heimersheim, and Adrià Garriga-Alonso

    Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, and Adrià Garriga-Alonso. Towards automated circuit discovery for mechanistic interpretability. InAdvances in Neural Information Processing Systems 36 (NeurIPS 2023). Curran Associates, Inc., 2023

  13. [13]

    Abstract interpretation: A unified lattice model for staticanalysisofprogramsbyconstructionorapproximationoffixpoints

    Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for staticanalysisofprogramsbyconstructionorapproximationoffixpoints. InConference Record of the Fourth ACM Symposium on Principles of Programming Languages (POPL), pages 238–252. ACM, 1977

  14. [14]

    Sparse Autoencoders Find Highly Interpretable Features in Language Models

    Hoagy Cunningham, Aidan Ewart, Logan Riggs, Robert Huben, and Lee Sharkey. Sparse autoencoders find highly interpretable features in language models.arXiv preprint arXiv:2309.08600, 2023

  15. [15]

    Cutler, Craig Disselkoen, Aaron Eline, Shaobo He, Kyle Headley, Michael Hicks,KeshaHietala,EleftheriosIoannidis,etal

    Joseph W. Cutler, Craig Disselkoen, Aaron Eline, Shaobo He, Kyle Headley, Michael Hicks,KeshaHietala,EleftheriosIoannidis,etal. Cedar: Anewlanguageforexpressive, fast, safe, and analyzable authorization.Proceedings of the ACM on Programming Languages, 8(OOPSLA1):670–697, 2024

  16. [16]

    AsurveyoftheprojectAUTOMATH

    NicolaasGovertdeBruijn. AsurveyoftheprojectAUTOMATH. InJonathanP.Seldin and J. Roger Hindley, editors,To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, pages 579–606. Academic Press, New York, 1980

  17. [17]

    The lean 4 theorem prover and programming language

    LeonardodeMouraandSebastianUllrich. TheLean4theoremproverandprogramming language. In André Platzer and Geoff Sutcliffe, editors,Automated Deduction – CADE 28, volume 12699 ofLecture Notes in Computer Science, pages 625–635, Cham, 2021. Springer. doi: 10.1007/978-3-030-79876-5_37

  18. [18]

    Dijkstra.A Discipline of Programming

    Edsger W. Dijkstra.A Discipline of Programming. Prentice-Hall, Englewood Cliffs, NJ, 1976

  19. [19]

    A mathematical framework for transformer circuits.Transformer Circuits Thread, 2021

    Nelson Elhage, Neel Nanda, Catherine Olsson, Tom Henighan, Nicholas Joseph, Ben Mann, Amanda Askell, Yuntao Bai, et al. A mathematical framework for transformer circuits.Transformer Circuits Thread, 2021

  20. [20]

    What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models.Transactions of the Association for Computational Linguistics, 8:34–48, 2020

    Allyson Ettinger. What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models.Transactions of the Association for Computational Linguistics, 8:34–48, 2020

  21. [21]

    A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics.Advances in Mathematics, 370:107239, 2020

    Tobias Fritz. A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics.Advances in Mathematics, 370:107239, 2020. 71

  22. [22]

    Retrieval-Augmented Generation for Large Language Models: A Survey

    Yunfan Gao, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, Jiawei Sun, Meng Wang, and Haofen Wang. Retrieval-augmented generation for large language models: A survey.arXiv preprint arXiv:2312.10997, 2023

  23. [23]

    AI2: Safety and robustness certification of neural networks with abstract interpretation

    Timon Gehr, Matthew Mirman, Dana Drachsler-Cohen, Petar Tsankov, Swarat Chaud- huri, and Martin Vechev. AI2: Safety and robustness certification of neural networks with abstract interpretation. In2018 IEEE Symposium on Security and Privacy (SP), pages 3–18. IEEE Computer Society, 2018

  24. [24]

    Multivaluedlogics: Auniformapproachtoreasoninginartificial intelligence.Computational Intelligence, 4(3):265–316, 1988

    MatthewL.Ginsberg. Multivaluedlogics: Auniformapproachtoreasoninginartificial intelligence.Computational Intelligence, 4(3):265–316, 1988

  25. [25]

    Notwhatyou’vesignedupfor: Compromisingreal-worldLLM-integrated applications with indirect prompt injection

    KaiGreshake,SaharAbdelnabi,ShaileshMishra,ChristophEndres,ThorstenHolz,and MarioFritz. Notwhatyou’vesignedupfor: Compromisingreal-worldLLM-integrated applications with indirect prompt injection. InProceedings of the 16th ACM Workshop on Artificial Intelligence and Security (AISec ’23), pages 79–90. ACM, 2023

  26. [26]

    Cautious reinforcementlearningwithlogicalconstraints

    Mohammadhosein Hasanbeig, Alessandro Abate, and Daniel Kroening. Cautious reinforcementlearningwithlogicalconstraints. InProceedingsofthe19thInternational Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 483–491. IFAAMAS, 2020

  27. [27]

    C. A. R. Hoare. An axiomatic basis for computer programming.Communications of the ACM, 12(10):576–580, 1969. doi: 10.1145/363235.363259

  28. [28]

    Probability inequalities for sums of bounded random variables

    Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, 1963

  29. [29]

    Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, QianglongChen,WeihuaPeng,etal.Asurveyonhallucinationinlargelanguagemodels: Principles,taxonomy,challenges,andopenquestions.arXivpreprintarXiv:2311.05232, 2023

  30. [30]

    Survey of hallucination in natural language generation.ACM Computing Surveys, 55(12):1–38,

    Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Delong Chen, Wenliang Dai, Andrea Madotto, and Pascale Fung. Survey of hallucination in natural language generation.ACM Computing Surveys, 55(12):1–38,

  31. [31]

    doi: 10.1145/3571730

  32. [32]

    Certified robustness to adversarial word substitutions

    Robin Jia, Aditi Raghunathan, Kerem Göksel, and Percy Liang. Certified robustness to adversarial word substitutions. InProceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4129–4142. Association for Computational Linguistics, 2019

  33. [33]

    Jiang, Sean Welleck, Jin Peng Zhou, Timothée Lacroix, Jiacheng Liu, Wenda Li, Mateja Jamnik, Guillaume Lample, and Yuhuai Wu

    Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Timothée Lacroix, Jiacheng Liu, Wenda Li, Mateja Jamnik, Guillaume Lample, and Yuhuai Wu. Draft, sketch, and prove: 72 Guiding formal theorem provers with informal proofs. InThe Eleventh International Conference on Learning Representations (ICLR), 2023

  34. [34]

    Cliff B. Jones. Specification and design of (parallel) programs. InInformation Processing 83, Proceedings of the IFIP 9th World Computer Congress, pages 321–332. North-Holland, 1983

  35. [35]

    Weakest precondition reasoning for expected runtimes of randomized algo- rithms.Journal of the ACM, 65(5):30:1–30:68, 2018

    Benjamin Lucien Kaminski, Joost-Pieter Katoen, Christoph Matheja, and Federico Olmedo. Weakest precondition reasoning for expected runtimes of randomized algo- rithms.Journal of the ACM, 65(5):30:1–30:68, 2018

  36. [36]

    Dense Passage Retrieval for Open-Domain Question Answering

    Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. Dense passage retrieval for open-domain ques- tionanswering. InProceedingsofthe2020ConferenceonEmpiricalMethodsinNatural Language Processing (EMNLP), pages 6769–6781. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.emnlp...

  37. [37]

    Negated and misprimed probes for pretrained language models: Birds can talk, but cannot fly

    Nora Kassner and Hinrich Schütze. Negated and misprimed probes for pretrained language models: Birds can talk, but cannot fly. InProceedings of the 58th Annual MeetingoftheAssociationforComputationalLinguistics,pages7811–7818.Association for Computational Linguistics, 2020

  38. [38]

    Formally Verified Patent Analysis via Dependent Type Theory: Machine-Checkable Certificates from a Hybrid AI + Lean 4 Pipeline

    George Koomullil. Formally verified patent analysis via dependent type theory: Machine-checkable certificates from a hybrid AI + Lean 4 pipeline.arXiv preprint arXiv:2604.18882, April 2026

  39. [39]

    Semantics of probabilistic programs.Journal of Computer and System Sciences, 22(3):328–350, 1981

    Dexter Kozen. Semantics of probabilistic programs.Journal of Computer and System Sciences, 22(3):328–350, 1981. doi: 10.1016/0022-0000(81)90036-2

  40. [40]

    Measuring Faithfulness in Chain-of-Thought Reasoning

    Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson Denison, Danny Hernandez, Dustin Li, Esin Durmus, et al. Measuring faithfulness in chain-of- thought reasoning.arXiv preprint arXiv:2307.13702, 2023

  41. [41]

    Retrieval-augmented generation for knowledge-intensive NLP tasks

    Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive NLP tasks. InAdvances in Neural Information Processing Systems 33 (NeurIPS 2020), volume 33, pages 9459–9474. Cu...

  42. [42]

    Let's Verify Step by Step

    Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, et al. Let’s verify step by step.arXiv preprint arXiv:2305.20050, 2023

  43. [43]

    Potsawee Manakul, Adian Liusie, and Mark J. F. Gales. SelfCheckGPT: Zero-resource black-box hallucination detection for generative large language models. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 73 (EMNLP), pages 9004–9017. Association for Computational Linguistics, 2023. doi: 10.18653/v1/2023.emnlp-main.557

  44. [44]

    Proceedings of the 2023

    Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, and Hannaneh Hajishirzi. FActScore: Fine-grained atomic evaluation of factual precision in long form text generation. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 12076–12100. Association f...

  45. [45]

    George C. Necula. Proof-carrying code. InProceedings of the 24th ACM SIGPLAN- SIGACTSymposiumonPrinciplesofProgrammingLanguages(POPL),pages106–119. ACM, 1997

  46. [46]

    Peter W. O’Hearn. Resources, concurrency, and local reasoning.Theoretical Computer Science, 375(1–3):271–307, 2007. doi: 10.1016/j.tcs.2006.12.035

  47. [47]

    Zoom in: An introduction to circuits.Distill, 5(3), 2020

    Chris Olah, Nick Cammarata, Ludwig Schubert, Gabriel Goh, Michael Petrov, and Shan Carter. Zoom in: An introduction to circuits.Distill, 5(3), 2020

  48. [48]

    Ignore Previous Prompt: Attack Techniques For Language Models

    Fábio Perez and Ian Ribeiro. Ignore previous prompt: Attack techniques for language models.arXiv preprint arXiv:2211.09527, 2022

  49. [49]

    Conformal language modeling.arXiv preprint arXiv:2306.10193, 2023

    Victor Quach, Adam Fisch, Tal Schuster, Adam Yala, Jae Ho Sohn, Tommi S. Jaakkola, and Regina Barzilay. Conformal language modeling.arXiv preprint arXiv:2306.10193, 2023

  50. [50]

    Sentence-BERT: Sentence embeddings using Siamese BERT-networks

    Nils Reimers and Iryna Gurevych. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. InProceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992. Association for Computational Linguistics, 2019

  51. [51]

    Rondon, Ming Kawaguchi, and Ranjit Jhala

    Patrick M. Rondon, Ming Kawaguchi, and Ranjit Jhala. Liquid types. InProceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 159–169. ACM, 2008

  52. [52]

    Toolformer: Language models can teach themselves to use tools

    Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. Toolformer: Language models can teach themselves to use tools. InAdvances in Neural Information Processing Systems 36 (NeurIPS 2023), volume 36. Curran Associates, Inc., 2023

  53. [53]

    Dependent types and multi-monadic effects in F*

    Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat- Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, et al. Dependent types and multi-monadic effects in F*. InProceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 256–270. ACM, 2016. 74

  54. [54]

    TheMathlib4mathematicslibraryforLean4,2025

    TheMathlibCommunity. TheMathlib4mathematicslibraryforLean4,2025. Software repository, leanprover-community/mathlib4

  55. [55]

    Miles Turpin, Julian Michael, Ethan Perez, and Samuel R. Bowman. Language models don’talwayssaywhattheythink: Unfaithfulexplanationsinchain-of-thoughtprompting. InAdvances in Neural Information Processing Systems 36 (NeurIPS 2023), volume 36. Curran Associates, Inc., 2023

  56. [56]

    Seidel, Ranjit Jhala, Dimitrios Vytiniotis, and Simon Peyton-Jones

    Niki Vazou, Eric L. Seidel, Ranjit Jhala, Dimitrios Vytiniotis, and Simon Peyton-Jones. Refinement types for Haskell. InProceedings of the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP), pages 269–282. ACM, 2014

  57. [57]

    Lipschitz regularity of deep neural networks: Analysis and efficient estimation

    Aladin Virmaux and Kevin Scaman. Lipschitz regularity of deep neural networks: Analysis and efficient estimation. InAdvances in Neural Information Processing Systems 31 (NeurIPS 2018). Curran Associates, Inc., 2018

  58. [58]

    Springer, New York, 1st edition, 2005

    Vladimir Vovk, Alexander Gammerman, and Glenn Shafer.Algorithmic Learning in a Random World. Springer, New York, 1st edition, 2005

  59. [59]

    Frontiers Comput

    Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, et al. A survey on large language model based autonomous agents. FrontiersofComputerScience,18(6):186345,2024. doi: 10.1007/s11704-024-40231-1

  60. [60]

    Le, Ed H

    Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc V. Le, Ed H. Chi, Sharan Narang, Aakanksha Chowdhery, and Denny Zhou. Self-consistency improves chain of thought reasoning in language models. InProceedings of the 11th International Conference on Learning Representations (ICLR), 2023

  61. [61]

    Chain-of-thoughtpromptingelicitsreasoning in large language models

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, EdH.Chi,QuocV.Le,andDennyZhou. Chain-of-thoughtpromptingelicitsreasoning in large language models. InAdvances in Neural Information Processing Systems 35 (NeurIPS 2022), volume 35, pages 24824–24837. Curran Associates, Inc., 2022

  62. [62]

    Cohen, Ruslan Salakhutdinov, and Christopher D

    Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, and Christopher D. Manning. HotpotQA: A dataset for diverse, explainable multi-hop question answering. InProceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2369–2380. Association for Computational Linguistics, 2018

  63. [63]

    ReAct: Synergizing reasoning and acting in language models

    Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. ReAct: Synergizing reasoning and acting in language models. In Proceedings of the 11th International Conference on Learning Representations (ICLR), 2023

  64. [64]

    Universal and Transferable Adversarial Attacks on Aligned Language Models

    Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, and Matt Fredrikson. Universal and transferable adversarial attacks on aligned language models. arXiv preprint arXiv:2307.15043, 2023. 75 A Notation Symbols used in the paper, in roughly the order they appear. Symbol Meaning Trust boundary and certificate validity ΩTrusted axiom set{prope...