Provably Auditable and Safe LLM Agents from Human-Authored Ontologies

Aaron Sterling

arxiv: 2606.04903 · v1 · pith:3555BZEOnew · submitted 2026-06-03 · 💻 cs.LO · cs.AI· cs.MA· cs.PL

Provably Auditable and Safe LLM Agents from Human-Authored Ontologies

Aaron Sterling This is my paper

Pith reviewed 2026-06-28 03:47 UTC · model grok-4.3

classification 💻 cs.LO cs.AIcs.MAcs.PL

keywords LLM agentstyped lambda calculusauditabilityontologiesBasic Formal OntologyAgentic Reduxhealthcare compliancesecurity vulnerability disclosure

0 comments

The pith

Typed lambda calculus proves that Agentic Redux LLM agent executions are semantically correct with append-only audit ledgers on suitable domains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an LLM agent architecture for use with nontrivial problem domains that require linear auditability. It proves using the typed lambda calculus that on appropriate domains the executions are semantically guaranteed to be correct with decisions in an append-only ledger. It also introduces a methodology for creating agent frameworks by having human experts ontologize the domain with Basic Formal Ontology and derive roles for agents and humans. Example domains include healthcare billing compliance and security vulnerability disclosure. This setup aims to ensure that agents follow assigned roles without deviation while maintaining full traceability.

Core claim

Agentic Redux is an architecture in which LLM agents operate according to roles derived from human-authored ontologies in Basic Formal Ontology. Using the typed lambda calculus, the paper proves that when run on domains that can be appropriately formalized, the executions of these agents are semantically guaranteed to be correct, and all decisions are recorded in an append-only ledger. This provides provable auditability and safety for applications in healthcare billing compliance and security vulnerability disclosure.

What carries the argument

Agentic Redux architecture combined with Ontology-First Agent Design, where typed lambda calculus supplies semantic guarantees for correctness and an append-only ledger supplies linear auditability.

If this is right

In healthcare billing compliance, agent decisions can be verified against formal rules with complete traceability.
In security vulnerability disclosure, every step of the process remains recorded and correct by construction.
Human experts can define agent roles through ontology rather than direct programming of rules.
The architecture separates domain formalization from LLM execution to enforce role adherence.
The same proof technique applies to any domain that admits a suitable typed lambda calculus model.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could apply to other regulated domains such as finance if they admit similar formalizations.
The append-only ledger structure may integrate with existing compliance logging systems.
Testing the approach on additional domains would check whether the lambda calculus proofs scale beyond the two examples.
The separation of ontology creation from agent execution may limit certain forms of unintended behavior.

Load-bearing premise

Problem domains can be formalized in the typed lambda calculus such that LLM agent executions stay inside the model and follow the ontology-derived roles without deviation.

What would settle it

An execution trace from one of the domains in which an agent produces a decision outside the ontology-derived role or omits a step from the ledger would falsify the semantic guarantee.

Figures

Figures reproduced from arXiv: 2606.04903 by Aaron Sterling.

**Figure 1.** Figure 1: The Agentic Redux architecture. Sub-agents receive isolated state slices and propose actions to a meta-agent, which adjudicates proposals against declared invariants before committing state transitions. Code is available at https://github.com/Thistleseeds/ agentic-redux 1.3 Agentic Redux 1.3.1 Agentic Redux Architecture This paper introduces the agent architecture Agentic Redux. It is inspired by the fron… view at source ↗

**Figure 2.** Figure 2: Ontology-First Agent Design. The ontologies referenced in this paper were pro [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Role Derivation procedure. to agents of the classic Confinement Problem[18], restricts both what an agent can observe and what an agent can directly modify: an agent sees only the minimal slice of state needed to perform its work, and cannot directly modify state at all, since all state changes are mediated by the meta-agent. 2.3 Derivation of System Invariants Once we have roles and slices of state that w… view at source ↗

**Figure 4.** Figure 4: Deriving invariants, then assigning the cross-cutting invariants to the Agentic Redux [PITH_FULL_IMAGE:figures/full_fig_p017_4.png] view at source ↗

**Figure 5.** Figure 5: Agentic Redux expanded with a Counselor Queue for the UDT Compliance probem [PITH_FULL_IMAGE:figures/full_fig_p018_5.png] view at source ↗

**Figure 6.** Figure 6: The independent continuant Vulnerability and its relations, in the Security Vulnera [PITH_FULL_IMAGE:figures/full_fig_p019_6.png] view at source ↗

**Figure 7.** Figure 7: A comparison between the happy paths of the independent continuant’s journey in [PITH_FULL_IMAGE:figures/full_fig_p020_7.png] view at source ↗

**Figure 8.** Figure 8: The Agentic Redux architecture for the Security Vulnerability Disclosure ontology. [PITH_FULL_IMAGE:figures/full_fig_p021_8.png] view at source ↗

read the original abstract

We introduce the LLM agent architecture Agentic Redux, intended for use with nontrivial problem domains that require linear auditability. Using the typed lambda calculus, we prove that, run on appropriate domains, Agentic Redux executions are semantically guaranteed to be correct, with all decisions recorded in an append-only ledger. We present two production-grade appropriate domains, in healthcare billing compliance, and security vulnerability disclosure. Working code for Agentic Redux run on both domains is available in a supporting code repository. We also introduce Ontology-First Agent Design, a methodology for creation of agent frameworks on a problem domain, in which a human expert ontologizes the problem domain with Basic Formal Ontology, and then assigns an LLM to derive roles that agents and humans-in-the-loop can fill, in order to work the problems in the domain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper asserts a typed lambda calculus proof for correct, auditable LLM agents via Ontology-First design, but the abstract gives no derivation details and the runtime enforcement mechanism is missing.

read the letter

The central claim is that Agentic Redux, built from human-authored BFO ontologies, lets you prove LLM agent executions are semantically correct on suitable domains with an append-only ledger. That is the one thing a colleague needs to know up front.

The new piece is the specific combination of Ontology-First Agent Design with a formal model. A human expert builds the ontology, the LLM derives roles from it, and the whole thing is supposed to run inside typed lambda calculus so every decision is recorded and correct by construction. They name two domains—healthcare billing compliance and security vulnerability disclosure—and say working code exists for both. That is more concrete than most LLM safety papers.

The soft spot is enforcement. The proof only covers executions that stay inside the model. Nothing in the abstract describes reduction rules, a type checker at runtime, or any guard that would reject an LLM output that is not a well-typed term. If the agent can produce out-of-model actions, the formal guarantee does not apply to the actual system. The domains may be formalizable, but the paper does not show how the LLM is prevented from leaving the model.

This is for people who need auditable agents in regulated settings and are willing to invest in heavy domain modeling. A reader who already works with formal methods and BFO will get the most from it. The citation pattern looks light on prior formal LLM work, but that is secondary to whether the proof actually closes the gap between model and implementation.

I would send it to peer review so referees can see the full derivation and any runtime checks. The idea is worth checking even if the current write-up leaves the key enforcement question open.

Referee Report

3 major / 1 minor

Summary. The paper introduces Agentic Redux, an LLM agent architecture for nontrivial domains requiring linear auditability. It claims that, using the typed lambda calculus, executions on appropriate domains are semantically guaranteed to be correct with all decisions recorded in an append-only ledger. It presents Ontology-First Agent Design, in which a human expert uses Basic Formal Ontology (BFO) to ontologize the domain and an LLM derives roles for agents and humans-in-the-loop. Two production-grade domains are given (healthcare billing compliance and security vulnerability disclosure) along with working code.

Significance. If the claimed proof and enforcement mechanisms were fully specified and verified, the work would offer a notable contribution to verifiable LLM agents by combining human-authored ontologies with formal semantics for auditability in high-stakes domains. The availability of working code on concrete domains is a positive element.

major comments (3)

[Abstract] Abstract: The central claim that 'using the typed lambda calculus, we prove that... Agentic Redux executions are semantically guaranteed to be correct' is asserted without any derivation, type system definition, reduction rules, or formal model of how LLM outputs are constrained to well-typed terms. This is load-bearing for the 'provably' guarantee.
The description of Ontology-First Agent Design and the two domains does not provide an explicit mechanism (e.g., runtime type checker, reduction to lambda terms, or rejection of out-of-model outputs) that would ensure every LLM-generated action remains a valid typed term whose reduction is recorded in the ledger. Without this, the formal model describes an ideal agent rather than the actual LLM-driven system.
The weakest assumption—that problem domains can be suitably formalized in typed lambda calculus so that LLM executions remain inside the model—is stated but not supported by any concrete type assignment, ontology-to-lambda mapping, or example derivation for the healthcare or security domains.

minor comments (1)

[Abstract] The abstract and introduction use the term 'Agentic Redux' and 'Ontology-First Agent Design' without an initial definition or citation to prior work on BFO or lambda calculus in agent systems.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below and commit to revisions that will strengthen the formal content of the manuscript.

read point-by-point responses

Referee: [Abstract] Abstract: The central claim that 'using the typed lambda calculus, we prove that... Agentic Redux executions are semantically guaranteed to be correct' is asserted without any derivation, type system definition, reduction rules, or formal model of how LLM outputs are constrained to well-typed terms. This is load-bearing for the 'provably' guarantee.

Authors: We agree that the abstract states the central claim without accompanying formal details. The current manuscript does not contain the type system definition, reduction rules, or proof derivation. We will revise by adding a new section (or expanding Section 3) that defines the typed lambda calculus, presents the reduction rules, describes the formal model for constraining LLM outputs to well-typed terms, and includes a proof sketch of semantic correctness. revision: yes
Referee: The description of Ontology-First Agent Design and the two domains does not provide an explicit mechanism (e.g., runtime type checker, reduction to lambda terms, or rejection of out-of-model outputs) that would ensure every LLM-generated action remains a valid typed term whose reduction is recorded in the ledger. Without this, the formal model describes an ideal agent rather than the actual LLM-driven system.

Authors: This observation is correct. The manuscript describes the overall architecture and the append-only ledger but does not specify the runtime enforcement mechanism. We will add an explicit subsection detailing the runtime type checker, the process by which LLM outputs are reduced to lambda terms or rejected if they fall outside the model, and how valid reductions are recorded in the ledger. revision: yes
Referee: The weakest assumption—that problem domains can be suitably formalized in typed lambda calculus so that LLM executions remain inside the model—is stated but not supported by any concrete type assignment, ontology-to-lambda mapping, or example derivation for the healthcare or security domains.

Authors: We acknowledge that the manuscript states the assumption at a high level without concrete supporting examples. We will add an appendix containing explicit type assignments, ontology-to-lambda mappings, and example derivations for both the healthcare billing compliance and security vulnerability disclosure domains. revision: yes

Circularity Check

0 steps flagged

No significant circularity; formal proof relies on external typed lambda calculus

full rationale

The paper's central claim is a proof that Agentic Redux executions are semantically guaranteed to be correct when run on appropriate domains, using the typed lambda calculus as the formal system. This is a standard external mathematical framework, not derived from or equivalent to the paper's own inputs, fitted parameters, or self-referential definitions. The Ontology-First methodology involves human-authored BFO ontologies with LLM role derivation, but the proof is conditional on the stated assumption that domains can be formalized such that executions remain within the model; this assumption is not smuggled in via self-citation or reduced by construction to the result itself. No load-bearing self-citations, ansatzes, or renamings of known results appear in the derivation chain. The result is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The paper relies on standard formal methods and introduces new architectural concepts without additional fitted parameters visible in the abstract.

axioms (2)

standard math Typed lambda calculus can model and prove semantic correctness of agent executions on appropriate domains.
Invoked as the basis for the proof of correctness in the abstract.
domain assumption Basic Formal Ontology can be used by human experts to structure problem domains for subsequent LLM role derivation.
Central to the Ontology-First Agent Design methodology described.

invented entities (2)

Agentic Redux no independent evidence
purpose: LLM agent architecture providing semantic correctness guarantees and append-only audit ledger.
Newly introduced architecture in the paper.
Ontology-First Agent Design no independent evidence
purpose: Methodology for creating agent frameworks by human ontology authoring followed by LLM role derivation.
New methodology proposed in the paper.

pith-pipeline@v0.9.1-grok · 5662 in / 1409 out tokens · 38225 ms · 2026-06-28T03:47:44.288204+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

39 extracted references · 9 canonical work pages

[1]

Redux: A JS library for predictable and maintainable global state man- agement, 2015–2026

Dan Abramov. Redux: A JS library for predictable and maintainable global state man- agement, 2015–2026. URLhttps://redux.js.org/

2015
[2]

Vera: A language designed for machines to write, 2026

Alasdair Allan. Vera: A language designed for machines to write, 2026. URLhttps: //veralang.dev/

2026
[3]

Spear.Building Ontologies with Basic Formal Ontology

Robert Arp, Barry Smith, and Andrew D. Spear.Building Ontologies with Basic Formal Ontology. MIT Press, 2015. doi:10.7551/mitpress/9780262527811.001.0001

work page doi:10.7551/mitpress/9780262527811.001.0001 2015
[4]

Llms4ol 2025 overview: The 2nd large language models for ontology learning challenge

Hamed Babaei Giglou, Jennifer D’Souza, Nandana Mihindukulasooriya, and S¨ oren Auer. Llms4ol 2025 overview: The 2nd large language models for ontology learning challenge. Open Conference Proceedings, 6, Oct. 2025. doi:10.52825/ocp.v6i.2913. URLhttps:// www.tib-op.org/ojs/index.php/ocp/article/view/2913

work page doi:10.52825/ocp.v6i.2913 2025
[5]

Franklin, Ali Ghodsi, Joseph M

Peter Bailis, Alan Fekete, Michael J. Franklin, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. Coordination avoidance in database systems. InProceedings of the VLDB Endowment, 2014. doi:10.14778/2735508.2735509

work page doi:10.14778/2735508.2735509 2014
[6]

Bakker, Daan L

Roos M. Bakker, Daan L. Di Scala, Maaike H. T. de Boer, and Stephan A. Raaijmakers. Ontology learning with LLMs: A benchmark study on axiom identification, 2025. URL https://arxiv.org/abs/2512.05594

arXiv 2025
[7]

A critique of ANSI SQL isolation levels.SIGMOD Rec., 24(2):1–10, May 1995

Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O’Neil, and Patrick O’Neil. A critique of ANSI SQL isolation levels.SIGMOD Rec., 24(2):1–10, May 1995. ISSN 0163-

1995
[8]

URLhttps://doi.org/10.1145/568271.223785

doi:10.1145/568271.223785. URLhttps://doi.org/10.1145/568271.223785

work page doi:10.1145/568271.223785
[9]

Blackhat LLMs, 2026

Nicholas Carlini. Blackhat LLMs, 2026. URLhttps://youtu.be/1sd26pWhfmg

2026
[10]

Evaluating and mitigating the growing risk of LLM- discovered 0-days, 2026

Nicholas Carlini, Keane Lucas, Evyatar Ben Asher, Newton Cheng, Hasnain Lakhani, David Forsythe, and Kyla Guru. Evaluating and mitigating the growing risk of LLM- discovered 0-days, 2026. URLhttps://red.anthropic.com/2026/zero-days/

2026
[11]

Risk agents

CashApp/Block. Risk agents. Minutes of meeting of ROOST Coop working group, 2026. URLhttps://github.com/roostorg/coop/discussions/171

2026
[12]

Scaling managed agents: Decoupling the brain from the hands,

Anthropic Engineering. Scaling managed agents: Decoupling the brain from the hands,
[13]

URLhttps://www.anthropic.com/engineering/managed-agents
[14]

Basic Formal Ontology, 2002–2026

Barry Smith et al. Basic Formal Ontology, 2002–2026. URLhttps:// basic-formal-ontology.org/

2002
[15]

Basic Formal Ontology 2.0, 2020

Barry Smith et al. Basic Formal Ontology 2.0, 2020. URLhttps://raw. githubusercontent.com/BFO-ontology/BFO/master/docs/bfo2-reference/ BFO2-Reference.pdf. Direct link to pdf

2020
[16]

Basic Formal Ontology ISO standard, 2021

Barry Smith et al. Basic Formal Ontology ISO standard, 2021. URLhttps://standards. iso.org/iso-iec/21838/-2/ed-1/en/. 2021

2021
[17]

LLMs4Life: Large language models for ontology learning in life sciences, 2024

Nadeen Fathallah, Steffen Staab, and Alsayed Algergawy. LLMs4Life: Large language models for ontology learning in life sciences, 2024. URLhttps://arxiv.org/abs/2412. 02035

2024
[18]

Ontology.Encyclopedia of Database Systems, 2009

Tom Gruber. Ontology.Encyclopedia of Database Systems, 2009. doi:10.1007/978-0-387- 39940-9 1318. 22

work page doi:10.1007/978-0-387- 2009
[19]

O’Reilly Media, Sebastopol, CA, October 2025

Valliappa Lakshmanan and Hannes Hapke.Generative AI Design Patterns. O’Reilly Media, Sebastopol, CA, October 2025. ISBN 9798341622654. URLhttps://www.oreilly.com/ library/view/generative-ai-design/9798341622654/

arXiv 2025
[20]

Butler W. Lampson. A note on the confinement problem.Commun. ACM, 16(10):613–615, October 1973. ISSN 0001-0782. doi:10.1145/362375.362389. URLhttps://doi.org/10. 1145/362375.362389

work page doi:10.1145/362375.362389 1973
[21]

On- tology generation using large language models

Anna Sofia Lippolis, Mohammad Javad Saeedizade, Robin Keskis¨ arkk¨ a, Sara Zuppiroli, Miguel Ceriani, Aldo Gangemi, Eva Blomqvist, and Andrea Giovanni Nuzzolese. On- tology generation using large language models. InThe Semantic Web: 22nd Eu- ropean Semantic Web Conference, ESWC 2025, Portoroz, Slovenia, June 1–5, 2025, Proceedings, Part I, pages 321–341,...

work page doi:10.1007/978-3-031-94575-5 2025
[22]

Claude code found a linux vulnerability hidden for 23 years, 2026

Max Lynch. Claude code found a linux vulnerability hidden for 23 years, 2026. URL https://mtlynch.io/claude-code-found-linux-vulnerability/

2026
[23]

Do LLMs really adapt to domains? an ontology learning perspective

Huu Tan Mai, Cuong Xuan Chu, and Heiko Paulheim. Do LLMs really adapt to domains? an ontology learning perspective. In Gianluca Demartini, Katja Hose, Maribel Acosta, Matteo Palmonari, Gong Cheng, Hala Skaf-Molli, Nicolas Ferranti, Daniel Hern´ andez, and Aidan Hogan, editors,The Semantic Web – ISWC 2024, pages 126–143, Cham, 2025. Springer Nature Switzer...

work page doi:10.1007/978-3-031-77844-5 2024
[24]

Microsoft agent framework, 2026

Microsoft. Microsoft agent framework, 2026. URLhttps://github.com/microsoft/ agent-framework

2026
[25]

Pierce.Types and Programming Languages

Benjamin C. Pierce.Types and Programming Languages. MIT Press, 2002. URLhttps: //mitpress.mit.edu/9780262162098/types-and-programming-languages/

arXiv 2002
[26]

Osprey, 2026

ROOST. Osprey, 2026. URLhttps://github.com/roostorg/osprey

2026
[27]

Awesome LLM apps, 2026

Shubham Saboo. Awesome LLM apps, 2026. URLhttps://github.com/Shubhamsaboo/ awesome-llm-apps

2026
[28]

Different coding agent strategies are better at different jobs, 2026

Aaron Sterling. Different coding agent strategies are better at different jobs, 2026. In preparation

2026
[29]

AHCCCS covered behavioral health services guide, 2025

Arizona Health Care Cost Containment System. AHCCCS covered behavioral health services guide, 2025. URLhttps://www.azahcccs.gov/PlansProviders/Downloads/ MedicalCodingResources/AHCCCSCoveredBHServicesManual.pdf

2025
[30]

Wright and Matthias Felleisen

Andrew K. Wright and Matthias Felleisen. A syntactic approach to type soundness.In- formation and Computation, 115(1):38–94, 1994. doi:10.1006/inco.1994.1093

work page doi:10.1006/inco.1994.1093 1994
[31]

a list of state components a process reads or writes

Xiao Zhang, Huiyuan Lai, Qianru Meng, and Johan Bos. OntoURL: A benchmark for evaluating large language models on symbolic ontological understanding, reasoning and learning, 2025. URLhttps://arxiv.org/abs/2505.11031. A The Agentic Redux Calculus Note about the proofs in this Appendix:The text in the main body of the paper is 99% text I wrote myself. The t...

arXiv 2025
[32]

consultωto select a roler∈Roles D (thearbitration choice)
[33]

compute the sliceσ=project D(s.domain, r)
[34]

invokeα r(σ, ω) to obtain an actiona
[35]

At most one (r, a) pair is in flight to the commit discipline at any time

return (r, a). At most one (r, a) pair is in flight to the commit discipline at any time. Simultaneity among real-world agents is absorbed into the arbitration choice in Step 1. Definition A.35(Agentic Redux commit discipline).TheAgentic Redux commit discipline κAgentic Redux is the function that, on input (s, a, t, ω):
[36]

computes thecandidate domain stateˆs≜applyMutation D(s.domain, a, t)
[37]

walksinvariants D in list order, evaluating each invariant on ˆs
[38]

halts on the first invariantι k ∈invariants D whose evaluation on ˆsdoes not returnpass: •ifι k(ˆs) =reject(m), appends tos.audita single entry with tagrejected, actiona, timestampt, and witness slotSome(id(ι k), m), and returns the framework state with domainandqueuecomponents unchanged; •ifι k(ˆs) =escalate(m), appends tos.audita single entry with tages...
[39]

what happens if two agents act at once?

if every invariant returnspasson ˆs, appends tos.audita single entry with tagapproved, actiona, timestampt, and witness slotNone, and returns the framework state withdomain component ˆsandqueuecomponent unchanged. Each branch appends exactly one audit entry; the approve and reject branches leave the queue unchanged, and the escalate branch appends exactly...

[1] [1]

Redux: A JS library for predictable and maintainable global state man- agement, 2015–2026

Dan Abramov. Redux: A JS library for predictable and maintainable global state man- agement, 2015–2026. URLhttps://redux.js.org/

2015

[2] [2]

Vera: A language designed for machines to write, 2026

Alasdair Allan. Vera: A language designed for machines to write, 2026. URLhttps: //veralang.dev/

2026

[3] [3]

Spear.Building Ontologies with Basic Formal Ontology

Robert Arp, Barry Smith, and Andrew D. Spear.Building Ontologies with Basic Formal Ontology. MIT Press, 2015. doi:10.7551/mitpress/9780262527811.001.0001

work page doi:10.7551/mitpress/9780262527811.001.0001 2015

[4] [4]

Llms4ol 2025 overview: The 2nd large language models for ontology learning challenge

Hamed Babaei Giglou, Jennifer D’Souza, Nandana Mihindukulasooriya, and S¨ oren Auer. Llms4ol 2025 overview: The 2nd large language models for ontology learning challenge. Open Conference Proceedings, 6, Oct. 2025. doi:10.52825/ocp.v6i.2913. URLhttps:// www.tib-op.org/ojs/index.php/ocp/article/view/2913

work page doi:10.52825/ocp.v6i.2913 2025

[5] [5]

Franklin, Ali Ghodsi, Joseph M

Peter Bailis, Alan Fekete, Michael J. Franklin, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. Coordination avoidance in database systems. InProceedings of the VLDB Endowment, 2014. doi:10.14778/2735508.2735509

work page doi:10.14778/2735508.2735509 2014

[6] [6]

Bakker, Daan L

Roos M. Bakker, Daan L. Di Scala, Maaike H. T. de Boer, and Stephan A. Raaijmakers. Ontology learning with LLMs: A benchmark study on axiom identification, 2025. URL https://arxiv.org/abs/2512.05594

arXiv 2025

[7] [7]

A critique of ANSI SQL isolation levels.SIGMOD Rec., 24(2):1–10, May 1995

Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O’Neil, and Patrick O’Neil. A critique of ANSI SQL isolation levels.SIGMOD Rec., 24(2):1–10, May 1995. ISSN 0163-

1995

[8] [8]

URLhttps://doi.org/10.1145/568271.223785

doi:10.1145/568271.223785. URLhttps://doi.org/10.1145/568271.223785

work page doi:10.1145/568271.223785

[9] [9]

Blackhat LLMs, 2026

Nicholas Carlini. Blackhat LLMs, 2026. URLhttps://youtu.be/1sd26pWhfmg

2026

[10] [10]

Evaluating and mitigating the growing risk of LLM- discovered 0-days, 2026

Nicholas Carlini, Keane Lucas, Evyatar Ben Asher, Newton Cheng, Hasnain Lakhani, David Forsythe, and Kyla Guru. Evaluating and mitigating the growing risk of LLM- discovered 0-days, 2026. URLhttps://red.anthropic.com/2026/zero-days/

2026

[11] [11]

Risk agents

CashApp/Block. Risk agents. Minutes of meeting of ROOST Coop working group, 2026. URLhttps://github.com/roostorg/coop/discussions/171

2026

[12] [12]

Scaling managed agents: Decoupling the brain from the hands,

Anthropic Engineering. Scaling managed agents: Decoupling the brain from the hands,

[13] [13]

URLhttps://www.anthropic.com/engineering/managed-agents

[14] [14]

Basic Formal Ontology, 2002–2026

Barry Smith et al. Basic Formal Ontology, 2002–2026. URLhttps:// basic-formal-ontology.org/

2002

[15] [15]

Basic Formal Ontology 2.0, 2020

Barry Smith et al. Basic Formal Ontology 2.0, 2020. URLhttps://raw. githubusercontent.com/BFO-ontology/BFO/master/docs/bfo2-reference/ BFO2-Reference.pdf. Direct link to pdf

2020

[16] [16]

Basic Formal Ontology ISO standard, 2021

Barry Smith et al. Basic Formal Ontology ISO standard, 2021. URLhttps://standards. iso.org/iso-iec/21838/-2/ed-1/en/. 2021

2021

[17] [17]

LLMs4Life: Large language models for ontology learning in life sciences, 2024

Nadeen Fathallah, Steffen Staab, and Alsayed Algergawy. LLMs4Life: Large language models for ontology learning in life sciences, 2024. URLhttps://arxiv.org/abs/2412. 02035

2024

[18] [18]

Ontology.Encyclopedia of Database Systems, 2009

Tom Gruber. Ontology.Encyclopedia of Database Systems, 2009. doi:10.1007/978-0-387- 39940-9 1318. 22

work page doi:10.1007/978-0-387- 2009

[19] [19]

O’Reilly Media, Sebastopol, CA, October 2025

Valliappa Lakshmanan and Hannes Hapke.Generative AI Design Patterns. O’Reilly Media, Sebastopol, CA, October 2025. ISBN 9798341622654. URLhttps://www.oreilly.com/ library/view/generative-ai-design/9798341622654/

arXiv 2025

[20] [20]

Butler W. Lampson. A note on the confinement problem.Commun. ACM, 16(10):613–615, October 1973. ISSN 0001-0782. doi:10.1145/362375.362389. URLhttps://doi.org/10. 1145/362375.362389

work page doi:10.1145/362375.362389 1973

[21] [21]

On- tology generation using large language models

Anna Sofia Lippolis, Mohammad Javad Saeedizade, Robin Keskis¨ arkk¨ a, Sara Zuppiroli, Miguel Ceriani, Aldo Gangemi, Eva Blomqvist, and Andrea Giovanni Nuzzolese. On- tology generation using large language models. InThe Semantic Web: 22nd Eu- ropean Semantic Web Conference, ESWC 2025, Portoroz, Slovenia, June 1–5, 2025, Proceedings, Part I, pages 321–341,...

work page doi:10.1007/978-3-031-94575-5 2025

[22] [22]

Claude code found a linux vulnerability hidden for 23 years, 2026

Max Lynch. Claude code found a linux vulnerability hidden for 23 years, 2026. URL https://mtlynch.io/claude-code-found-linux-vulnerability/

2026

[23] [23]

Do LLMs really adapt to domains? an ontology learning perspective

Huu Tan Mai, Cuong Xuan Chu, and Heiko Paulheim. Do LLMs really adapt to domains? an ontology learning perspective. In Gianluca Demartini, Katja Hose, Maribel Acosta, Matteo Palmonari, Gong Cheng, Hala Skaf-Molli, Nicolas Ferranti, Daniel Hern´ andez, and Aidan Hogan, editors,The Semantic Web – ISWC 2024, pages 126–143, Cham, 2025. Springer Nature Switzer...

work page doi:10.1007/978-3-031-77844-5 2024

[24] [24]

Microsoft agent framework, 2026

Microsoft. Microsoft agent framework, 2026. URLhttps://github.com/microsoft/ agent-framework

2026

[25] [25]

Pierce.Types and Programming Languages

Benjamin C. Pierce.Types and Programming Languages. MIT Press, 2002. URLhttps: //mitpress.mit.edu/9780262162098/types-and-programming-languages/

arXiv 2002

[26] [26]

Osprey, 2026

ROOST. Osprey, 2026. URLhttps://github.com/roostorg/osprey

2026

[27] [27]

Awesome LLM apps, 2026

Shubham Saboo. Awesome LLM apps, 2026. URLhttps://github.com/Shubhamsaboo/ awesome-llm-apps

2026

[28] [28]

Different coding agent strategies are better at different jobs, 2026

Aaron Sterling. Different coding agent strategies are better at different jobs, 2026. In preparation

2026

[29] [29]

AHCCCS covered behavioral health services guide, 2025

Arizona Health Care Cost Containment System. AHCCCS covered behavioral health services guide, 2025. URLhttps://www.azahcccs.gov/PlansProviders/Downloads/ MedicalCodingResources/AHCCCSCoveredBHServicesManual.pdf

2025

[30] [30]

Wright and Matthias Felleisen

Andrew K. Wright and Matthias Felleisen. A syntactic approach to type soundness.In- formation and Computation, 115(1):38–94, 1994. doi:10.1006/inco.1994.1093

work page doi:10.1006/inco.1994.1093 1994

[31] [31]

a list of state components a process reads or writes

Xiao Zhang, Huiyuan Lai, Qianru Meng, and Johan Bos. OntoURL: A benchmark for evaluating large language models on symbolic ontological understanding, reasoning and learning, 2025. URLhttps://arxiv.org/abs/2505.11031. A The Agentic Redux Calculus Note about the proofs in this Appendix:The text in the main body of the paper is 99% text I wrote myself. The t...

arXiv 2025

[32] [32]

consultωto select a roler∈Roles D (thearbitration choice)

[33] [33]

compute the sliceσ=project D(s.domain, r)

[34] [34]

invokeα r(σ, ω) to obtain an actiona

[35] [35]

At most one (r, a) pair is in flight to the commit discipline at any time

return (r, a). At most one (r, a) pair is in flight to the commit discipline at any time. Simultaneity among real-world agents is absorbed into the arbitration choice in Step 1. Definition A.35(Agentic Redux commit discipline).TheAgentic Redux commit discipline κAgentic Redux is the function that, on input (s, a, t, ω):

[36] [36]

computes thecandidate domain stateˆs≜applyMutation D(s.domain, a, t)

[37] [37]

walksinvariants D in list order, evaluating each invariant on ˆs

[38] [38]

halts on the first invariantι k ∈invariants D whose evaluation on ˆsdoes not returnpass: •ifι k(ˆs) =reject(m), appends tos.audita single entry with tagrejected, actiona, timestampt, and witness slotSome(id(ι k), m), and returns the framework state with domainandqueuecomponents unchanged; •ifι k(ˆs) =escalate(m), appends tos.audita single entry with tages...

[39] [39]

what happens if two agents act at once?

if every invariant returnspasson ˆs, appends tos.audita single entry with tagapproved, actiona, timestampt, and witness slotNone, and returns the framework state withdomain component ˆsandqueuecomponent unchanged. Each branch appends exactly one audit entry; the approve and reject branches leave the queue unchanged, and the escalate branch appends exactly...