CBCL: Safe Self-Extending Agent Communication
Pith reviewed 2026-05-10 11:42 UTC · model grok-4.3
The pith
CBCL shows that agent communication languages can allow runtime self-extensions while remaining provably safe through homoiconic design and three machine-checked invariants.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CBCL constrains every message, including those that define new language extensions, to the deterministic context-free language (DCFL) class. Agents can transmit and adopt domain-specific dialect extensions as ordinary messages in the same format. Three safety invariants, labeled R1 through R3 and verified in Lean 4, collectively bound expansion, apply resource limits, and protect the core vocabulary. A Rust reference implementation is exercised with property-based and differential tests, and a verified parser binary is extracted; together these support the claim that the homoiconic representation of extensions does not compromise the ability to prove safety properties.
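To make the homoiconic idea concrete, here is a minimal sketch, not the paper's parser. It assumes an S-expression surface syntax, and the message heads `request` and `define-dialect` and the `MAX_DEPTH` cap are hypothetical names chosen for illustration. The point it shows is that one deterministic, single-pass parser with bounded nesting handles an ordinary message and an extension definition identically.

```rust
/// A parsed term: either an atom or a parenthesised list of terms.
#[derive(Debug, Clone, PartialEq)]
pub enum Sexp {
    Atom(String),
    List(Vec<Sexp>),
}

/// Hypothetical nesting cap; the paper's actual limits are not stated here.
pub const MAX_DEPTH: usize = 32;

/// Parse exactly one term from `input`, rejecting trailing tokens.
pub fn parse(input: &str) -> Result<Sexp, String> {
    let tokens = tokenize(input);
    let mut pos = 0;
    let term = parse_term(&tokens, &mut pos, 0)?;
    if pos != tokens.len() {
        return Err("trailing input after first term".to_string());
    }
    Ok(term)
}

/// Split the input into "(", ")" and whitespace-delimited atoms.
fn tokenize(input: &str) -> Vec<String> {
    input
        .replace('(', " ( ")
        .replace(')', " ) ")
        .split_whitespace()
        .map(str::to_string)
        .collect()
}

/// Recursive descent with one token of lookahead and an explicit depth bound.
fn parse_term(tokens: &[String], pos: &mut usize, depth: usize) -> Result<Sexp, String> {
    if depth > MAX_DEPTH {
        return Err("nesting too deep".to_string());
    }
    let tok = tokens
        .get(*pos)
        .ok_or_else(|| "unexpected end of input".to_string())?;
    *pos += 1;
    match tok.as_str() {
        "(" => {
            let mut items = Vec::new();
            loop {
                match tokens.get(*pos).map(String::as_str) {
                    Some(")") => {
                        *pos += 1;
                        return Ok(Sexp::List(items));
                    }
                    Some(_) => items.push(parse_term(tokens, pos, depth + 1)?),
                    None => return Err("missing ')'".to_string()),
                }
            }
        }
        ")" => Err("unexpected ')'".to_string()),
        atom => Ok(Sexp::Atom(atom.to_string())),
    }
}

/// The leading atom of a list, which this sketch treats as the message kind.
pub fn head(term: &Sexp) -> Option<&str> {
    match term {
        Sexp::List(items) => match items.first() {
            Some(Sexp::Atom(a)) => Some(a.as_str()),
            _ => None,
        },
        Sexp::Atom(_) => None,
    }
}
```

Under these assumptions, `parse("(request (item widget) (qty 3))")` and `parse("(define-dialect logistics (verb ship))")` both yield an `Sexp::List`, and only `head` distinguishes them. Nothing about the extension definition requires a second grammar.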
What carries the argument
Three safety invariants R1-R3 that together prevent unbounded expansion, enforce resource limits, and preserve core vocabulary, applied to a homoiconic protocol in which extension definitions are represented identically to ordinary messages and all inputs stay within the deterministic context-free language class.
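The sketch below gives one possible reading of the three invariants as checks on a proposed dialect. The precise statements of R1 through R3 are not reproduced on this page, so the `Dialect` shape, the `max_expansion` field, and the interpretation of each invariant (R1 as "no rule expands through itself", R2 as "expanded size stays within the declared limit", R3 as "no rule redefines a core term") are assumptions made for illustration only.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical dialect: each new term maps to the terms it expands into.
pub struct Dialect {
    pub rules: HashMap<String, Vec<String>>,
    /// Declared cap on the fully expanded size of any single term.
    pub max_expansion: usize,
}

#[derive(Debug, PartialEq)]
pub enum Violation {
    UnboundedExpansion(String), // assumed reading of R1
    LimitExceeded(String),      // assumed reading of R2
    CoreRedefined(String),      // assumed reading of R3
}

/// Convenience constructor: rule bodies are whitespace-separated term lists.
pub fn dialect(rules: &[(&str, &str)], max_expansion: usize) -> Dialect {
    let rules = rules
        .iter()
        .map(|(name, body)| {
            let parts = body.split_whitespace().map(str::to_string).collect();
            (name.to_string(), parts)
        })
        .collect();
    Dialect { rules, max_expansion }
}

/// Accept the dialect only if all three checks pass.
pub fn check(d: &Dialect, core: &HashSet<String>) -> Result<(), Violation> {
    // Sort rule names so the first reported violation is deterministic.
    let mut names: Vec<&String> = d.rules.keys().collect();
    names.sort();
    for &name in &names {
        if core.contains(name) {
            return Err(Violation::CoreRedefined(name.clone()));
        }
    }
    for &name in &names {
        let mut stack = Vec::new();
        let size = expanded_size(d, name, &mut stack)?;
        if size > d.max_expansion {
            return Err(Violation::LimitExceeded(name.clone()));
        }
    }
    Ok(())
}

/// Count leaf terms after full expansion; a term seen twice on the current
/// path means expansion would never terminate. No memoisation: sketch only.
fn expanded_size(d: &Dialect, term: &str, stack: &mut Vec<String>) -> Result<usize, Violation> {
    let body = match d.rules.get(term) {
        Some(body) => body,
        None => return Ok(1), // core or literal term: counts as one leaf
    };
    if stack.iter().any(|t| t == term) {
        return Err(Violation::UnboundedExpansion(term.to_string()));
    }
    stack.push(term.to_string());
    let mut total = 0usize;
    for part in body {
        total = total.saturating_add(expanded_size(d, part, stack)?);
    }
    stack.pop();
    Ok(total)
}
```

With a core vocabulary of `request` and `inform`, a rule `ship -> request inform` passes, `ship -> ship inform` is rejected as unbounded, and a rule named `request` is rejected for redefining a core term. A real checker would also have to bound the cost of the check itself, which this sketch does not attempt.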
If this is right
- Agents can define, transmit, and adopt domain-specific dialect extensions at runtime while every message remains inside the deterministic context-free language class.
- The three invariants R1-R3 block unbounded expansion, enforce resource limits, and protect the core vocabulary for every accepted message.
- Formalization in Lean 4 allows machine-checked proofs of the safety properties and extraction of a verified parser binary.
- Oversight of autonomous agents becomes feasible because the expressible language is formally bounded even after self-extension.
- Heterogeneous agents can coordinate across domains using safe, extensible communication without leaving the verified fragment.
Where Pith is reading between the lines
- The same homoiconic-plus-invariants pattern could be adapted to other self-modifying protocols in distributed systems where runtime adaptability must coexist with static safety guarantees.
- Property-based testing of the Rust implementation against the Lean model provides a practical bridge that future work could extend to adversarial message generation.
- If the invariants hold in practice, CBCL-style designs might reduce the attack surface in open multi-agent environments by making extension behavior predictable and auditable.
- The DCFL constraint suggests that similar bounded grammars could be applied to other agent or protocol languages that currently rely on Turing-complete extension mechanisms.
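The differential-testing bridge mentioned in the list above can be illustrated on a toy DCFL. This is a sketch of the technique, not the paper's test suite: two independently written recognizers for balanced parentheses stand in for the Lean model and the Rust implementation, and the input generator is a plain linear congruential generator rather than a property-testing library. All function names are hypothetical.

```rust
/// First recognizer: a single counter plays the role of the pushdown stack.
pub fn balanced_counter(s: &str) -> bool {
    let mut depth: usize = 0;
    for c in s.chars() {
        match c {
            '(' => depth += 1,
            ')' => {
                if depth == 0 {
                    return false;
                }
                depth -= 1;
            }
            _ => return false,
        }
    }
    depth == 0
}

/// Second recognizer: recursive descent for S -> '(' S ')' S | empty.
pub fn balanced_descent(s: &str) -> bool {
    let bytes = s.as_bytes();
    match parse_seq(bytes, 0) {
        Some(end) => end == bytes.len(),
        None => false,
    }
}

/// Consume zero or more bracketed groups starting at `i`; return the index
/// just past the last group, or None if a group is left unclosed.
fn parse_seq(b: &[u8], mut i: usize) -> Option<usize> {
    while i < b.len() && b[i] == b'(' {
        let inner_end = parse_seq(b, i + 1)?;
        if inner_end >= b.len() || b[inner_end] != b')' {
            return None;
        }
        i = inner_end + 1;
    }
    Some(i)
}

/// Deterministic pseudo-random string over "()" from a 64-bit LCG.
pub fn sample(seed: u64, len: usize) -> String {
    let mut state = seed;
    (0..len)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            if (state >> 63) == 0 { '(' } else { ')' }
        })
        .collect()
}

/// Run both recognizers on generated inputs and return the first input on
/// which they disagree, if any.
pub fn first_disagreement(cases: u64, max_len: usize) -> Option<String> {
    for seed in 0..cases {
        let input = sample(seed, (seed as usize) % (max_len + 1));
        if balanced_counter(&input) != balanced_descent(&input) {
            return Some(input);
        }
    }
    None
}
```

A `None` result from `first_disagreement` only says the two recognizers agreed on the sampled inputs; it is evidence, not proof. Extending the generator toward adversarial inputs, as the bullet suggests, would mean biasing `sample` toward near-valid strings rather than uniform noise.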
Load-bearing premise
The three safety invariants R1-R3, once machine-checked for the formal model, will be sufficient to guarantee safety when the language is used by real autonomous agents in open environments.
What would settle it
A concrete sequence of messages that satisfies the CBCL grammar and the three invariants yet still produces unbounded expansion, violates a declared resource limit, or corrupts the core vocabulary.
Original abstract
Agent communication languages (ACLs) enable heterogeneous agents to share knowledge and coordinate across diverse domains. This diversity demands extensibility, but expressive extension mechanisms can push the input language beyond the complexity classes where full validation is tractable. We present CBCL (Common Business Communication Language), an agent communication language that constrains all messages, including runtime language extensions, to the deterministic context-free language (DCFL) class. CBCL allows agents to define, transmit, and adopt domain-specific "dialect" extensions as first-class messages; three safety invariants (R1–R3), machine-checked in Lean 4 and enforced in a Rust reference implementation, prevent unbounded expansion, apply declared resource limits, and preserve core vocabulary. We formalize the language and its safety properties in Lean 4, implement a reference parser and dialect engine in Rust with property-based and differential tests, and extract a verified parser binary. Our results demonstrate that homoiconic protocol design, where extension definitions share the same representation as ordinary messages, can be made provably safe. As autonomous agents increasingly extend their own communication capabilities, formally bounding what they can express to each other is a precondition for oversight.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces CBCL, an agent communication language restricting all messages—including runtime-defined dialect extensions—to deterministic context-free languages. Extensions are treated homoiconically as first-class messages. Three safety invariants R1–R3 (no unbounded expansion, enforcement of declared resource limits, and preservation of core vocabulary) are formalized and machine-checked in Lean 4; a Rust reference parser and dialect engine with property-based and differential tests is provided, along with extraction of a verified parser binary. The central claim is that this homoiconic, DCFL-constrained design renders self-extending agent communication provably safe.
Significance. If the invariants hold as stated, the work offers a concrete, formally grounded approach to bounding the complexity of extensible agent protocols, which is relevant for oversight of autonomous systems. The machine-checked Lean 4 proofs for R1–R3 and the reproducible Rust implementation with tests constitute clear strengths, providing verifiable evidence for the syntactic safety properties rather than relying on informal arguments.
major comments (2)
- [Abstract] Abstract: the claim that the design is 'provably safe' for autonomous agents in open environments rests on R1–R3 being sufficient; however, these invariants are defined only over syntactic DCFL message structure and do not model semantic interpretation of adopted dialects, agent decision procedures for extension adoption, or concurrent multi-agent interactions.
- [Invariants R1–R3] Description of invariants R1–R3: while the syntactic constraints (no unbounded expansion, resource limits, core vocabulary preservation) are machine-checked, the manuscript provides no formalization or argument showing that syntactically valid extensions cannot still enable unsafe semantic behaviors or practical resource exhaustion once chained and interpreted by real agents.
minor comments (2)
- [Implementation] The Rust implementation section would benefit from explicit discussion of how the Lean-extracted parser is integrated with the dialect engine and what properties the differential tests actually cover.
- [Formalization] Notation for the DCFL grammar and the precise statement of R1–R3 could be clarified with a small example of a valid versus invalid extension message.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive evaluation of the formal verification and implementation aspects. We respond to the major comments point by point below.
- Referee: [Abstract] Abstract: the claim that the design is 'provably safe' for autonomous agents in open environments rests on R1–R3 being sufficient; however, these invariants are defined only over syntactic DCFL message structure and do not model semantic interpretation of adopted dialects, agent decision procedures for extension adoption, or concurrent multi-agent interactions.
  Authors: We concur that the invariants R1–R3 are strictly syntactic and do not encompass semantic interpretation, agent adoption decisions, or multi-agent dynamics. The term 'provably safe' in the manuscript is intended to apply to the syntactic properties of the communication language. We will revise the abstract to qualify this as 'provably syntactically safe' and add explicit statements that semantic safety is not addressed. This revision will be made in the next version of the manuscript.
  Revision: yes
- Referee: [Invariants R1–R3] Description of invariants R1–R3: while the syntactic constraints (no unbounded expansion, resource limits, core vocabulary preservation) are machine-checked, the manuscript provides no formalization or argument showing that syntactically valid extensions cannot still enable unsafe semantic behaviors or practical resource exhaustion once chained and interpreted by real agents.
  Authors: The manuscript does not claim or provide formalization that syntactically valid extensions prevent unsafe semantic behaviors or resource exhaustion in interpretation. Our proofs establish only that the language remains within DCFL and respects the stated syntactic invariants. We will include a new subsection in the discussion to explicitly delineate the boundary between syntactic safety (as proven) and semantic or practical safety (which requires further research). This addresses the concern without altering the core technical results.
  Revision: yes
Circularity Check
No significant circularity; safety claims rest on independent Lean 4 verification against standard DCFL definition.
The paper's central derivation formalizes CBCL messages (including extensions) as DCFL, defines invariants R1–R3 directly from DCFL properties (no unbounded expansion, resource limits, core vocabulary preservation), machine-checks them in Lean 4, and enforces them in a Rust reference parser with property-based tests and extraction. No equation, definition, or claim reduces the safety result to a fitted parameter, self-referential renaming, or load-bearing self-citation; the verification uses an external proof assistant and the standard definition of deterministic context-free languages. The safety result therefore rests on externally defined notions rather than on the paper's own outputs.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: all messages, including runtime dialect extensions, must belong to the deterministic context-free language class.