Attested Tool-Server Admission: A Security Extension to the Model Context Protocol
Pith reviewed 2026-06-30 15:28 UTC · model grok-4.3
The pith
MCP hosts can admit external tool servers safely by verifying a server's offline-signed clearance at a well-known URI, applying a per-server tool allowlist, and enforcing the resulting decisions while recording each one in a tamper-evident audit log.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The security gap that makes an unmediated third-party MCP connection unsafe can be closed by three mechanisms without modifying the protocol: an offline-signed clearance assertion published at a well-known URI and verified against a pinned trust root, a deny-by-default per-server tool allowlist, and a flavor-gated enforcement mode that produces hard denials and writes every decision to a tamper-evident audit log.
What carries the argument
The attested clearance assertion: a small offline-signed document a server publishes at a well-known URI that a host verifies against a pinned trust root before dispatching any tool call.
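A minimal sketch of the verify-before-dispatch step described above, under stated assumptions. The document layout (`claims`, `signature`), the `server` claim, and the path `/.well-known/mcp-clearance` are hypothetical; this review does not reproduce the paper's wire format or its registered well-known name. The signature check is injected as a callable, and the demo verifier uses HMAC-SHA256 purely as a standard-library stand-in for asymmetric signature verification against a pinned public key.

```python
import hashlib
import hmac
import json
from typing import Callable, Optional

# Hypothetical path: the paper's registered well-known name is not given here.
WELL_KNOWN_PATH = "/.well-known/mcp-clearance"

# (trust_root, payload, signature) -> True if the signature is valid
Verifier = Callable[[bytes, bytes, bytes], bool]


def canonical_bytes(claims: dict) -> bytes:
    """Serialize claims deterministically so signer and verifier see the same bytes."""
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def verify_clearance(document: Optional[dict], trust_root: bytes,
                     verify: Verifier, expected_server: str) -> bool:
    """Return True only for a present, well-formed, validly signed document for this server."""
    if not isinstance(document, dict):
        return False  # missing document: fail closed
    claims = document.get("claims")
    signature_hex = document.get("signature")
    if not isinstance(claims, dict) or not isinstance(signature_hex, str):
        return False
    try:
        signature = bytes.fromhex(signature_hex)
    except ValueError:
        return False  # malformed signature encoding
    if not verify(trust_root, canonical_bytes(claims), signature):
        return False  # forged document or wrong trust root
    return claims.get("server") == expected_server


def demo_verify(trust_root: bytes, payload: bytes, signature: bytes) -> bool:
    """STAND-IN ONLY: HMAC-SHA256 replaces real asymmetric signature verification."""
    expected = hmac.new(trust_root, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)
```

A host built this way would call `verify_clearance` once per server before any tool call and treat every `False` as "not admitted", including the case where the document is simply absent.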
If this is right
- Admitting a server no longer implies trusting every tool it offers, because the allowlist bounds the set that may be invoked.
- Every admission and tool-dispatch decision is written to a tamper-evident audit log for later inspection.
- The extension can be published as a normative MCP addendum because the schema, verification rules, error registry, and conformance vectors are supplied in RFC 2119 form.
- Unextended hosts continue to operate exactly as today, so incremental adoption requires no protocol change.
- Regulated deployments that must accredit external tool servers become feasible once the explicit trust model is in place.
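The first consequence above, that admitting a server does not mean trusting every tool it offers, can be illustrated with a deny-by-default allowlist. This is a sketch only: the class name `ToolAllowlist`, its methods, and the server and tool names in the usage note are hypothetical, not the paper's API.

```python
from typing import Dict, FrozenSet, Iterable, List


class ToolAllowlist:
    """Deny-by-default allowlist keyed by server identity (illustrative sketch)."""

    def __init__(self) -> None:
        self._allowed: Dict[str, FrozenSet[str]] = {}

    def allow(self, server: str, tools: Iterable[str]) -> None:
        """Explicitly permit the given tools for one server."""
        current = self._allowed.get(server, frozenset())
        self._allowed[server] = current | frozenset(tools)

    def permits(self, server: str, tool: str) -> bool:
        # Unknown servers and unlisted tools both fall through to a denial.
        return tool in self._allowed.get(server, frozenset())

    def filter_advertised(self, server: str, advertised: Iterable[str]) -> List[str]:
        """Keep only the self-declared tools that the host has explicitly allowed."""
        return [tool for tool in advertised if self.permits(server, tool)]
```

For example, after `allow("mail.example", ["read_message"])`, a server that advertises both `read_message` and `send_message` would have only `read_message` exposed to the agent, and a server with no entry at all would expose nothing.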
Where Pith is reading between the lines
- The same clearance-plus-allowlist pattern could be reused in other agent-to-tool protocols that currently lack server admission controls.
- Organizations could publish their own clearance roots to enforce internal policy across multiple LLM agents without custom wrappers.
- The tamper-evident logs could feed directly into existing security monitoring systems for centralized oversight of tool usage.
- Widespread use might reduce reliance on per-application security shims by moving the trust boundary into the protocol extension itself.
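One common way to make a decision log tamper-evident is a hash chain, in which each entry commits to the one before it. The sketch below assumes that construction; this review does not say which mechanism the paper actually uses, and the entry fields (`event`, `prev`, `hash`) are hypothetical.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder predecessor hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log whose entries are linked by SHA-256 hashes."""

    def __init__(self) -> None:
        self.entries: List[dict] = []

    def append(self, event: dict) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry = {"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True
```

A chain like this detects edits and reordering but not truncation of the tail, so a monitoring system consuming the log would also need the latest head hash from a source the log writer cannot rewrite.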
Load-bearing premise
The host must hold a correctly pinned trust root and the server must publish a valid clearance document at the well-known URI before any tool dispatch occurs.
What would settle it
The claim would be refuted by either of two tests: one in which a host configured with a pinned trust root receives a forged clearance document at the well-known URI yet still dispatches a tool call, or one in which a server never publishes the document and the host proceeds anyway.
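The two falsifying outcomes can be phrased as executable checks. The sketch below uses a toy host, not the paper's implementation: `ToyHost`, its `fetch` and `is_valid` hooks, and the placeholder document bytes are all assumptions made for illustration, and the validity check is a stand-in for real signature verification against the pinned trust root.

```python
from typing import Callable, Dict, List, Optional

GENUINE = b"genuine-signed-document"  # placeholder bytes, not a real wire format


def is_valid(document: bytes) -> bool:
    """Stand-in for signature verification against the pinned trust root."""
    return document == GENUINE


class ToyHost:
    """Hypothetical host that must admit a server before dispatching any tool call."""

    def __init__(self, fetch: Callable[[str], Optional[bytes]]) -> None:
        self._fetch = fetch  # returns the well-known document, or None if absent
        self.dispatched: List[str] = []

    def call_tool(self, server: str, tool: str) -> bool:
        document = self._fetch(server)
        if document is None or not is_valid(document):
            return False  # fail closed: nothing is sent to the server
        self.dispatched.append(f"{server}:{tool}")
        return True


def run_falsification_checks() -> Dict[str, bool]:
    """Return, per scenario, whether the host dispatched the tool call."""
    forged = ToyHost(fetch=lambda server: b"forged-document")
    missing = ToyHost(fetch=lambda server: None)
    control = ToyHost(fetch=lambda server: GENUINE)
    return {
        "forged": forged.call_tool("mail.example", "read_message"),
        "missing": missing.call_tool("mail.example", "read_message"),
        "control": control.call_tool("mail.example", "read_message"),
    }
```

A conforming host should report no dispatch for the forged and missing cases and a dispatch for the control case; a dispatch in either of the first two would be the counterexample described above.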
Original abstract
The Model Context Protocol (MCP) standardizes how a large-language-model (LLM) agent and an external tool server exchange messages, but not trust: a host reads a server's self-declared tool list and dispatches calls, with no notion of which servers it may use, at what sensitivity, or which of a server's tools are in bounds. This work grew out of a concrete need -- letting the Enclawed agent use Google's externally-operated MCP servers (Gmail, Calendar, Drive) safely, admitting the server and bounding the tools it may drive, without changing MCP or Enclawed's own tool application-programming interface (API). The mechanism we built, mcp-attested (shipped in both the open enclawed-oss distribution and the enclaved flavor), generalizes: the gap that makes an unmediated third-party connection unsafe for one user makes a regulated deployment impossible to accredit. We close it with three additive mechanisms: (1) a small, offline-signed clearance assertion a server publishes at a well-known Uniform Resource Identifier (URI) and a host verifies against a pinned trust root before any tool dispatch; (2) a deny-by-default per-server tool allowlist, so admitting a server is not trusting its every tool; and (3) a flavor-gated enforcement mode that turns the checks from warnings into hard denials, with every decision written to a tamper-evident audit log. We give the wire format, the verification algorithm, a security analysis, and an LLM-driven adversarial evaluation; we then state the design in normative Request-for-Comments (RFC 2119) form -- schema, verification rules, error registry, well-known registration, and machine-checkable conformance vectors -- so it can be adopted as an MCP addendum rather than reinvented. An unextended host ignores the well-known document and behaves exactly as today.
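The third mechanism, a flavor-gated mode that turns failed checks from warnings into hard denials, might look roughly like the sketch below. It assumes that the open flavor only warns and the enclaved flavor denies; the abstract names both flavors but this review does not spell out that mapping, and the `Flavor`, `Decision`, and `enforce` names are hypothetical.

```python
import logging
from dataclasses import dataclass
from enum import Enum

log = logging.getLogger("mcp_attested_sketch")


class Flavor(Enum):
    OPEN = "open"          # assumed: failed checks are logged as warnings
    ENCLAVED = "enclaved"  # assumed: failed checks are hard denials


@dataclass(frozen=True)
class Decision:
    allowed: bool       # whether the tool call may proceed
    check_passed: bool  # whether the underlying check succeeded
    reason: str


def enforce(flavor: Flavor, check_passed: bool, reason: str) -> Decision:
    """Map one check result to an allow/deny decision according to the flavor."""
    if check_passed:
        return Decision(allowed=True, check_passed=True, reason=reason)
    if flavor is Flavor.ENCLAVED:
        return Decision(allowed=False, check_passed=False, reason=reason)
    # Open flavor: the same check ran and failed, but the call proceeds.
    log.warning("check failed, call allowed in %s flavor: %s", flavor.value, reason)
    return Decision(allowed=True, check_passed=False, reason=reason)
```

Keeping `check_passed` separate from `allowed` means both flavors can write the same record to the audit log, so a deployment can observe what would have been denied before switching to hard enforcement.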
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to present mcp-attested, a security extension to the Model Context Protocol (MCP) consisting of three mechanisms: (1) offline-signed clearance assertions published at a well-known URI and verified by hosts against a pinned trust root, (2) deny-by-default per-server tool allowlists, and (3) flavor-gated enforcement with tamper-evident audit logs. It supplies the wire format, verification algorithm, security analysis, an LLM-driven adversarial evaluation, and a full normative RFC-style specification (schema, verification rules, error registry, well-known registration, machine-checkable conformance vectors) for adoption as an MCP addendum, while preserving backward compatibility for unextended hosts.
Significance. If the design is sound, this work offers a practical solution for securing MCP-based LLM agent deployments in sensitive or regulated settings, such as integrating with external services like Google Workspace tools. The provision of machine-checkable conformance vectors and normative language is a notable strength that supports standardization and correct implementation. The approach avoids changes to the core MCP or tool APIs, enhancing its deployability.
major comments (2)
- [Evaluation section] The LLM-driven adversarial evaluation is described in the abstract but provides no quantitative results, error analysis, or specific metrics on attack success rates; this is load-bearing for assessing whether the three mechanisms meaningfully reduce the trust gap.
- [§ Security analysis] The central claim that the mechanisms close the MCP trust gap rests on hosts correctly pinning trust roots and servers publishing valid clearance documents; while the paper notes that unextended hosts revert to the unauthenticated state, the security analysis should include a concrete test case or failure-mode walkthrough to confirm the claim holds under partial adoption.
minor comments (2)
- [Abstract] The term 'flavor-gated enforcement mode' appears without a short parenthetical expansion on first use in the abstract, reducing immediate clarity for readers.
- [Normative specification] Cross-references between the machine-checkable conformance vectors and the verification algorithm should be added in the normative section to make coverage explicit.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and the recommendation for minor revision. We address each major comment below.
Point-by-point responses
- Referee: [Evaluation section] The LLM-driven adversarial evaluation is described in the abstract but provides no quantitative results, error analysis, or specific metrics on attack success rates; this is load-bearing for assessing whether the three mechanisms meaningfully reduce the trust gap.
  Authors: We agree that quantitative results are needed to substantiate the evaluation. The current manuscript describes the setup and qualitative outcomes of the LLM-driven adversarial evaluation but does not report numerical attack success rates or error analysis. In revision we will expand the Evaluation section with concrete metrics (pre- and post-mechanism attack success rates, false-positive rates on legitimate calls, and breakdown by attack class) drawn from the existing evaluation runs.
  Revision: yes
- Referee: [§ Security analysis] The central claim that the mechanisms close the MCP trust gap rests on hosts correctly pinning trust roots and servers publishing valid clearance documents; while the paper notes that unextended hosts revert to the unauthenticated state, the security analysis should include a concrete test case or failure-mode walkthrough to confirm the claim holds under partial adoption.
  Authors: We will add a dedicated failure-mode walkthrough subsection to § Security analysis. It will enumerate three concrete partial-adoption scenarios (all hosts extended, mixed population, and server publishes but no host pins) and show, step by step, that adopting hosts still obtain the attested guarantees while non-adopting hosts fall back exactly to the original unauthenticated MCP behavior, with no new attack surface introduced for either population.
  Revision: yes
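The metrics named in the first response (attack success rate, false-positive rate on legitimate calls, and a breakdown by attack class) reduce to simple ratios over per-trial records. The sketch below shows only that arithmetic: the `Trial` record and the function names are hypothetical, and no figures from the paper's evaluation are reproduced or implied.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Trial:
    attack_class: Optional[str]  # None marks a legitimate (benign) call
    executed: bool               # True if the tool call was actually dispatched


def attack_success_rate(trials: List[Trial]) -> Optional[float]:
    """Fraction of attack trials whose tool call was dispatched."""
    attacks = [t for t in trials if t.attack_class is not None]
    if not attacks:
        return None  # undefined without any attack trials
    return sum(t.executed for t in attacks) / len(attacks)


def false_positive_rate(trials: List[Trial]) -> Optional[float]:
    """Fraction of legitimate calls that were wrongly blocked."""
    benign = [t for t in trials if t.attack_class is None]
    if not benign:
        return None  # undefined without any legitimate calls
    return sum(not t.executed for t in benign) / len(benign)


def success_by_class(trials: List[Trial]) -> Dict[str, float]:
    """Attack success rate computed separately for each attack class."""
    grouped: Dict[str, List[bool]] = defaultdict(list)
    for t in trials:
        if t.attack_class is not None:
            grouped[t.attack_class].append(t.executed)
    return {name: sum(flags) / len(flags) for name, flags in grouped.items()}
```

Running the same functions over trials collected with the mechanisms disabled and then enabled would give the pre- and post-mechanism comparison the authors describe.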
Circularity Check
No circularity; independent protocol design
The paper proposes an additive security extension to MCP consisting of a clearance assertion at a well-known URI, a per-server tool allowlist, and flavor-gated enforcement with audit logging. These are defined via explicit wire formats, verification algorithms, and RFC 2119 normative rules with no equations, fitted parameters, predictions, or self-citations that reduce the central claim to its own inputs. The design is self-contained against the stated assumptions about pinned trust roots and document publication; no load-bearing step collapses by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] Standard cryptographic signature verification against a pinned trust root is reliable.
invented entities (2)
- clearance assertion document: no independent evidence
- flavor-gated enforcement mode: no independent evidence