Attested Tool-Server Admission: A Security Extension to the Model Context Protocol

Alfredo Metere

arxiv: 2605.24248 · v2 · pith:RID3HGS7new · submitted 2026-05-22 · 💻 cs.CR · cs.AI · cs.SE

Pith reviewed 2026-06-30 15:28 UTC · model grok-4.3

classification 💻 cs.CR · cs.AI · cs.SE

keywords Model Context Protocol · MCP · tool server admission · attested clearance · LLM agent security · per-server allowlist · audit log · trust extension

The pith

MCP hosts can verify a server's offline-signed clearance at a well-known URI, apply per-server tool allowlists, and enforce decisions via audit logs to admit external tool servers safely.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The Model Context Protocol standardizes message exchange between an LLM agent and external tool servers but supplies no trust controls, so a host accepts any self-declared tool list and dispatches calls without limits. This paper supplies three additive mechanisms that close the gap while leaving the base protocol and existing APIs unchanged. A server publishes a compact offline-signed clearance assertion at a standard URI; the host verifies it against a pinned trust root before any dispatch. A deny-by-default allowlist then restricts which of the server's tools may actually be invoked. A flavor-gated mode converts the checks into hard denials and records every decision in a tamper-evident log. An unmodified host simply ignores the new document and behaves exactly as before.
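The lookup location can be sketched as follows. This is a minimal illustration, assuming the clearance document lives under the `/.well-known/` path prefix at the server's origin; the suffix `mcp-attested` and the HTTPS-only requirement are assumptions made for the example, since this page does not give the registered name or the transport rules.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical suffix: the registered well-known name is not given on this page.
WELL_KNOWN_SUFFIX = "mcp-attested"


def clearance_url(server_url: str) -> str:
    """Map any MCP server URL to the clearance location at its origin."""
    parts = urlsplit(server_url)
    # Assumption: only https origins are eligible for a clearance lookup.
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("clearance lookup requires an https origin")
    path = f"/.well-known/{WELL_KNOWN_SUFFIX}"
    # Path, query, and fragment of the server URL are dropped on purpose.
    return urlunsplit(("https", parts.netloc, path, "", ""))


print(clearance_url("https://tools.example:8443/mcp/v1?session=1"))
```

Because the location depends only on the origin, a host can look up the document before it opens any MCP session with the server.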

Core claim

The security gap that makes an unmediated third-party MCP connection unsafe can be closed by three mechanisms without modifying the protocol: an offline-signed clearance assertion published at a well-known URI and verified against a pinned trust root, a deny-by-default per-server tool allowlist, and a flavor-gated enforcement mode that produces hard denials and writes every decision to a tamper-evident audit log.

What carries the argument

The attested clearance assertion: a small offline-signed document a server publishes at a well-known URI that a host verifies against a pinned trust root before dispatching any tool call.
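A minimal sketch of the host-side check, under stated assumptions. The field names (`key_id`, `payload`, `signature`) are hypothetical because the paper's wire format is not reproduced on this page. HMAC-SHA256 is swapped in for the signature only because it ships with the Python standard library; a real deployment would use an asymmetric scheme so that the host holds a public key and cannot mint clearances itself.

```python
import hashlib
import hmac
import json

# Hypothetical pinned trust root: key id -> verification key.
# HMAC-SHA256 is a stdlib stand-in for a real asymmetric signature scheme.
PINNED_ROOTS = {"root-2026": b"demo-only-secret"}


def canonical_bytes(payload: dict) -> bytes:
    """Serialize the signed fields deterministically (sorted keys, no spaces)."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def sign(payload: dict, key: bytes) -> str:
    """Offline signing step, performed by the trust-root holder, not the host."""
    return hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()


def verify_clearance(document: dict) -> bool:
    """Return True only if the document verifies against a pinned root."""
    payload = document.get("payload")
    key_id = document.get("key_id")
    signature = document.get("signature")
    key = PINNED_ROOTS.get(key_id) if isinstance(key_id, str) else None
    if not isinstance(payload, dict) or key is None or not isinstance(signature, str):
        return False  # fail closed on anything malformed or unpinned
    expected = sign(payload, key)
    return hmac.compare_digest(expected.encode(), signature.encode())


payload = {"server": "https://tools.example", "tools": ["calendar.read"]}
good = {"key_id": "root-2026", "payload": payload,
        "signature": sign(payload, PINNED_ROOTS["root-2026"])}
# Same signature, but the tool list was widened after signing.
forged = {**good, "payload": {**payload, "tools": ["calendar.read", "mail.send"]}}
# Valid-looking document that names a root the host never pinned.
unpinned = {**good, "key_id": "attacker-root"}

print(verify_clearance(good), verify_clearance(forged), verify_clearance(unpinned))
```

The point of the sketch is the ordering: verification runs against keys the host already holds, and every failure path returns False before any tool call could be dispatched.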

If this is right

Admitting a server no longer implies trusting every tool it offers, because the allowlist bounds the set that may be invoked.
Every admission and tool-dispatch decision is written to a tamper-evident audit log for later inspection.
The extension can be published as a normative MCP addendum because the schema, verification rules, error registry, and conformance vectors are supplied in RFC 2119 form.
Unextended hosts continue to operate exactly as today, so incremental adoption requires no protocol change.
Regulated deployments that must accredit external tool servers become feasible once the explicit trust model is in place.
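The first consequence above, that admission does not imply trust in every tool, can be illustrated with a small deny-by-default check. The server identifiers, tool names, and the `ServerPolicy` type are all hypothetical; the sketch only shows the shape of the rule, not the paper's configuration format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServerPolicy:
    """Host-side policy for one admitted server (names are illustrative)."""
    allowed_tools: frozenset = field(default_factory=frozenset)


# Hypothetical host configuration: server id -> policy.
POLICIES = {
    "calendar.example": ServerPolicy(frozenset({"events.list", "events.get"})),
    "drive.example": ServerPolicy(),  # admitted, but no tools allowed yet
}


def may_dispatch(server: str, tool: str, advertised: set) -> bool:
    """Deny by default: the tool must be advertised AND explicitly allowlisted."""
    policy = POLICIES.get(server)
    if policy is None:
        return False  # server was never admitted
    return tool in advertised and tool in policy.allowed_tools


advertised = {"events.list", "events.delete"}  # what the server self-declares
print(may_dispatch("calendar.example", "events.list", advertised))    # allowlisted
print(may_dispatch("calendar.example", "events.delete", advertised))  # advertised only
print(may_dispatch("drive.example", "files.read", {"files.read"}))    # empty allowlist
```

The server's self-declared tool list can only narrow what is callable; it can never widen the set beyond what the host has listed.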

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same clearance-plus-allowlist pattern could be reused in other agent-to-tool protocols that currently lack server admission controls.
Organizations could publish their own clearance roots to enforce internal policy across multiple LLM agents without custom wrappers.
The tamper-evident logs could feed directly into existing security monitoring systems for centralized oversight of tool usage.
Widespread use might reduce reliance on per-application security shims by moving the trust boundary into the protocol extension itself.
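This page does not say how the paper's log achieves tamper evidence. One common construction is a hash chain, sketched below as an assumption rather than as the paper's design: each entry commits to the previous entry's hash, so a monitoring system that re-walks the chain notices any edited record.

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed starting value for the chain


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append(log: list, record: dict) -> None:
    """Add a decision record that commits to everything before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash,
                "hash": _digest(prev_hash, record)})


def verify_chain(log: list) -> bool:
    """Recompute every link; any edited record or broken link fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"server": "calendar.example", "tool": "events.list", "decision": "allow"})
append(log, {"server": "calendar.example", "tool": "events.delete", "decision": "deny"})
intact = verify_chain(log)
log[1]["record"]["decision"] = "allow"  # simulate after-the-fact tampering
tampered_ok = verify_chain(log)
print(intact, tampered_ok)
```

A bare chain like this does not detect truncation of the most recent entries; catching that requires anchoring the latest hash somewhere the log writer cannot modify.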

Load-bearing premise

The host must hold a correctly pinned trust root and the server must publish a valid clearance document at the well-known URI before any tool dispatch occurs.

What would settle it

A test in which a host configured with a pinned trust root receives a forged clearance document at the well-known URI yet still dispatches a tool call, or in which a server never publishes the document and the host proceeds anyway.
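The falsifying test can be written down as a small harness. This is a sketch with a toy host: the `Host` class, the `signed_by` field, and the server names are hypothetical, the verifier is a stand-in for real signature checking, and the allowlist step is left out so that only the two conditions named above are exercised.

```python
from typing import Callable, Optional


class Host:
    """Toy enforcing host; fetch and verify are injected so the test controls them."""

    def __init__(self, fetch: Callable[[str], Optional[dict]],
                 verify: Callable[[dict], bool]):
        self.fetch = fetch
        self.verify = verify
        self.dispatched = []  # every call that actually reached a server

    def call_tool(self, server: str, tool: str) -> str:
        document = self.fetch(server)  # None models "never published"
        if document is None:
            return "denied: no clearance document"
        if not self.verify(document):
            return "denied: clearance failed verification"
        self.dispatched.append((server, tool))
        return "dispatched"


def verify(document: dict) -> bool:
    """Stand-in verifier; a real one checks a signature against the pinned root."""
    return document.get("signed_by") == "pinned-root"


published = {
    "honest.example": {"signed_by": "pinned-root"},
    "forger.example": {"signed_by": "attacker-key"},
}
host = Host(published.get, verify)
results = {server: host.call_tool(server, "any.tool")
           for server in ("honest.example", "forger.example", "silent.example")}
for server, result in results.items():
    print(server, "->", result)
```

The claim survives this test only if `host.dispatched` contains the honest server's call and nothing else; a single entry for the forging or silent server would be the counterexample described above.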

Figures

Figures reproduced from arXiv: 2605.24248 by Alfredo Metere.

**Figure 2.** Runtime configuration of the adversarial campaign. A GPU-resident local model (Ollama / …

Original abstract

The Model Context Protocol (MCP) standardizes how a large-language-model (LLM) agent and an external tool server exchange messages, but not trust: a host reads a server's self-declared tool list and dispatches calls, with no notion of which servers it may use, at what sensitivity, or which of a server's tools are in bounds. This work grew out of a concrete need -- letting the Enclawed agent use Google's externally-operated MCP servers (Gmail, Calendar, Drive) safely, admitting the server and bounding the tools it may drive, without changing MCP or Enclawed's own tool application-programming interface (API). The mechanism we built, mcp-attested (shipped in both the open enclawed-oss distribution and the enclaved flavor), generalizes: the gap that makes an unmediated third-party connection unsafe for one user makes a regulated deployment impossible to accredit. We close it with three additive mechanisms: (1) a small, offline-signed clearance assertion a server publishes at a well-known Uniform Resource Identifier (URI) and a host verifies against a pinned trust root before any tool dispatch; (2) a deny-by-default per-server tool allowlist, so admitting a server is not trusting its every tool; and (3) a flavor-gated enforcement mode that turns the checks from warnings into hard denials, with every decision written to a tamper-evident audit log. We give the wire format, the verification algorithm, a security analysis, and an LLM-driven adversarial evaluation; we then state the design in normative Request-for-Comments (RFC 2119) form -- schema, verification rules, error registry, well-known registration, and machine-checkable conformance vectors -- so it can be adopted as an MCP addendum rather than reinvented. An unextended host ignores the well-known document and behaves exactly as today.
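Mechanism (3) in the abstract can be sketched as a single decision function. The mode names `ADVISORY` and `ENFORCING` are hypothetical labels chosen for the example; this page does not state how the paper names its modes or which shipped flavor enables hard denials. A plain list stands in for the audit log.

```python
from enum import Enum


class Flavor(Enum):
    ADVISORY = "advisory"    # hypothetical name: failed checks only warn
    ENFORCING = "enforcing"  # hypothetical name: failed checks hard-deny


def decide(flavor: Flavor, clearance_ok: bool, tool_allowed: bool, audit: list) -> bool:
    """Return True if the call may proceed; always record the decision."""
    checks = (("clearance", clearance_ok), ("allowlist", tool_allowed))
    failures = [name for name, ok in checks if not ok]
    if not failures:
        outcome = "allow"
    elif flavor is Flavor.ENFORCING:
        outcome = "deny"
    else:
        outcome = "warn"  # the call proceeds, but the failure is on record
    audit.append({"flavor": flavor.value, "failed": failures, "outcome": outcome})
    return outcome != "deny"


audit = []
print(decide(Flavor.ADVISORY, True, False, audit))   # warns, proceeds
print(decide(Flavor.ENFORCING, True, False, audit))  # hard denial
print(decide(Flavor.ENFORCING, True, True, audit))   # clean allow
print([entry["outcome"] for entry in audit])
```

The same checks run in both modes; the flavor changes only what a failure means, and every branch writes a record.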

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

A practical MCP security extension with signed clearances and allowlists, but thin on evaluation data and dependent on correct deployment of trust roots.

The paper defines mcp-attested, an extension that adds three mechanisms to the Model Context Protocol: offline-signed clearance assertions published at a well-known URI, per-server deny-by-default tool allowlists, and a flavor-gated mode that enforces hard denials plus tamper-evident logs. It supplies the wire format, verification algorithm, and a full normative RFC 2119 specification so the thing could be adopted as an addendum.

What it does well is stay concrete and backward-compatible. The design came from a real use case with external Google MCP servers and keeps unextended hosts working exactly as they do today. The normative section with schemas, error registry, and machine-checkable conformance vectors is the kind of detail that makes adoption feasible rather than just another idea.

The soft spots are in the evidence and assumptions. There are no quantitative results from the LLM-driven adversarial evaluation, no error analysis, and no implementation metrics. The security claim only holds if hosts correctly pin trust roots and servers actually publish valid clearance documents; the paper notes that failure on either leaves the system in the original unauthenticated state. That is not a hidden flaw, but it does mean the benefit is conditional on operational discipline.

This is for people working on secure LLM agent tool use in enterprise or regulated settings. The citation pattern is appropriate since the work is defining new protocol elements rather than claiming to improve on fitted models.

It deserves peer review because the proposal is grounded in an actual protocol gap and comes with enough specification to be checked and potentially standardized.

Referee Report

2 major / 2 minor

Summary. The paper claims to present mcp-attested, a security extension to the Model Context Protocol (MCP) consisting of three mechanisms: (1) offline-signed clearance assertions published at a well-known URI and verified by hosts against a pinned trust root, (2) deny-by-default per-server tool allowlists, and (3) flavor-gated enforcement with tamper-evident audit logs. It supplies the wire format, verification algorithm, security analysis, an LLM-driven adversarial evaluation, and a full normative RFC-style specification (schema, verification rules, error registry, well-known registration, machine-checkable conformance vectors) for adoption as an MCP addendum, while preserving backward compatibility for unextended hosts.

Significance. If the design is sound, this work offers a practical solution for securing MCP-based LLM agent deployments in sensitive or regulated settings, such as integrating with external services like Google Workspace tools. The provision of machine-checkable conformance vectors and normative language is a notable strength that supports standardization and correct implementation. The approach avoids changes to the core MCP or tool APIs, enhancing its deployability.

major comments (2)

[Evaluation section] The abstract describes an LLM-driven adversarial evaluation, but the paper reports no quantitative results, error analysis, or attack-success-rate metrics. This evidence is load-bearing for assessing whether the three mechanisms meaningfully reduce the trust gap.
[§ Security analysis] The central claim that the mechanisms close the MCP trust gap rests on hosts correctly pinning trust roots and servers publishing valid clearance documents. The paper notes that unextended hosts revert to the unauthenticated state, but the security analysis should include a concrete test case or failure-mode walkthrough to confirm the claim holds under partial adoption.

minor comments (2)

[Abstract] The term 'flavor-gated enforcement mode' appears without a short parenthetical expansion on first use in the abstract, reducing immediate clarity for readers.
[Normative specification] Cross-references between the machine-checkable conformance vectors and the verification algorithm should be added in the normative section to make coverage explicit.
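One way to make the coverage requested in the second minor comment explicit is to tag each conformance vector with the verification rule it exercises and then compute which rules no vector touches. Everything in the sketch below is hypothetical: the rule ids (`V1` to `V3`), the error codes, the `expires` field, and the toy `check` function stand in for the paper's registry and algorithm, which this page does not reproduce.

```python
# Hypothetical rule ids mapped to hypothetical error codes.
RULES = {"V1": "E_UNSIGNED", "V2": "E_UNTRUSTED_ROOT", "V3": "E_EXPIRED"}


def check(document: dict, now: int) -> str:
    """Toy verifier: returns 'OK' or the error code of the first failed rule."""
    if not document.get("signature"):
        return RULES["V1"]
    if document.get("key_id") != "pinned-root":
        return RULES["V2"]
    if document.get("expires", 0) <= now:
        return RULES["V3"]
    return "OK"


BASE = {"signature": "sig", "key_id": "pinned-root", "expires": 2000}
VECTORS = [
    {"name": "valid", "rule": None, "doc": BASE, "expect": "OK"},
    {"name": "unsigned", "rule": "V1",
     "doc": {**BASE, "signature": ""}, "expect": "E_UNSIGNED"},
    {"name": "wrong-root", "rule": "V2",
     "doc": {**BASE, "key_id": "other"}, "expect": "E_UNTRUSTED_ROOT"},
]


def failing(vectors: list, now: int = 1000) -> list:
    """Names of vectors whose observed result differs from the expected one."""
    return [v["name"] for v in vectors if check(v["doc"], now) != v["expect"]]


def uncovered(vectors: list) -> set:
    """Rules that no vector claims to exercise."""
    return set(RULES) - {v["rule"] for v in vectors if v["rule"]}


print(failing(VECTORS), uncovered(VECTORS))
```

In this toy set every vector passes, yet the coverage check still reports that the expiry rule has no vector, which is the kind of gap a cross-reference table would surface.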

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for minor revision. We address each major comment below.

Referee: [Evaluation section] The abstract describes an LLM-driven adversarial evaluation, but the paper reports no quantitative results, error analysis, or attack-success-rate metrics. This evidence is load-bearing for assessing whether the three mechanisms meaningfully reduce the trust gap.

Authors: We agree that quantitative results are needed to substantiate the evaluation. The current manuscript describes the setup and qualitative outcomes of the LLM-driven adversarial evaluation but does not report numerical attack success rates or error analysis. In revision we will expand the Evaluation section with concrete metrics (pre- and post-mechanism attack success rates, false-positive rates on legitimate calls, and breakdown by attack class) drawn from the existing evaluation runs.
revision: yes

Referee: [§ Security analysis] The central claim that the mechanisms close the MCP trust gap rests on hosts correctly pinning trust roots and servers publishing valid clearance documents. The paper notes that unextended hosts revert to the unauthenticated state, but the security analysis should include a concrete test case or failure-mode walkthrough to confirm the claim holds under partial adoption.

Authors: We will add a dedicated failure-mode walkthrough subsection to § Security analysis. It will enumerate three concrete partial-adoption scenarios (all hosts extended, mixed population, and server publishes but no host pins) and show, step by step, that adopting hosts still obtain the attested guarantees while non-adopting hosts fall back exactly to the original unauthenticated MCP behavior, with no new attack surface introduced for either population.
revision: yes
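The partial-adoption argument in the second response reduces to a small decision matrix. The sketch below is a simplification under one loud assumption: an extended host is taken to run in enforcing mode with a correctly pinned root, so the only variables are whether the host is extended and whether the server publishes a valid document. The outcome labels are illustrative, not the paper's terminology.

```python
from itertools import product


def admission_outcome(host_extended: bool, server_publishes: bool) -> str:
    """Assumes an extended host runs in enforcing mode with a pinned root."""
    if not host_extended:
        return "baseline"  # the well-known document is never fetched
    return "attested" if server_publishes else "denied"


matrix = {(h, s): admission_outcome(h, s)
          for h, s in product((True, False), repeat=2)}

for (host_extended, server_publishes), result in matrix.items():
    print(f"extended={host_extended!s:<5} publishes={server_publishes!s:<5} -> {result}")
```

Both rows for the unextended host are identical, which is the backward-compatibility claim in miniature: whether or not the server publishes a document has no effect on a host that never looks for it.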

Circularity Check

0 steps flagged

No circularity; independent protocol design

The paper proposes an additive security extension to MCP consisting of a clearance assertion at a well-known URI, a per-server tool allowlist, and flavor-gated enforcement with audit logging. These are defined via explicit wire formats, verification algorithms, and RFC 2119 normative rules with no equations, fitted parameters, predictions, or self-citations that reduce the central claim to its own inputs. The design is self-contained against the stated assumptions about pinned trust roots and document publication; no load-bearing step collapses by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The design rests on standard cryptographic assumptions for signature verification and introduces new protocol artifacts without fitted parameters or data-driven constants.

axioms (1)

[standard math] Standard cryptographic signature verification against a pinned trust root is reliable.
Invoked for clearance assertion verification before tool dispatch.

invented entities (2)

clearance assertion document (no independent evidence)
purpose: Attests server admission and bounds the allowable tools.
New document format and well-known URI registration defined by the paper.

flavor-gated enforcement mode (no independent evidence)
purpose: Converts checks into hard denials with audit logging.
New enforcement behavior introduced for regulated deployments.

