From Tool Connection to Execution Control: Benchmarking Security Invariants in MCP-Style Agent Runtimes
Pith reviewed 2026-06-30 09:12 UTC · model grok-4.3
The pith
MCP-style agent systems require an execution-control layer with eight invariants beyond connection conventions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MCP-style agent systems need an execution-control layer in addition to connection-layer conventions. The eight invariants, when implemented in the HCP runtime through principals, resources, grants, capabilities, handles, policy decisions, data-pipe checks, and audit entries, block all ten modeled attacks while the connection baselines permit six or ten, and maintain forensic evidence with sub-millisecond operation latencies.
What carries the argument
Eight security invariants—metadata non-authority, grant-backed approval, canonical resources, principal binding, scoped capability invocation, source-and-target data-flow authorization, deny-path audit, and explicit protocol state—enforced via a capability-based runtime model in HCP.
If this is right
- Connection-layer mitigations such as metadata linting, session checks, and per-call approvals permit six of the ten modeled attacks.
- The invariants enable preservation of audit evidence for forensic analysis of blocked attacks.
- Policy, invocation, peek, and pipe operations incur sub-millisecond mean latencies in local in-memory microbenchmarks.
- Ablation studies identify which runtime components block specific attack classes.
Where Pith is reading between the lines
- If adopted, the invariants could reduce dependence on distributed approval dialogs and prompt-level security decisions.
- The benchmark method could be applied to evaluate other agent tool protocols or against additional attack classes.
- Similar execution-control layers might address security gaps in non-MCP tool-use ecosystems.
Load-bearing premise
The ten benchmark cases accurately model the relevant attack surface for MCP-style systems and that blocking them demonstrates the sufficiency of the eight invariants without introducing new vulnerabilities or missing attack classes.
What would settle it
An attack that succeeds against HCP while respecting the eight invariants, or a demonstration that the invariants block valid MCP workflows or create new exploitable paths not captured in the benchmarks.
Figures
read the original abstract
Model Context Protocol (MCP)-style ecosystems give language-model applications a practical connection layer for tools, resources, prompts, and transports. As agents move from connection to execution, security decisions often remain split across clients, servers, prompts, approval dialogs, OAuth deployments, and logs. This paper asks whether a runtime can make execution-layer invariants explicit and testable while preserving MCP-like workflows. We define eight invariants: metadata non-authority, grant-backed approval, canonical resources, principal binding, scoped capability invocation, source-and-target data-flow authorization, deny-path audit, and explicit protocol state. We implement these invariants in HCP, a Handle-Capability Protocol reference runtime for MCP-style agent execution that represents calls through principals, resources, grants, capabilities, handles, policy decisions, data-pipe checks, and audit entries. We evaluate HCP against two MCP-like baselines: a naive connection-layer runtime and a practice-informed connection-layer mitigation baseline with metadata linting, session checks, and per-call approvals. Across 10 benchmark cases, the naive baseline permits all modeled attacks, the mitigation baseline permits 6 of 10, and HCP blocks all 10 while preserving audit evidence. Ablations identify which runtime components block attacks and preserve forensic evidence. A local in-memory microbenchmark reports sub-millisecond mean latencies for measured policy, invocation, peek, and pipe operations. A bounded GitHub README-screening sample provides ecosystem signals, not vulnerability findings. The results support a narrow claim: MCP-style agent systems need an execution-control layer in addition to connection-layer conventions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that MCP-style agent systems require an execution-control layer beyond connection-layer conventions. It defines eight invariants (metadata non-authority, grant-backed approval, canonical resources, principal binding, scoped capability invocation, source-and-target data-flow authorization, deny-path audit, and explicit protocol state), implements them in the HCP reference runtime, and evaluates against a naive baseline and a mitigation baseline across 10 benchmark cases. HCP blocks all 10 attacks while the naive baseline permits all and the mitigation baseline permits 6; ablations identify contributing components, audit evidence is preserved, and microbenchmarks show sub-millisecond latencies.
Significance. If the benchmark results hold, the work supplies a concrete, testable set of invariants together with a reference implementation that isolates the gap between connection and execution security. The ablations and explicit audit preservation are strengths that allow readers to trace which runtime features block attacks and retain forensic value. This could inform the design of future MCP-style agent frameworks by demonstrating the insufficiency of connection-layer mitigations alone.
major comments (2)
- [§5 (Benchmark Evaluation)] §5 (Benchmark Evaluation): The central claim that the eight invariants suffice rests on HCP blocking all 10 modeled attacks while baselines permit 6–10. The paper reports ablations but provides no systematic derivation or completeness argument showing that the 10 cases exhaust relevant threats (OAuth misuse, resource poisoning, capability confusion, data-flow violations) or that HCP introduces no new attack surface; without this, the necessity/sufficiency conclusion remains unsupported.
- [Threat Model and Benchmark Selection] Threat Model and Benchmark Selection: The weakest assumption—that the 10 cases accurately model the full MCP-style attack surface—is load-bearing for the recommendation of an execution-control layer. A more explicit justification of case selection criteria and an argument against missing classes would be required to make the empirical comparison decisive.
minor comments (2)
- [Abstract] Abstract: the description of the 'practice-informed connection-layer mitigation baseline' is brief; a short enumeration of the specific linting, session, and approval mechanisms would improve clarity for readers unfamiliar with current MCP deployments.
- [Introduction / HCP Design] Notation: the mapping from the eight invariants to the concrete HCP components (principals, grants, handles, policy decisions, data-pipe checks) is introduced in the abstract but would benefit from an early table or diagram in the main text.
Simulated Author's Rebuttal
We thank the referee for highlighting the need for stronger justification around benchmark selection and threat model scope. Our responses below address each major comment directly. We maintain that the paper's core contribution is a narrow empirical demonstration of the gap between connection-layer and execution-layer controls, rather than a claim of exhaustive coverage.
read point-by-point responses
-
Referee: [§5 (Benchmark Evaluation)] §5 (Benchmark Evaluation): The central claim that the eight invariants suffice rests on HCP blocking all 10 modeled attacks while baselines permit 6–10. The paper reports ablations but provides no systematic derivation or completeness argument showing that the 10 cases exhaust relevant threats (OAuth misuse, resource poisoning, capability confusion, data-flow violations) or that HCP introduces no new attack surface; without this, the necessity/sufficiency conclusion remains unsupported.
Authors: We agree the manuscript does not contain a formal completeness proof or exhaustive enumeration of the threat space. The eight invariants are motivated by observed failure modes in MCP-style systems, and the evaluation shows they block the ten concrete attacks while connection-layer baselines do not. The abstract and conclusion explicitly frame the result as supporting only a narrow claim: that an execution-control layer is required in addition to connection conventions. We will revise §5 to add an explicit subsection on benchmark construction, mapping each of the ten cases to the categories listed by the referee and to the eight invariants. We will also add a short discussion of design choices intended to avoid introducing new surfaces (e.g., explicit protocol state and deny-path audit). These changes constitute a partial revision; a full formal argument that the cases are exhaustive or that HCP is free of new surfaces lies outside the paper's scope. revision: partial
-
Referee: [Threat Model and Benchmark Selection] Threat Model and Benchmark Selection: The weakest assumption—that the 10 cases accurately model the full MCP-style attack surface—is load-bearing for the recommendation of an execution-control layer. A more explicit justification of case selection criteria and an argument against missing classes would be required to make the empirical comparison decisive.
Authors: The ten cases were chosen to exercise each of the eight invariants at least once, drawing from documented patterns in OAuth token handling, resource naming collisions, capability delegation, and cross-component data flows. We will expand the threat-model section with a table that states the selection criteria (coverage of the invariants, representativeness of published MCP/OAuth issues, and feasibility of implementation in the three runtimes) and briefly notes why certain other classes (e.g., side-channel timing or prompt-injection variants) were omitted as orthogonal to the execution-layer focus. We do not claim the set is complete; the recommendation for an execution layer rests on the consistent gap observed across the modeled cases, not on universality. This justification will be added in revision. revision: partial
- A systematic derivation proving that the ten cases exhaust the relevant threat classes or that HCP introduces no new attack surface.
Circularity Check
No significant circularity; evaluation uses direct implementation comparisons on author-defined cases
full rationale
The paper defines eight invariants, implements them in HCP, and reports that HCP blocks all 10 benchmark attacks while baselines permit 6-10. These benchmarks are constructed around the invariants, but the result is a standard engineering validation rather than a self-referential fit or definition that reduces the central claim to its inputs by construction. No self-citations are load-bearing, no parameters are fitted then renamed as predictions, and no uniqueness theorems or ansatzes are smuggled in. The narrow claim of needing an execution-control layer rests on observable blocking counts, which are externally falsifiable via re-implementation even if the threat-model completeness is debatable.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The eight listed invariants capture the essential security properties for MCP-style agent execution.
invented entities (1)
-
Handle-Capability Protocol (HCP) runtime
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Specification - model context protocol,
Model Context Protocol, “Specification - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/specification/2025-11-25
2025
-
[2]
Tools - model context protocol,
——, “Tools - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https: //modelcontextprotocol.io/specification/2025-11-25/server/tools 14
2025
-
[3]
Authorization - model context protocol,
——, “Authorization - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization
2025
-
[4]
Transports - model context protocol,
——, “Transports - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https: //modelcontextprotocol.io/specification/2025-11-25/basic/transports
2025
-
[5]
Security best practices - model context protocol,
——, “Security best practices - model context protocol,” 2026, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/docs/tutorials/security/security best practices
2026
-
[6]
Configure MCP servers for your repository,
GitHub Docs, “Configure MCP servers for your repository,” 2026, gitHub Docs; accessed 2026-06-
2026
-
[7]
Available: https://docs.github.com/enterprise-cloud@latest/copilot/how-tos/copilot-on-github/ customize-copilot/customize-cloud-agent/extend-cloud-agent-with-mcp
[Online]. Available: https://docs.github.com/enterprise-cloud@latest/copilot/how-tos/copilot-on-github/ customize-copilot/customize-cloud-agent/extend-cloud-agent-with-mcp
-
[8]
Resource indicators for OAuth 2.0,
B. Campbell, J. Bradley, and H. Tschofenig, “Resource indicators for OAuth 2.0,” RFC 8707, 2020. [Online]. Available: https://www.rfc-editor.org/info/rfc8707/
2020
-
[9]
Best current practice for OAuth 2.0 security,
T. Lodderstedt, J. Bradley, A. Labunets, and D. Fett, “Best current practice for OAuth 2.0 security,” RFC 9700, 2025. [Online]. Available: https://www.rfc-editor.org/info/rfc9700/
2025
-
[10]
The confused deputy: (or why capabilities might have been invented),
N. Hardy, “The confused deputy: (or why capabilities might have been invented),”ACM SIGOPS Operating Systems Review, vol. 22, no. 4, pp. 36–38, 1988. [Online]. Available: https://dl.acm.org/doi/10.1145/54289. 871709
-
[11]
On access control, capabilities, their equivalence, and confused deputy attacks,
V . Rajani, D. Garg, and T. Rezk, “On access control, capabilities, their equivalence, and confused deputy attacks,” in2016 IEEE 29th Computer Security Foundations Symposium (CSF), 2016, pp. 150–163. [Online]. Available: https://doi.org/10.1109/CSF.2016.18
-
[12]
Communications of the ACM , volume =
D. E. Denning, “A lattice model of secure information flow,”Communications of the ACM, vol. 19, no. 5, pp. 236–243, 1976. [Online]. Available: https://dl.acm.org/doi/10.1145/360051.360056
-
[13]
Jflow: Practical mostly-static information flow control,
A. C. Myers, “Jflow: Practical mostly-static information flow control,” inProceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1999, pp. 228–241. [Online]. Available: https://dl.acm.org/doi/10.1145/292540.292561
-
[14]
PROV-DM: The PROV data model,
W3C, “PROV-DM: The PROV data model,” 2013, w3C Recommendation; accessed 2026-06-22. [Online]. Available: https://www.w3.org/TR/prov-dm/
2013
-
[15]
Provenance-aware storage systems,
K.-K. Muniswamy-Reddy, D. A. Holland, U. Braun, and M. Seltzer, “Provenance-aware storage systems,” inProceedings of the 2006 USENIX Annual Technical Conference, 2006. [Online]. Available: https://www.usenix.org/event/usenix06/tech/full papers/muniswamy-reddy/muniswamy-reddy.pdf
2006
-
[16]
Certificate transparency version 2.0,
B. Laurie, E. Messeri, and R. Stradling, “Certificate transparency version 2.0,” RFC 9162, 2021. [Online]. Available: https://www.rfc-editor.org/info/rfc9162/
2021
-
[17]
C. Huang, X. Huang, N. P. Tran, and A. M. Fard, “Model context protocol threat modeling and analyzing vulnerabilities to prompt injection with tool poisoning,” 2026. [Online]. Available: https://arxiv.org/abs/2603.22489
-
[18]
MCPTox: A benchmark for tool poisoning on real-world MCP servers,
Z. Wang, Y . Gao, Y . Wang, S. Liu, H. Sun, H. Cheng, G. Shi, H. Du, and X. Li, “MCPTox: A benchmark for tool poisoning on real-world MCP servers,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 42, 2026, pp. 35 811–35 819. [Online]. Available: https: //ojs.aaai.org/index.php/AAAI/article/view/40895
2026
-
[19]
MCP security notification: Tool poisoning attacks,
Invariant Labs, “MCP security notification: Tool poisoning attacks,” 2025, practi- tioner security report; accessed 2026-06-22. [Online]. Available: https://invariantlabs.ai/blog/ mcp-security-notification-tool-poisoning-attacks 15
2025
-
[20]
InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,
Q. Zhan, Z. Liang, Z. Ying, and D. Kang, “InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,” inFindings of the Association for Computational Linguistics: ACL 2024, 2024, pp. 10 471–10 506. [Online]. Available: https://aclanthology.org/2024.findings-acl.624/
2024
-
[21]
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
E. Debenedetti, J. Zhang, M. Balunovic, L. Beurer-Kellner, M. Fischer, and F. Tramer, “AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,” inThe Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024, arXiv:2406.13352. [Online]. Available: https://openreview.net/...
work page internal anchor Pith review Pith/arXiv arXiv 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.