From Tool Connection to Execution Control: Benchmarking Security Invariants in MCP-Style Agent Runtimes

Ting Liu

arxiv: 2606.29073 · v1 · pith:UIU77J2Gnew · submitted 2026-06-27 · 💻 cs.CR · cs.AI

From Tool Connection to Execution Control: Benchmarking Security Invariants in MCP-Style Agent Runtimes

Ting Liu This is my paper

Pith reviewed 2026-06-30 09:12 UTC · model grok-4.3

classification 💻 cs.CR cs.AI

keywords Model Context Protocolagent runtime securityexecution controlsecurity invariantscapability-based securitytool callingaudit loggingbenchmark attacks

0 comments

The pith

MCP-style agent systems require an execution-control layer with eight invariants beyond connection conventions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines security in Model Context Protocol style agent systems as they transition from tool connections to actual execution. It proposes eight invariants for execution control, including metadata non-authority and scoped capability invocation, and implements them in a reference runtime called HCP. Testing shows that this approach blocks all ten benchmark attacks, unlike naive and mitigated connection-layer systems which allow most or all. The work claims this demonstrates the need for explicit execution-layer controls alongside existing connection conventions to maintain security and auditability.

Core claim

MCP-style agent systems need an execution-control layer in addition to connection-layer conventions. The eight invariants, when implemented in the HCP runtime through principals, resources, grants, capabilities, handles, policy decisions, data-pipe checks, and audit entries, block all ten modeled attacks while the connection baselines permit six or ten, and maintain forensic evidence with sub-millisecond operation latencies.

What carries the argument

Eight security invariants—metadata non-authority, grant-backed approval, canonical resources, principal binding, scoped capability invocation, source-and-target data-flow authorization, deny-path audit, and explicit protocol state—enforced via a capability-based runtime model in HCP.

If this is right

Connection-layer mitigations such as metadata linting, session checks, and per-call approvals permit six of the ten modeled attacks.
The invariants enable preservation of audit evidence for forensic analysis of blocked attacks.
Policy, invocation, peek, and pipe operations incur sub-millisecond mean latencies in local in-memory microbenchmarks.
Ablation studies identify which runtime components block specific attack classes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If adopted, the invariants could reduce dependence on distributed approval dialogs and prompt-level security decisions.
The benchmark method could be applied to evaluate other agent tool protocols or against additional attack classes.
Similar execution-control layers might address security gaps in non-MCP tool-use ecosystems.

Load-bearing premise

The ten benchmark cases accurately model the relevant attack surface for MCP-style systems and that blocking them demonstrates the sufficiency of the eight invariants without introducing new vulnerabilities or missing attack classes.

What would settle it

An attack that succeeds against HCP while respecting the eight invariants, or a demonstration that the invariants block valid MCP workflows or create new exploitable paths not captured in the benchmarks.

Figures

Figures reproduced from arXiv: 2606.29073 by Ting Liu.

**Figure 2.** Figure 2: Threat model and trust boundaries used by the benchmark. [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Policy decision flow for a capability invocation. [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 4.** Figure 4: DataPipe enforcement checks source ownership and target capability policy before moving data. [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 5.** Figure 5: Reproducible benchmark pipeline from case fixtures to manuscript evidence. [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

read the original abstract

Model Context Protocol (MCP)-style ecosystems give language-model applications a practical connection layer for tools, resources, prompts, and transports. As agents move from connection to execution, security decisions often remain split across clients, servers, prompts, approval dialogs, OAuth deployments, and logs. This paper asks whether a runtime can make execution-layer invariants explicit and testable while preserving MCP-like workflows. We define eight invariants: metadata non-authority, grant-backed approval, canonical resources, principal binding, scoped capability invocation, source-and-target data-flow authorization, deny-path audit, and explicit protocol state. We implement these invariants in HCP, a Handle-Capability Protocol reference runtime for MCP-style agent execution that represents calls through principals, resources, grants, capabilities, handles, policy decisions, data-pipe checks, and audit entries. We evaluate HCP against two MCP-like baselines: a naive connection-layer runtime and a practice-informed connection-layer mitigation baseline with metadata linting, session checks, and per-call approvals. Across 10 benchmark cases, the naive baseline permits all modeled attacks, the mitigation baseline permits 6 of 10, and HCP blocks all 10 while preserving audit evidence. Ablations identify which runtime components block attacks and preserve forensic evidence. A local in-memory microbenchmark reports sub-millisecond mean latencies for measured policy, invocation, peek, and pipe operations. A bounded GitHub README-screening sample provides ecosystem signals, not vulnerability findings. The results support a narrow claim: MCP-style agent systems need an execution-control layer in addition to connection-layer conventions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives eight concrete invariants for MCP agent execution and a runtime that blocks all 10 of its test attacks while baselines do not.

read the letter

The core takeaway is that connection-layer fixes are not enough for MCP-style agents and that explicit execution invariants can be stated and enforced. The paper lists eight of them—metadata non-authority, grant-backed approval, canonical resources, principal binding, scoped capability invocation, source-and-target data-flow authorization, deny-path audit, and explicit protocol state—and implements them in the HCP runtime using principals, grants, capabilities, handles, and policy checks.

What stands out is the direct comparison: the naive baseline allows all 10 modeled attacks, the practice-informed mitigation baseline allows 6, and HCP blocks all 10 while keeping audit records. Ablations show which pieces of the runtime stop which attacks, and the microbenchmark reports sub-millisecond latencies for the main operations. That is more concrete than most security papers at this stage.

The soft spot is the 10 benchmark cases themselves. The claim that an execution layer is required rests on HCP stopping everything in that set. If those cases miss important classes—OAuth misuse patterns, resource poisoning variants, or capability confusion outside the modeled flows—then the necessity argument is only partially supported. The paper does not provide an independent argument that the set is exhaustive, and it is not clear whether HCP itself opens new surfaces. The GitHub README sample is too narrow to help here.

This is for people building or securing agent runtimes who need testable properties rather than high-level advice. A reader working on MCP or similar tool-calling systems will find the invariants and the HCP design useful to examine even if they disagree with the exact eight. The work is coherent on its own terms and shows clear engagement with the practical problem, so it deserves a serious referee rather than a desk reject.

Referee Report

2 major / 2 minor

Summary. The paper claims that MCP-style agent systems require an execution-control layer beyond connection-layer conventions. It defines eight invariants (metadata non-authority, grant-backed approval, canonical resources, principal binding, scoped capability invocation, source-and-target data-flow authorization, deny-path audit, and explicit protocol state), implements them in the HCP reference runtime, and evaluates against a naive baseline and a mitigation baseline across 10 benchmark cases. HCP blocks all 10 attacks while the naive baseline permits all and the mitigation baseline permits 6; ablations identify contributing components, audit evidence is preserved, and microbenchmarks show sub-millisecond latencies.

Significance. If the benchmark results hold, the work supplies a concrete, testable set of invariants together with a reference implementation that isolates the gap between connection and execution security. The ablations and explicit audit preservation are strengths that allow readers to trace which runtime features block attacks and retain forensic value. This could inform the design of future MCP-style agent frameworks by demonstrating the insufficiency of connection-layer mitigations alone.

major comments (2)

[§5 (Benchmark Evaluation)] §5 (Benchmark Evaluation): The central claim that the eight invariants suffice rests on HCP blocking all 10 modeled attacks while baselines permit 6–10. The paper reports ablations but provides no systematic derivation or completeness argument showing that the 10 cases exhaust relevant threats (OAuth misuse, resource poisoning, capability confusion, data-flow violations) or that HCP introduces no new attack surface; without this, the necessity/sufficiency conclusion remains unsupported.
[Threat Model and Benchmark Selection] Threat Model and Benchmark Selection: The weakest assumption—that the 10 cases accurately model the full MCP-style attack surface—is load-bearing for the recommendation of an execution-control layer. A more explicit justification of case selection criteria and an argument against missing classes would be required to make the empirical comparison decisive.

minor comments (2)

[Abstract] Abstract: the description of the 'practice-informed connection-layer mitigation baseline' is brief; a short enumeration of the specific linting, session, and approval mechanisms would improve clarity for readers unfamiliar with current MCP deployments.
[Introduction / HCP Design] Notation: the mapping from the eight invariants to the concrete HCP components (principals, grants, handles, policy decisions, data-pipe checks) is introduced in the abstract but would benefit from an early table or diagram in the main text.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for highlighting the need for stronger justification around benchmark selection and threat model scope. Our responses below address each major comment directly. We maintain that the paper's core contribution is a narrow empirical demonstration of the gap between connection-layer and execution-layer controls, rather than a claim of exhaustive coverage.

read point-by-point responses

Referee: [§5 (Benchmark Evaluation)] §5 (Benchmark Evaluation): The central claim that the eight invariants suffice rests on HCP blocking all 10 modeled attacks while baselines permit 6–10. The paper reports ablations but provides no systematic derivation or completeness argument showing that the 10 cases exhaust relevant threats (OAuth misuse, resource poisoning, capability confusion, data-flow violations) or that HCP introduces no new attack surface; without this, the necessity/sufficiency conclusion remains unsupported.

Authors: We agree the manuscript does not contain a formal completeness proof or exhaustive enumeration of the threat space. The eight invariants are motivated by observed failure modes in MCP-style systems, and the evaluation shows they block the ten concrete attacks while connection-layer baselines do not. The abstract and conclusion explicitly frame the result as supporting only a narrow claim: that an execution-control layer is required in addition to connection conventions. We will revise §5 to add an explicit subsection on benchmark construction, mapping each of the ten cases to the categories listed by the referee and to the eight invariants. We will also add a short discussion of design choices intended to avoid introducing new surfaces (e.g., explicit protocol state and deny-path audit). These changes constitute a partial revision; a full formal argument that the cases are exhaustive or that HCP is free of new surfaces lies outside the paper's scope. revision: partial
Referee: [Threat Model and Benchmark Selection] Threat Model and Benchmark Selection: The weakest assumption—that the 10 cases accurately model the full MCP-style attack surface—is load-bearing for the recommendation of an execution-control layer. A more explicit justification of case selection criteria and an argument against missing classes would be required to make the empirical comparison decisive.

Authors: The ten cases were chosen to exercise each of the eight invariants at least once, drawing from documented patterns in OAuth token handling, resource naming collisions, capability delegation, and cross-component data flows. We will expand the threat-model section with a table that states the selection criteria (coverage of the invariants, representativeness of published MCP/OAuth issues, and feasibility of implementation in the three runtimes) and briefly notes why certain other classes (e.g., side-channel timing or prompt-injection variants) were omitted as orthogonal to the execution-layer focus. We do not claim the set is complete; the recommendation for an execution layer rests on the consistent gap observed across the modeled cases, not on universality. This justification will be added in revision. revision: partial

standing simulated objections not resolved

A systematic derivation proving that the ten cases exhaust the relevant threat classes or that HCP introduces no new attack surface.

Circularity Check

0 steps flagged

No significant circularity; evaluation uses direct implementation comparisons on author-defined cases

full rationale

The paper defines eight invariants, implements them in HCP, and reports that HCP blocks all 10 benchmark attacks while baselines permit 6-10. These benchmarks are constructed around the invariants, but the result is a standard engineering validation rather than a self-referential fit or definition that reduces the central claim to its inputs by construction. No self-citations are load-bearing, no parameters are fitted then renamed as predictions, and no uniqueness theorems or ansatzes are smuggled in. The narrow claim of needing an execution-control layer rests on observable blocking counts, which are externally falsifiable via re-implementation even if the threat-model completeness is debatable.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on domain assumptions about the completeness of the eight invariants and the representativeness of the ten attack cases rather than on prior formal results or external benchmarks. The HCP runtime is introduced as a new artifact to make the invariants operational.

axioms (1)

domain assumption The eight listed invariants capture the essential security properties for MCP-style agent execution.
These invariants are defined by the paper and used as the basis for both the HCP design and the benchmark evaluation.

invented entities (1)

Handle-Capability Protocol (HCP) runtime no independent evidence
purpose: Reference implementation that makes the eight invariants explicit and testable while preserving MCP-like workflows.
HCP is newly introduced in the paper as the vehicle for enforcing the invariants through principals, resources, grants, capabilities, handles, policy decisions, data-pipe checks, and audit entries.

pith-pipeline@v0.9.1-grok · 7387 in / 1461 out tokens · 87455 ms · 2026-06-30T09:12:35.399916+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

21 extracted references · 6 canonical work pages · 1 internal anchor

[1]

Specification - model context protocol,

Model Context Protocol, “Specification - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/specification/2025-11-25

2025
[2]

Tools - model context protocol,

——, “Tools - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https: //modelcontextprotocol.io/specification/2025-11-25/server/tools 14

2025
[3]

Authorization - model context protocol,

——, “Authorization - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization

2025
[4]

Transports - model context protocol,

——, “Transports - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https: //modelcontextprotocol.io/specification/2025-11-25/basic/transports

2025
[5]

Security best practices - model context protocol,

——, “Security best practices - model context protocol,” 2026, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/docs/tutorials/security/security best practices

2026
[6]

Configure MCP servers for your repository,

GitHub Docs, “Configure MCP servers for your repository,” 2026, gitHub Docs; accessed 2026-06-

2026
[7]

Available: https://docs.github.com/enterprise-cloud@latest/copilot/how-tos/copilot-on-github/ customize-copilot/customize-cloud-agent/extend-cloud-agent-with-mcp

[Online]. Available: https://docs.github.com/enterprise-cloud@latest/copilot/how-tos/copilot-on-github/ customize-copilot/customize-cloud-agent/extend-cloud-agent-with-mcp
[8]

Resource indicators for OAuth 2.0,

B. Campbell, J. Bradley, and H. Tschofenig, “Resource indicators for OAuth 2.0,” RFC 8707, 2020. [Online]. Available: https://www.rfc-editor.org/info/rfc8707/

2020
[9]

Best current practice for OAuth 2.0 security,

T. Lodderstedt, J. Bradley, A. Labunets, and D. Fett, “Best current practice for OAuth 2.0 security,” RFC 9700, 2025. [Online]. Available: https://www.rfc-editor.org/info/rfc9700/

2025
[10]

The confused deputy: (or why capabilities might have been invented),

N. Hardy, “The confused deputy: (or why capabilities might have been invented),”ACM SIGOPS Operating Systems Review, vol. 22, no. 4, pp. 36–38, 1988. [Online]. Available: https://dl.acm.org/doi/10.1145/54289. 871709

work page doi:10.1145/54289 1988
[11]

On access control, capabilities, their equivalence, and confused deputy attacks,

V . Rajani, D. Garg, and T. Rezk, “On access control, capabilities, their equivalence, and confused deputy attacks,” in2016 IEEE 29th Computer Security Foundations Symposium (CSF), 2016, pp. 150–163. [Online]. Available: https://doi.org/10.1109/CSF.2016.18

work page doi:10.1109/csf.2016.18 2016
[12]

Communications of the ACM , volume =

D. E. Denning, “A lattice model of secure information flow,”Communications of the ACM, vol. 19, no. 5, pp. 236–243, 1976. [Online]. Available: https://dl.acm.org/doi/10.1145/360051.360056

work page doi:10.1145/360051.360056 1976
[13]

Jflow: Practical mostly-static information flow control,

A. C. Myers, “Jflow: Practical mostly-static information flow control,” inProceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1999, pp. 228–241. [Online]. Available: https://dl.acm.org/doi/10.1145/292540.292561

work page doi:10.1145/292540.292561 1999
[14]

PROV-DM: The PROV data model,

W3C, “PROV-DM: The PROV data model,” 2013, w3C Recommendation; accessed 2026-06-22. [Online]. Available: https://www.w3.org/TR/prov-dm/

2013
[15]

Provenance-aware storage systems,

K.-K. Muniswamy-Reddy, D. A. Holland, U. Braun, and M. Seltzer, “Provenance-aware storage systems,” inProceedings of the 2006 USENIX Annual Technical Conference, 2006. [Online]. Available: https://www.usenix.org/event/usenix06/tech/full papers/muniswamy-reddy/muniswamy-reddy.pdf

2006
[16]

Certificate transparency version 2.0,

B. Laurie, E. Messeri, and R. Stradling, “Certificate transparency version 2.0,” RFC 9162, 2021. [Online]. Available: https://www.rfc-editor.org/info/rfc9162/

2021
[17]

Model context protocol threat modeling and analyzing vulnerabilities to prompt injection with tool poisoning,

C. Huang, X. Huang, N. P. Tran, and A. M. Fard, “Model context protocol threat modeling and analyzing vulnerabilities to prompt injection with tool poisoning,” 2026. [Online]. Available: https://arxiv.org/abs/2603.22489

work page arXiv 2026
[18]

MCPTox: A benchmark for tool poisoning on real-world MCP servers,

Z. Wang, Y . Gao, Y . Wang, S. Liu, H. Sun, H. Cheng, G. Shi, H. Du, and X. Li, “MCPTox: A benchmark for tool poisoning on real-world MCP servers,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 42, 2026, pp. 35 811–35 819. [Online]. Available: https: //ojs.aaai.org/index.php/AAAI/article/view/40895

2026
[19]

MCP security notification: Tool poisoning attacks,

Invariant Labs, “MCP security notification: Tool poisoning attacks,” 2025, practi- tioner security report; accessed 2026-06-22. [Online]. Available: https://invariantlabs.ai/blog/ mcp-security-notification-tool-poisoning-attacks 15

2025
[20]

InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,

Q. Zhan, Z. Liang, Z. Ying, and D. Kang, “InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,” inFindings of the Association for Computational Linguistics: ACL 2024, 2024, pp. 10 471–10 506. [Online]. Available: https://aclanthology.org/2024.findings-acl.624/

2024
[21]

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

E. Debenedetti, J. Zhang, M. Balunovic, L. Beurer-Kellner, M. Fischer, and F. Tramer, “AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,” inThe Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024, arXiv:2406.13352. [Online]. Available: https://openreview.net/...

work page internal anchor Pith review Pith/arXiv arXiv 2024

[1] [1]

Specification - model context protocol,

Model Context Protocol, “Specification - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/specification/2025-11-25

2025

[2] [2]

Tools - model context protocol,

——, “Tools - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https: //modelcontextprotocol.io/specification/2025-11-25/server/tools 14

2025

[3] [3]

Authorization - model context protocol,

——, “Authorization - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization

2025

[4] [4]

Transports - model context protocol,

——, “Transports - model context protocol,” 2025, accessed 2026-06-22. [Online]. Available: https: //modelcontextprotocol.io/specification/2025-11-25/basic/transports

2025

[5] [5]

Security best practices - model context protocol,

——, “Security best practices - model context protocol,” 2026, accessed 2026-06-22. [Online]. Available: https://modelcontextprotocol.io/docs/tutorials/security/security best practices

2026

[6] [6]

Configure MCP servers for your repository,

GitHub Docs, “Configure MCP servers for your repository,” 2026, gitHub Docs; accessed 2026-06-

2026

[7] [7]

Available: https://docs.github.com/enterprise-cloud@latest/copilot/how-tos/copilot-on-github/ customize-copilot/customize-cloud-agent/extend-cloud-agent-with-mcp

[Online]. Available: https://docs.github.com/enterprise-cloud@latest/copilot/how-tos/copilot-on-github/ customize-copilot/customize-cloud-agent/extend-cloud-agent-with-mcp

[8] [8]

Resource indicators for OAuth 2.0,

B. Campbell, J. Bradley, and H. Tschofenig, “Resource indicators for OAuth 2.0,” RFC 8707, 2020. [Online]. Available: https://www.rfc-editor.org/info/rfc8707/

2020

[9] [9]

Best current practice for OAuth 2.0 security,

T. Lodderstedt, J. Bradley, A. Labunets, and D. Fett, “Best current practice for OAuth 2.0 security,” RFC 9700, 2025. [Online]. Available: https://www.rfc-editor.org/info/rfc9700/

2025

[10] [10]

The confused deputy: (or why capabilities might have been invented),

N. Hardy, “The confused deputy: (or why capabilities might have been invented),”ACM SIGOPS Operating Systems Review, vol. 22, no. 4, pp. 36–38, 1988. [Online]. Available: https://dl.acm.org/doi/10.1145/54289. 871709

work page doi:10.1145/54289 1988

[11] [11]

On access control, capabilities, their equivalence, and confused deputy attacks,

V . Rajani, D. Garg, and T. Rezk, “On access control, capabilities, their equivalence, and confused deputy attacks,” in2016 IEEE 29th Computer Security Foundations Symposium (CSF), 2016, pp. 150–163. [Online]. Available: https://doi.org/10.1109/CSF.2016.18

work page doi:10.1109/csf.2016.18 2016

[12] [12]

Communications of the ACM , volume =

D. E. Denning, “A lattice model of secure information flow,”Communications of the ACM, vol. 19, no. 5, pp. 236–243, 1976. [Online]. Available: https://dl.acm.org/doi/10.1145/360051.360056

work page doi:10.1145/360051.360056 1976

[13] [13]

Jflow: Practical mostly-static information flow control,

A. C. Myers, “Jflow: Practical mostly-static information flow control,” inProceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 1999, pp. 228–241. [Online]. Available: https://dl.acm.org/doi/10.1145/292540.292561

work page doi:10.1145/292540.292561 1999

[14] [14]

PROV-DM: The PROV data model,

W3C, “PROV-DM: The PROV data model,” 2013, w3C Recommendation; accessed 2026-06-22. [Online]. Available: https://www.w3.org/TR/prov-dm/

2013

[15] [15]

Provenance-aware storage systems,

K.-K. Muniswamy-Reddy, D. A. Holland, U. Braun, and M. Seltzer, “Provenance-aware storage systems,” inProceedings of the 2006 USENIX Annual Technical Conference, 2006. [Online]. Available: https://www.usenix.org/event/usenix06/tech/full papers/muniswamy-reddy/muniswamy-reddy.pdf

2006

[16] [16]

Certificate transparency version 2.0,

B. Laurie, E. Messeri, and R. Stradling, “Certificate transparency version 2.0,” RFC 9162, 2021. [Online]. Available: https://www.rfc-editor.org/info/rfc9162/

2021

[17] [17]

Model context protocol threat modeling and analyzing vulnerabilities to prompt injection with tool poisoning,

C. Huang, X. Huang, N. P. Tran, and A. M. Fard, “Model context protocol threat modeling and analyzing vulnerabilities to prompt injection with tool poisoning,” 2026. [Online]. Available: https://arxiv.org/abs/2603.22489

work page arXiv 2026

[18] [18]

MCPTox: A benchmark for tool poisoning on real-world MCP servers,

Z. Wang, Y . Gao, Y . Wang, S. Liu, H. Sun, H. Cheng, G. Shi, H. Du, and X. Li, “MCPTox: A benchmark for tool poisoning on real-world MCP servers,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 42, 2026, pp. 35 811–35 819. [Online]. Available: https: //ojs.aaai.org/index.php/AAAI/article/view/40895

2026

[19] [19]

MCP security notification: Tool poisoning attacks,

Invariant Labs, “MCP security notification: Tool poisoning attacks,” 2025, practi- tioner security report; accessed 2026-06-22. [Online]. Available: https://invariantlabs.ai/blog/ mcp-security-notification-tool-poisoning-attacks 15

2025

[20] [20]

InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,

Q. Zhan, Z. Liang, Z. Ying, and D. Kang, “InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents,” inFindings of the Association for Computational Linguistics: ACL 2024, 2024, pp. 10 471–10 506. [Online]. Available: https://aclanthology.org/2024.findings-acl.624/

2024

[21] [21]

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

E. Debenedetti, J. Zhang, M. Balunovic, L. Beurer-Kellner, M. Fischer, and F. Tramer, “AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents,” inThe Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024, arXiv:2406.13352. [Online]. Available: https://openreview.net/...

work page internal anchor Pith review Pith/arXiv arXiv 2024