MCPXKIT: The Unified Toolkit for Analyzing Model Context Protocol Security
The Model Context Protocol (MCP) has emerged as a universal standard that enables AI agents to seamlessly connect with external tools, significantly enhancing their functionality. However, while MCP brings notable benefits, it also introduces serious vulnerabilities, such as Tool Poisoning Attacks (TPA), where hidden malicious instructions exploit the sycophancy of large language models (LLMs) to manipulate agent behavior. Despite these risks, current academic research on MCP security remains limited, with most studies focusing on narrow or qualitative analyses that fail to capture the diversity of real-world threats. To address this gap, we present the MCP eXploit Toolkit (MCPXKIT), which categorizes and implements 31 distinct attack methods under four key classifications: direct tool injection, indirect tool injection, malicious user attacks, and LLM inherent attacks. We further conduct a quantitative analysis of the efficacy of each attack. Our experiments reveal key insights into MCP vulnerabilities, including agents' blind reliance on tool descriptions, sensitivity to file-based attacks, chain attacks exploiting shared context, and difficulty distinguishing external data from executable commands. These insights, validated through attack experiments, underscore the urgent need for robust defense strategies and informed MCP design. Our contributions are: (1) constructing a comprehensive MCP attack taxonomy, (2) introducing a unified attack framework, MCPXKIT, and (3) conducting an empirical vulnerability analysis to enhance MCP security mechanisms. This work provides a foundational framework that supports the secure evolution of MCP ecosystems.
Forward citations
Cited by 13 Pith papers
- A First Measurement Study on Authentication Security in Real-World Remote MCP Servers
  First measurement study of 7,973 remote MCP servers finds 40.55% lack authentication and all 119 tested OAuth servers have flaws that risk data leaks or account takeover.
- Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem
  This paper defines a new Parasitic Toolchain Attack pattern (MCP-UPD) that assembles legitimate tools into privacy-exfiltrating workflows and reports the first large-scale scan of 12230 MCP tools across 1360 servers r...
- ShareLock: A Stealthy Multi-Tool Threshold Poisoning Attack Against MCP
  ShareLock applies Shamir's threshold scheme to distribute poisoning payloads across multiple MCP tool descriptions, achieving information-theoretic secrecy and over 90% average attack success rate in multi-tool scenarios.
- "What Happens Locally, Leaks Globally": Detecting Privacy Leakage Risks in MCP Servers
  MCPPrivacyDetector applies cross-language taint analysis to detect protocol-induced privacy leaks in MCP servers, reporting >10% leakage rate across 10,655 real-world instances.
- A Taxonomy of Runtime Faults in Model Context Protocol Servers
  An empirical taxonomy of 11 top-level categories and 27 subcategories of runtime faults in MCP servers, derived via open coding of GitHub threads and validated by a survey of 55 developers.
- Sealing the Audit-Runtime Gap for LLM Skills
  SIGIL cryptographically seals the audit-runtime gap for LLM skills via an on-chain registry with four publication types, DAO vetting, and a runtime verification loader that enforces integrity and permissions.
- MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security
  MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly tool-centric protections and gaps at orchestration, transport, and supply-chain layers.
- From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers
  Presents a component-centric PoC dataset of malicious MCP servers and a two-stage behavioral deviation detector Connor achieving 94.6% F1-score.
- What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems
  Formalizes stored prompt injection in agentic systems, develops a taxonomy and benchmark to show how adversarial prompts can persist across sessions via persistent state artifacts.
- Machine Learning-Based Detection of MCP Attacks
  Supervised ML models including SVC and BERT achieve 100% F1 on binary malicious/benign MCP tool detection and up to 90.56% on multiclass attack typing, outperforming rule-based baselines.
- Lingering Authority: Revocable Resource-and-Effect Capabilities for Coding Agents
  PORTICO is a revocable capability reference monitor for coding agents that enforces task contracts via grant-invoke-closure lifecycles and rejects post-closure reuses while preserving task success.
- Think Twice Before You Act: Protecting LLM Agents Against Tool Description Poisoning via Isolated Planning
  Tool-Guard uses isolated planning to quarantine suspicious tools, reducing tool description poisoning attacks on LLM agents while preserving task utility on AgentDojo and ASB benchmarks.
- Securing the AI Agent: A Unified Framework for Multi-Layer Agent Red Teaming
  AI-Infra-Guard is an open-source multi-layer red-teaming framework that pairs deterministic rules, LLM auditing, black-box testing, and jailbreak harnesses with the infrastructure, protocol, behavior, and model layers...