pith. sign in

arxiv: 2508.12538 · v2 · pith:UZZMZVEYnew · submitted 2025-08-18 · 💻 cs.CR · cs.AI· cs.SE

MCPXKIT: The Unified Toolkit for Analyzing Model Context Protocol Security

classification 💻 cs.CR cs.AIcs.SE
keywords attackattackstoolcontextmcpxkitsecurityagentsanalysis
0
0 comments X
read the original abstract

The Model Context Protocol (MCP) has emerged as a universal standard that enables AI agents to seamlessly connect with external tools, significantly enhancing their functionality. However, while MCP brings notable benefits, it also introduces significant vulnerabilities, such as Tool Poisoning Attacks (TPA), where hidden malicious instructions exploit the sycophancy of large language models (LLMs) to manipulate agent behavior. Despite these risks, current academic research on MCP security remains limited, with most studies focusing on narrow or qualitative analyses that fail to capture the diversity of real-world threats. To address this gap, we present the MCP eXploit Toolkit (MCPXKIT), which categorizes and implements 31 distinct attack methods under four key classifications: direct tool injection, indirect tool injection, malicious user attacks, and LLM inherent attack. We further conduct a quantitative analysis of the efficacy of each attack. Our experiments reveal key insights into MCP vulnerabilities, including agents' blind reliance on tool descriptions, sensitivity to file-based attacks, chain attacks exploiting shared context, and difficulty distinguishing external data from executable commands. These insights, validated through attack experiments, underscore the urgency for robust defense strategies and informed MCP design. Our contributions include 1) constructing a comprehensive MCP attack taxonomy, 2) introducing a unified attack framework, MCPXKIT, and 3) conducting empirical vulnerability analysis to enhance MCP security mechanisms. This work provides a foundational framework, supporting the secure evolution of MCP ecosystems.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A First Measurement Study on Authentication Security in Real-World Remote MCP Servers

    cs.CR 2026-05 conditional novelty 8.0

    First measurement study of 7,973 remote MCP servers finds 40.55% lack authentication and all 119 tested OAuth servers have flaws that risk data leaks or account takeover.

  2. Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem

    cs.CR 2025-09 unverdicted novelty 8.0

    This paper defines a new Parasitic Toolchain Attack pattern (MCP-UPD) that assembles legitimate tools into privacy-exfiltrating workflows and reports the first large-scale scan of 12230 MCP tools across 1360 servers r...

  3. Sealing the Audit-Runtime Gap for LLM Skills

    cs.CR 2026-05 unverdicted novelty 7.0

    SIGIL cryptographically seals the audit-runtime gap for LLM skills via an on-chain registry with four publication types, DAO vetting, and a runtime verification loader that enforces integrity and permissions.

  4. MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security

    cs.CR 2026-04 conditional novelty 7.0

    MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly tool-centric protections and gaps at orchestration, transport, and supply-chain layers.

  5. From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

    cs.CR 2026-04 unverdicted novelty 7.0

    Presents a component-centric PoC dataset of malicious MCP servers and a two-stage behavioral deviation detector Connor achieving 94.6% F1-score.

  6. Machine Learning-Based Detection of MCP Attacks

    cs.CR 2026-04 unverdicted novelty 6.0

    Supervised ML models including SVC and BERT achieve 100% F1 on binary malicious/benign MCP tool detection and up to 90.56% on multiclass attack typing, outperforming rule-based baselines.