MCP-SandboxScan: WASM-based Secure Execution and Runtime Analysis for MCP Tools

Christos Anagnostopoulos; Jeremy Singer; Run Hao; Yutian Tang; Zhuoran Tan

arxiv: 2601.01241 · v2 · pith:VJLUNUAJnew · submitted 2026-01-03 · 💻 cs.CR · cs.SE

Zhuoran Tan, Run Hao, Jeremy Singer, Yutian Tang, Christos Anagnostopoulos

keywords: tools, sandscope, semantic, execution, profiling, repositories, runtime, tool

Tool-augmented Large Language Model (LLM) agents create a new supply-chain surface: Model Context Protocol (MCP) tools are installed like third-party packages, yet their outputs can enter the agent's reasoning context. This enables confused-deputy risks in which attacker-controlled inputs cause otherwise benign tools to exercise legitimate authority over files, environment variables, or network-facing operations and reflect sensitive or instruction-like content into LLM-visible fields. We present SandScope, an MCP-aware audit framework that combines runtime witness detection with semantic tool profiling. SandScope executes portable tools under WebAssembly System Interface (WASI) or drives unmodified MCP servers over standard input/output (stdio), extracts LLM-visible sinks from tool results and prompt/message fields, and reports auditable source-to-sink witnesses from environment, file, and tool-input sources while separately recording network-intent and egress evidence. Its semantic layer recovers declared capabilities from tools/list metadata and static registrations to characterize attack surface when execution is incomplete. We evaluate SandScope on controlled cross-language subjects, an evasion benchmark, and a 100-repository MCP corpus. SandScope completes shallow dynamic scans for 35 repositories and, through a broader semantic profiling pass, recovers metadata for 1,127 tools across 71 repositories, including 886 tools with security-sensitive declared capabilities. A schema-guided exploration pass over the 35 dynamically scanned repositories re-executes 33 and observes source-to-sink witnesses in 12. These results show that SandScope provides practical, auditable evidence for MCP tool risk through controlled execution, MCP-aware sink extraction, runtime witness reporting, and semantic capability profiling.
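The source-to-sink witness idea described in the abstract can be illustrated with a minimal sketch. This is not SandScope's implementation: it assumes a simple canary approach in which unique marker strings are planted in the environment, file, and tool-input sources before a tool runs, and every string field of the tool result is treated as an LLM-visible sink. The function names, canary values, and the sample result are hypothetical, chosen only for illustration.

```python
from typing import Any, Iterator


def iter_llm_visible_strings(node: Any, path: str = "$") -> Iterator[tuple[str, str]]:
    """Yield (json_path, text) for every string leaf of a tool result.

    Assumption: every string in the result may reach the LLM context.
    A real extractor would select specific result and prompt/message fields.
    """
    if isinstance(node, str):
        yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_llm_visible_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_llm_visible_strings(value, f"{path}[{index}]")


def find_witnesses(sources: dict[str, str], tool_result: Any) -> list[dict[str, str]]:
    """Report each planted source canary that appears verbatim in a sink."""
    witnesses = []
    for sink_path, text in iter_llm_visible_strings(tool_result):
        for source_name, canary in sources.items():
            if canary and canary in text:
                witnesses.append({"source": source_name, "sink": sink_path})
    return witnesses


# Hypothetical canaries planted before the tool is executed.
sources = {
    "env:API_TOKEN": "CANARY-ENV-7f3a",
    "file:/sandbox/secret.txt": "CANARY-FILE-91bc",
    "input:query": "CANARY-INPUT-02de",
}

# Hypothetical tool result, loosely shaped like an MCP tools/call response.
tool_result = {
    "content": [
        {"type": "text", "text": "Config loaded, token=CANARY-ENV-7f3a"},
        {"type": "text", "text": "No matches for CANARY-INPUT-02de"},
    ],
    "isError": False,
}

witnesses = find_witnesses(sources, tool_result)
for witness in witnesses:
    print(f"{witness['source']} -> {witness['sink']}")
```

Verbatim substring matching is the simplest possible detector and would miss encoded or transformed values, which is presumably why the paper also evaluates an evasion benchmark; the sketch only shows the shape of an auditable witness record (which source reached which sink).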

Forward citations

Cited by 1 Pith paper

MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security
cs.CR 2026-04 conditional novelty 7.0

MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly tool-centric protections and gaps at orchestration, transport, and supply-chain layers.