MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

Yang, Yixuan; Wu, Daoyuan; Chen, Yufan · 2025 · arXiv 2508.13220

Contested. 1 Pith paper cites this work to dispute or refute its claims.

13 Pith papers citing it

citation-role summary

background 5

citation-polarity summary

background 3 · contest 1 · unclear 1

representative citing papers

Comment and Control: Hijacking Agentic Workflows via Context-Grounded Evolution

cs.CR · 2026-05-11 · unverdicted · novelty 8.0

JAW uses hybrid program analysis to evolve inputs that hijack agentic workflows, successfully compromising 4714 GitHub workflows and eight n8n templates to enable actions like credential exfiltration.

Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem

cs.CR · 2025-09-08 · unverdicted · novelty 8.0

This paper defines a new Parasitic Toolchain Attack pattern (MCP-UPD) that assembles legitimate tools into privacy-exfiltrating workflows and reports the first large-scale scan of 12230 MCP tools across 1360 servers revealing systemic vulnerabilities from missing isolation and least-privilege in the

Same Payload, Different Channel: Measuring Trust Asymmetry in Tool-Using Language Models

cs.LG · 2026-05-30 · unverdicted · novelty 7.0

Agent-native LLMs are substantially more vulnerable to adversarial instructions arriving in tool descriptions than user messages (with the pattern reversing for general-purpose models and inverting again for tool outputs), as quantified by the new Safety Asymmetry Score across six models and three a

MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security

cs.CR · 2026-04-08 · conditional · novelty 7.0

MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly tool-centric protections and gaps at orchestration, transport, and supply-chain layers.

ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic Systems

cs.AI · 2026-04-06 · unverdicted · novelty 7.0

ShieldNet detects supply-chain poisoned tools in LLM agents by monitoring network interactions with a MITM proxy and lightweight classifier, reaching 0.995 F1 and 0.8% false positives on a new benchmark of 25+ attack types.

From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

cs.CR · 2026-04-02 · unverdicted · novelty 7.0

Presents a component-centric PoC dataset of malicious MCP servers and Connor, a two-stage behavioral deviation detector that achieves a 94.6% F1 score.

Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions

cs.CR · 2025-03-30 · unverdicted · novelty 7.0

MCP lifecycle is defined with four phases and 16 activities; a threat taxonomy of 16 scenarios is constructed, validated via case studies, and paired with phase-specific safeguards.

Unsafe by Flow: Uncovering Bidirectional Data-Flow Risks in MCP Ecosystem

cs.SE · 2026-05-08 · unverdicted · novelty 6.0

MCP-BiFlow detects 93.8% of known bidirectional data-flow vulnerabilities in MCP servers and identifies 118 confirmed issues across 87 real-world servers from a scan of 15,452 repositories.

MCPThreatHive: Automated Threat Intelligence for Model Context Protocol Ecosystems

cs.CR · 2026-04-15 · unverdicted · novelty 6.0

MCPThreatHive automates the full lifecycle of threat intelligence for MCP agentic systems using a new 38-pattern taxonomy mapped to STRIDE and OWASP frameworks plus composite risk scoring.

A Formal Security Framework for MCP-Based AI Agents: Threat Taxonomy, Verification Models, and Defense Mechanisms

cs.CR · 2026-04-07 · unverdicted · novelty 6.0

MCPSHIELD offers a threat taxonomy of 23 attack vectors, a labeled transition system verification model, and a defense-in-depth architecture claiming 91% coverage for MCP-based AI agents.

Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems

cs.AI · 2025-10-15 · unverdicted · novelty 6.0

Introduces host agent and task lifecycle models plus 30 temporal logic properties to enable formal verification of liveness, safety, completeness, and fairness in agentic AI systems.

From Language to Action: Enhancing LLM Task Efficiency with Task-Aware MCP Server Recommendation

cs.SE · 2026-04-19 · unverdicted · novelty 5.0

Introduces Task2MCP dataset and T2MRec model for recommending MCP servers to LLM agents based on task semantics and engineering constraints.

Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation

cs.CR · 2026-06-09 · unverdicted · novelty 3.0

A synthesis of 247 papers on LLM agent security identifies prompt injection and tool hijacking as dominant threats, notes weakly compositional defenses, and argues for trust boundaries and realistic evaluations.

citing papers explorer

Comment and Control: Hijacking Agentic Workflows via Context-Grounded Evolution cs.CR · 2026-05-11 · unverdicted · none · ref 33
Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem cs.CR · 2025-09-08 · unverdicted · none · ref 51
Same Payload, Different Channel: Measuring Trust Asymmetry in Tool-Using Language Models cs.LG · 2026-05-30 · unverdicted · none · ref 2
MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security cs.CR · 2026-04-08 · conditional · none · ref 63
ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic Systems cs.AI · 2026-04-06 · unverdicted · none · ref 7
From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers cs.CR · 2026-04-02 · unverdicted · none · ref 72
Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions cs.CR · 2025-03-30 · unverdicted · none · ref 75
Unsafe by Flow: Uncovering Bidirectional Data-Flow Risks in MCP Ecosystem cs.SE · 2026-05-08 · unverdicted · none · ref 62
MCPThreatHive: Automated Threat Intelligence for Model Context Protocol Ecosystems cs.CR · 2026-04-15 · unverdicted · none · ref 8
A Formal Security Framework for MCP-Based AI Agents: Threat Taxonomy, Verification Models, and Defense Mechanisms cs.CR · 2026-04-07 · unverdicted · none · ref 13
Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems cs.AI · 2025-10-15 · unverdicted · none · ref 49
From Language to Action: Enhancing LLM Task Efficiency with Task-Aware MCP Server Recommendation cs.SE · 2026-04-19 · unverdicted · none · ref 21
Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation cs.CR · 2026-06-09 · unverdicted · none · ref 224
