{"total":13,"items":[{"citing_arxiv_id":"2606.10749","ref_index":224,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation","primary_cat":"cs.CR","submitted_at":"2026-06-09T12:01:07+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":3.0,"formal_verification":"none","one_line_summary":"A synthesis of 247 papers on LLM agent security identifies prompt injection and tool hijacking as dominant threats, notes weakly compositional defenses, and argues for trust boundaries and realistic evaluations.","context_count":1,"top_context_role":"background","top_context_polarity":"unclear","context_text":"[222] Miao Yu, Shilong Wang, Guibin Zhang, Junyuan Mao, Chenlong Yin, Qijiong Liu, Qingsong Wen, Kun Wang, and Yang Wang. 2024. NetSafe: Exploring the Topological Safety of Multi-agent Networks. arXiv:2410.15686 [cs.MA] doi:10.48550/arXiv.2410.15686 [223] Qiang Yu, Xinran Cheng, and Chuanyi Liu. 2026. Defense Against Indirect Prompt Injection via Tool Result Parsing. arXiv:2601.04795 [cs.AI] doi:10.48550/arXiv.2601.04795 [224] Tongxin Yuan, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, Tian Xia, Lizhen Xu, Binglin Zhou, Fangqi Li, Zhuosheng Zhang, Rui Wang, and Gongshen Liu. 2024. R-Judge: Benchmarking Safety Risk Awareness for LLM Agents. In Findings of the Association for Computational Linguistics: EMNLP 2024. 
Association for Computational Linguistics, 1467-1490."},{"citing_arxiv_id":"2606.00566","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Same Payload, Different Channel: Measuring Trust Asymmetry in Tool-Using Language Models","primary_cat":"cs.LG","submitted_at":"2026-05-30T06:38:41+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Agent-native LLMs are substantially more vulnerable to adversarial instructions arriving in tool descriptions than user messages (with the pattern reversing for general-purpose models and inverting again for tool outputs), as quantified by the new Safety Asymmetry Score across six models and three a","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11229","ref_index":33,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Comment and Control: Hijacking Agentic Workflows via Context-Grounded Evolution","primary_cat":"cs.CR","submitted_at":"2026-05-11T20:45:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"JAW uses hybrid program analysis to evolve inputs that hijack agentic workflows, successfully compromising 4714 GitHub workflows and eight n8n templates to enable actions like credential exfiltration.","context_count":1,"top_context_role":"background","top_context_polarity":"contest","context_text":"GUS [25] applied static analysis to find vulnerabilities of traditional workflows for CI/CD pipelines. However, this approach targets classic taint-style vulnerabilities and cannot track how potentially malicious data is fed into an LLM within an agentic workflow. 
On the other side, existing jailbreaking approaches against either LLMs [1, 4, 12, 23, 37, 40, 41] or LLM agents [33, 36, 38] assume that the inputs to an LLM are known, e.g., the prompt structure and which part of the prompt is accessible to an adversary. However, the LLM inputs within an agentic workflow are formed with complicated procedures that often involve multiple workflows and programs in different languages [18]."},{"citing_arxiv_id":"2605.07836","ref_index":62,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Unsafe by Flow: Uncovering Bidirectional Data-Flow Risks in MCP Ecosystem","primary_cat":"cs.SE","submitted_at":"2026-05-08T15:03:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MCP-BiFlow detects 93.8% of known bidirectional data-flow vulnerabilities in MCP servers and identifies 118 confirmed issues across 87 real-world servers from a scan of 15,452 repositories.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Framework for Model Context Protocol Integrity in Large Language Model Applications. CoRR abs/2508.10991 (2025). arXiv:2508.10991 doi:10.48550/ARXIV.2508.10991 [61] Yixuan Yang, Daoyuan Wu, and Yufan Chen. 2025. MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols. CoRR abs/2508.13220 (2025). arXiv:2508.13220 doi:10.48550/ARXIV.2508.13220 [62] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2022. ReAct: Synergizing Reasoning and Acting in Language Models. CoRR abs/2210.03629 (2022). arXiv:2210.03629 doi:10.48550/ARXIV.2210.03629 
[63] Miao Yu, Fanci Meng, Xinyun Zhou, Shilong Wang, Junyuan Mao, Linsey Pang, Tianlong Chen, Kun Wang, Xinfeng Li, Yongfeng Zhang, et al."},{"citing_arxiv_id":"2604.17234","ref_index":21,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Language to Action: Enhancing LLM Task Efficiency with Task-Aware MCP Server Recommendation","primary_cat":"cs.SE","submitted_at":"2026-04-19T03:38:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Introduces Task2MCP dataset and T2MRec model for recommending MCP servers to LLM agents based on task semantics and engineering constraints.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.13849","ref_index":8,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MCPThreatHive: Automated Threat Intelligence for Model Context Protocol Ecosystems","primary_cat":"cs.CR","submitted_at":"2026-04-15T13:19:22+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MCPThreatHive automates the full lifecycle of threat intelligence for MCP agentic systems using a new 38-pattern taxonomy mapped to STRIDE and OWASP frameworks plus composite risk scoring.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.07551","ref_index":63,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security","primary_cat":"cs.CR","submitted_at":"2026-04-08T19:53:26+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly 
tool-centric protections and gaps at orchestration, transport, and supply-chain layers.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents. Advances in Neural Information Processing Systems 37 (2024), 100938-100964. [62] Yixuan Yang, Daoyuan Wu, and Yufan Chen. 2025. MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols. arXiv:2508.13220 [cs.CR] doi:10.48550/arXiv.2508.13220 [63] S. Yao et al. 2023. ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629 doi:10.48550/arXiv.2210.03629 [64] C. Yu, Z. Cheng, H. Cui, Y. Gao, Z. Luo, Y. Wang, H. Zheng, and Y. Zhao. 2025. A Survey on Agent Workflow-Status and Future. In 2025 8th International Conference on Artificial Intelligence and Big Data (ICAIBD). IEEE, 770-781."},{"citing_arxiv_id":"2604.05969","ref_index":13,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Formal Security Framework for MCP-Based AI Agents: Threat Taxonomy, Verification Models, and Defense Mechanisms","primary_cat":"cs.CR","submitted_at":"2026-04-07T15:02:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MCPSHIELD offers a threat taxonomy of 23 attack vectors, a labeled transition system verification model, and a defense-in-depth architecture claiming 91% coverage for MCP-based AI agents.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.04426","ref_index":7,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic 
Systems","primary_cat":"cs.AI","submitted_at":"2026-04-06T05:15:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ShieldNet detects supply-chain poisoned tools in LLM agents by monitoring network interactions with a MITM proxy and lightweight classifier, reaching 0.995 F1 and 0.8% false positives on a new benchmark of 25+ attack types.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.01905","ref_index":72,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers","primary_cat":"cs.CR","submitted_at":"2026-04-02T11:22:07+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Presents a component-centric PoC dataset of malicious MCP servers and a two-stage behavioral deviation detector Connor achieving 94.6% F1-score.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2510.14133","ref_index":49,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems","primary_cat":"cs.AI","submitted_at":"2025-10-15T22:02:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Introduces host agent and task lifecycle models plus 30 temporal logic properties to enable formal verification of liveness, safety, completeness, and fairness in agentic AI systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.06572","ref_index":51,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Parasites in the Toolchain: A Large-Scale Analysis of Attacks 
on the MCP Ecosystem","primary_cat":"cs.CR","submitted_at":"2025-09-08T11:35:32+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"This paper defines a new Parasitic Toolchain Attack pattern (MCP-UPD) that assembles legitimate tools into privacy-exfiltrating workflows and reports the first large-scale scan of 12230 MCP tools across 1360 servers revealing systemic vulnerabilities from missing isolation and least-privilege in the","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2503.23278","ref_index":75,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions","primary_cat":"cs.CR","submitted_at":"2025-03-30T01:58:22+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"MCP lifecycle is defined with four phases and 16 activities; a threat taxonomy of 16 scenarios is constructed, validated via case studies, and paired with phase-specific safeguards.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"[25] demonstrate that obfuscated adversarial prompts can lead LLM agents to misuse tools, enabling attacks such as data exfiltration and unauthorized command execution. These vulnerabilities are particularly concerning as they generalize across models and modalities. A growing body of work has begun to categorize and analyze these risks. Gan et al. [26] and Yu et al. [75] propose taxonomies for threats across agent components and stages, while the OWASP Agentic Security Initiative [34] provides practical threat modeling frameworks. 
To support detection and mitigation, Chen et al. [9] introduce AgentGuard, which automatically discovers unsafe workflows and generates safety constraints, and ToolFuzz [46] identifies failures stemming from ambiguous or underspecified tool documentation."}],"limit":50,"offset":0}