Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions

Xinyi Hou, Yanjie Zhao, Shenao Wang, Haoyu Wang · 2025 · cs.CR · arXiv 2503.23278

Mixed citation behavior. Most common role is background (62%).

47 Pith papers citing it

Background 62% of classified citations

abstract

The Model Context Protocol (MCP) is an emerging open standard that defines a unified, bi-directional communication and dynamic discovery protocol between AI models and external tools or resources, aiming to enhance interoperability and reduce fragmentation across diverse systems. This paper presents a systematic study of MCP from both architectural and security perspectives. We first define the full lifecycle of an MCP server, comprising four phases (creation, deployment, operation, and maintenance), further decomposed into 16 key activities that capture its functional evolution. Building on this lifecycle analysis, we construct a comprehensive threat taxonomy that categorizes security and privacy risks across four major attacker types: malicious developers, external attackers, malicious users, and security flaws, encompassing 16 distinct threat scenarios. To validate these risks, we develop and analyze real-world case studies that demonstrate concrete attack surfaces and vulnerability manifestations within MCP implementations. Based on these findings, the paper proposes a set of fine-grained, actionable security safeguards tailored to each lifecycle phase and threat category, offering practical guidance for secure MCP adoption. We also analyze the current MCP landscape, covering industry adoption, integration patterns, and supporting tools, to identify its technological strengths as well as existing limitations that constrain broader deployment. Finally, we outline future research and development directions aimed at strengthening MCP's standardization, trust boundaries, and sustainable growth within the evolving ecosystem of tool-augmented AI systems.

citation-role summary

background 10 · method 2 · baseline 1

citation-polarity summary

background 8 · use method 2 · baseline 1 · support 1 · unclear 1

representative citing papers

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

cs.CR · 2026-04-09 · unverdicted · novelty 8.0

Malicious LLM API routers actively perform payload injection and secret exfiltration, with 9 of 428 tested routers showing malicious behavior and further poisoning risks from leaked credentials.

Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem

cs.CR · 2025-09-08 · unverdicted · novelty 8.0

This paper defines a new Parasitic Toolchain Attack pattern (MCP-UPD) that assembles legitimate tools into privacy-exfiltrating workflows and reports the first large-scale scan of 12230 MCP tools across 1360 servers revealing systemic vulnerabilities from missing isolation and least-privilege in the

Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers

cs.SE · 2025-06-16 · conditional · novelty 8.0

First study of 1,899 MCP servers finds eight distinct vulnerabilities (only three traditional), 7.2% with general issues, 5.5% with tool poisoning, and 66% with code smells, urging MCP-specific security practices.

Grid-Orch: An LLM-Powered Orchestrator for Distribution Grid Simulation and Analytics

eess.SY · 2026-05-12 · conditional · novelty 7.0

Grid-Orch is an LLM-orchestrated system with 36 tools that lets users perform distribution grid simulations and optimizations through conversation, matching direct scripting results.

Five Attacks on x402 Agentic Payment Protocol

cs.CR · 2026-05-12 · conditional · novelty 7.0

Five practical attacks on the x402 agentic payment protocol are demonstrated across authorization, binding, replay protection, and web handling, validated on local chains, Base Sepolia, live endpoints, and three open-source SDKs.

DADL: A Declarative Description Language for Enterprise Tool Libraries in LLM Agent Systems

cs.SE · 2026-05-04 · unverdicted · novelty 7.0

DADL is a declarative YAML format that lets a single runtime handle many REST API tools for LLM agents, cutting tool advertisement context cost by 142x from 142,000 to 1,000 tokens on a catalog of 1,833 definitions.

Enforcing Benign Trajectories: A Behavioral Firewall for Structured-Workflow AI Agents

cs.CR · 2026-04-29 · unverdicted · novelty 7.0

A parameterized DFA firewall enforces safe tool sequences for structured AI agents, reducing attack success rates to 2.2% in tested workflows with low added latency.

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

cs.CR · 2026-04-25 · unverdicted · novelty 7.0

A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

cs.AI · 2026-04-24 · unverdicted · novelty 7.0

OMC framework turns multi-agent AI into self-organizing companies with Talents, Talent Market, and E²R search, achieving 84.67% success on PRDBench (15.48 points above prior art).

Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models

cs.CR · 2026-04-22 · unverdicted · novelty 7.0

A novel function hijacking attack achieves 70-100% success rates in forcing specific function calls across five LLMs on the BFCL benchmark and is robust to context semantics.

AgileLog: A Forkable Shared Log for Agents on Data Streams

cs.DC · 2026-04-16 · unverdicted · novelty 7.0

AgileLog introduces forkable shared logs with cheap forking and isolation to support AI agents on data streams.

Listening Alone, Understanding Together: Collaborative Context Recovery for Privacy-Aware AI

cs.AI · 2026-04-14 · conditional · novelty 7.0

CONCORD recovers conversation context in privacy-preserving AI assistants via spatio-temporal resolution, gap detection, and minimal relationship-aware A2A exchanges, achieving 91.4% gap recall, 96% relationship accuracy, and 97% true negative disclosure rate.

Multi-Agent Orchestration for High-Throughput Materials Screening on a Leadership-Class System

cs.AI · 2026-04-09 · unverdicted · novelty 7.0

A planner-executor multi-agent system using gpt-oss-120b and Parsl orchestrates scalable high-throughput MOF screening on the Aurora supercomputer with low overhead.

MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security

cs.CR · 2026-04-08 · conditional · novelty 7.0

MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly tool-centric protections and gaps at orchestration, transport, and supply-chain layers.

Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study

cs.CR · 2026-04-03 · accept · novelty 7.0

Analysis of 17k LLM agent skills reveals 520 vulnerable ones with 1,708 leakage issues, primarily from debug output exposure, with a 10-pattern taxonomy and released dataset for future detection.

From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

cs.CR · 2026-04-02 · unverdicted · novelty 7.0

Presents a component-centric PoC dataset of malicious MCP servers and a two-stage behavioral deviation detector Connor achieving 94.6% F1-score.

ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems in the Wild

cs.AI · 2025-12-07 · conditional · novelty 7.0

ProAgent uses on-demand tiered perception and context-aware LLM reasoning to deliver proactive assistance on AR glasses, achieving up to 27.7% higher prediction accuracy and 20.5% lower false detections than baselines.

Can Language Models Go Beyond Coding? Assessing the Capability of Language Models to Build Real-World Systems

cs.SE · 2025-11-02 · unverdicted · novelty 7.0

Build-bench is the first architecture-aware benchmark that evaluates LLMs on repairing cross-ISA build failures via iterative tool-augmented reasoning, with the best model reaching 63.19% success.

AgentBound: Securing Execution Boundaries of AI Agents

cs.CR · 2025-10-24 · conditional · novelty 7.0

AgentBound is the first declarative access control framework for Model Context Protocol servers that generates policies from source code at 80.9% accuracy and blocks most threats in malicious servers with negligible overhead.

From REST to MCP: An Empirical Study of API Wrapping and Automated Server Generation for LLM Agents

cs.SE · 2025-07-21 · unverdicted · novelty 7.0

First large-scale empirical analysis of MCP server construction shows predominant REST wrapping with low operation exposure, plus an AutoMCP pipeline that improves automated generation success and reduces tool complexity.

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

cs.AI · 2025-06-04 · unverdicted · novelty 7.0

Orak is a foundational benchmark providing training data, interfaces, and evaluation tools for LLM agents across diverse video game genres.

ColPackAgent: Agent-Skill-Guided Hard-Particle Monte Carlo Workflows for Colloidal Packing

cs.AI · 2026-05-15 · unverdicted · novelty 6.0

ColPackAgent integrates a custom colpack Python package wrapping HOOMD-blue with MCP tools and an agent skill to enable reliable autonomous workflows for colloidal packing simulations across interactive, prompt-driven, and autoresearch modes.

OpenAaaS: An Open Agent-as-a-Service Framework for Distributed Materials-Informatics Research

cond-mat.mtrl-sci · 2026-05-13 · unverdicted · novelty 6.0

OpenAaaS is a hierarchical agent-as-a-service system that enables secure multi-agent collaboration for materials informatics by moving code to data rather than data to code.

ComplexMCP: Evaluation of LLM Agents in Dynamic, Interdependent, and Large-Scale Tool Sandbox

cs.AI · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

ComplexMCP benchmark shows top LLM agents achieve under 60% success on dynamic interdependent tool tasks versus 90% for humans, due to tool retrieval saturation, over-confidence, and strategic defeatism.

citing papers explorer

Showing 47 of 47 citing papers.

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain cs.CR · 2026-04-09 · unverdicted · none · ref 20 · internal anchor
Malicious LLM API routers actively perform payload injection and secret exfiltration, with 9 of 428 tested routers showing malicious behavior and further poisoning risks from leaked credentials.
Parasites in the Toolchain: A Large-Scale Analysis of Attacks on the MCP Ecosystem cs.CR · 2025-09-08 · unverdicted · none · ref 33 · internal anchor
This paper defines a new Parasitic Toolchain Attack pattern (MCP-UPD) that assembles legitimate tools into privacy-exfiltrating workflows and reports the first large-scale scan of 12230 MCP tools across 1360 servers revealing systemic vulnerabilities from missing isolation and least-privilege in the
Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers cs.SE · 2025-06-16 · conditional · none · ref 64 · internal anchor
First study of 1,899 MCP servers finds eight distinct vulnerabilities (only three traditional), 7.2% with general issues, 5.5% with tool poisoning, and 66% with code smells, urging MCP-specific security practices.
Grid-Orch: An LLM-Powered Orchestrator for Distribution Grid Simulation and Analytics eess.SY · 2026-05-12 · conditional · none · ref 22 · internal anchor
Grid-Orch is an LLM-orchestrated system with 36 tools that lets users perform distribution grid simulations and optimizations through conversation, matching direct scripting results.
Five Attacks on x402 Agentic Payment Protocol cs.CR · 2026-05-12 · conditional · none · ref 34 · internal anchor
Five practical attacks on the x402 agentic payment protocol are demonstrated across authorization, binding, replay protection, and web handling, validated on local chains, Base Sepolia, live endpoints, and three open-source SDKs.
DADL: A Declarative Description Language for Enterprise Tool Libraries in LLM Agent Systems cs.SE · 2026-05-04 · unverdicted · none · ref 3 · internal anchor
DADL is a declarative YAML format that lets a single runtime handle many REST API tools for LLM agents, cutting tool advertisement context cost by 142x from 142,000 to 1,000 tokens on a catalog of 1,833 definitions.
Enforcing Benign Trajectories: A Behavioral Firewall for Structured-Workflow AI Agents cs.CR · 2026-04-29 · unverdicted · none · ref 2 · internal anchor
A parameterized DFA firewall enforces safe tool sequences for structured AI agents, reducing attack success rates to 2.2% in tested workflows with low added latency.
A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework cs.CR · 2026-04-25 · unverdicted · none · ref 13 · internal anchor
A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company cs.AI · 2026-04-24 · unverdicted · none · ref 24 · internal anchor
OMC framework turns multi-agent AI into self-organizing companies with Talents, Talent Market, and E²R search, achieving 84.67% success on PRDBench (15.48 points above prior art).
Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models cs.CR · 2026-04-22 · unverdicted · none · ref 8 · internal anchor
A novel function hijacking attack achieves 70-100% success rates in forcing specific function calls across five LLMs on the BFCL benchmark and is robust to context semantics.
AgileLog: A Forkable Shared Log for Agents on Data Streams cs.DC · 2026-04-16 · unverdicted · none · ref 69 · internal anchor
AgileLog introduces forkable shared logs with cheap forking and isolation to support AI agents on data streams.
Listening Alone, Understanding Together: Collaborative Context Recovery for Privacy-Aware AI cs.AI · 2026-04-14 · conditional · none · ref 2 · internal anchor
CONCORD recovers conversation context in privacy-preserving AI assistants via spatio-temporal resolution, gap detection, and minimal relationship-aware A2A exchanges, achieving 91.4% gap recall, 96% relationship accuracy, and 97% true negative disclosure rate.
Multi-Agent Orchestration for High-Throughput Materials Screening on a Leadership-Class System cs.AI · 2026-04-09 · unverdicted · none · ref 23 · internal anchor
A planner-executor multi-agent system using gpt-oss-120b and Parsl orchestrates scalable high-throughput MOF screening on the Aurora supercomputer with low overhead.
MCP-DPT: A Defense-Placement Taxonomy and Coverage Analysis for Model Context Protocol Security cs.CR · 2026-04-08 · conditional · none · ref 20 · internal anchor
MCP-DPT creates a defense-placement taxonomy that organizes MCP threats and defenses across six architectural layers, revealing mostly tool-centric protections and gaps at orchestration, transport, and supply-chain layers.
Credential Leakage in LLM Agent Skills: A Large-Scale Empirical Study cs.CR · 2026-04-03 · accept · none · ref 23 · internal anchor
Analysis of 17k LLM agent skills reveals 520 vulnerable ones with 1,708 leakage issues, primarily from debug output exposure, with a 10-pattern taxonomy and released dataset for future detection.
From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers cs.CR · 2026-04-02 · unverdicted · none · ref 34 · internal anchor
Presents a component-centric PoC dataset of malicious MCP servers and a two-stage behavioral deviation detector Connor achieving 94.6% F1-score.
ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems in the Wild cs.AI · 2025-12-07 · conditional · none · ref 20 · internal anchor
ProAgent uses on-demand tiered perception and context-aware LLM reasoning to deliver proactive assistance on AR glasses, achieving up to 27.7% higher prediction accuracy and 20.5% lower false detections than baselines.
Can Language Models Go Beyond Coding? Assessing the Capability of Language Models to Build Real-World Systems cs.SE · 2025-11-02 · unverdicted · none · ref 24 · internal anchor
Build-bench is the first architecture-aware benchmark that evaluates LLMs on repairing cross-ISA build failures via iterative tool-augmented reasoning, with the best model reaching 63.19% success.
AgentBound: Securing Execution Boundaries of AI Agents cs.CR · 2025-10-24 · conditional · none · ref 20 · internal anchor
AgentBound is the first declarative access control framework for Model Context Protocol servers that generates policies from source code at 80.9% accuracy and blocks most threats in malicious servers with negligible overhead.
From REST to MCP: An Empirical Study of API Wrapping and Automated Server Generation for LLM Agents cs.SE · 2025-07-21 · unverdicted · none · ref 11 · internal anchor
First large-scale empirical analysis of MCP server construction shows predominant REST wrapping with low operation exposure, plus an AutoMCP pipeline that improves automated generation success and reduces tool complexity.
Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games cs.AI · 2025-06-04 · unverdicted · none · ref 12 · internal anchor
Orak is a foundational benchmark providing training data, interfaces, and evaluation tools for LLM agents across diverse video game genres.
ColPackAgent: Agent-Skill-Guided Hard-Particle Monte Carlo Workflows for Colloidal Packing cs.AI · 2026-05-15 · unverdicted · none · ref 57 · internal anchor
ColPackAgent integrates a custom colpack Python package wrapping HOOMD-blue with MCP tools and an agent skill to enable reliable autonomous workflows for colloidal packing simulations across interactive, prompt-driven, and autoresearch modes.
OpenAaaS: An Open Agent-as-a-Service Framework for Distributed Materials-Informatics Research cond-mat.mtrl-sci · 2026-05-13 · unverdicted · none · ref 46 · internal anchor
OpenAaaS is a hierarchical agent-as-a-service system that enables secure multi-agent collaboration for materials informatics by moving code to data rather than data to code.
ComplexMCP: Evaluation of LLM Agents in Dynamic, Interdependent, and Large-Scale Tool Sandbox cs.AI · 2026-05-11 · unverdicted · none · ref 18 · 2 links · internal anchor
ComplexMCP benchmark shows top LLM agents achieve under 60% success on dynamic interdependent tool tasks versus 90% for humans, due to tool retrieval saturation, over-confidence, and strategic defeatism.
EvidenT: An Evidence-Preserving Framework for Iterative System-Level Package Repair cs.SE · 2026-05-09 · unverdicted · none · ref 21 · internal anchor
EvidenT repairs 53.88% of real-world RISC-V system-level package build failures by preserving repair history and build artifacts in a closed-loop validation system, outperforming baselines by a wide margin.
When Child Inherits: Modeling and Exploiting Subagent Spawn in Multi-Agent Networks cs.CR · 2026-05-08 · unverdicted · none · ref 10 · internal anchor
Multi-agent LLM frameworks can spread compromises across agent boundaries via insecure memory inheritance during subagent spawning.
Unsafe by Flow: Uncovering Bidirectional Data-Flow Risks in MCP Ecosystem cs.SE · 2026-05-08 · unverdicted · none · ref 24 · internal anchor
MCP-BiFlow detects 93.8% of known bidirectional data-flow vulnerabilities in MCP servers and identifies 118 confirmed issues across 87 real-world servers from a scan of 15,452 repositories.
Augmenting Interface Usability Heuristics for Reliable Computer-Use Agents cs.HC · 2026-05-04 · unverdicted · none · ref 4 · internal anchor
Augmented Nielsen heuristics improve computer-use agent task completion on varied interfaces while preserving human usability, as shown in UI-Verse experiments and human studies.
Less Is More: Measuring How LLM Involvement affects Chatbot Accuracy in Static Analysis cs.SE · 2026-04-23 · unverdicted · none · ref 6 · internal anchor
A structured JSON intermediate representation for LLM-generated static analysis queries outperforms both direct generation and agentic tool use, with gains of 15-25 percentage points on large models.
Diagnosing CFG Interpretation in LLMs cs.AI · 2026-04-22 · unverdicted · none · ref 23 · internal anchor
LLMs maintain surface syntax for novel CFGs but fail to preserve semantics under recursion and branching, relying on keyword bootstrapping rather than pure symbolic reasoning.
How Adversarial Environments Mislead Agentic AI? cs.AI · 2026-04-20 · unverdicted · none · ref 18 · internal anchor
Adversarial compromise of tool outputs misleads agentic AI via breadth and depth attacks, revealing that epistemic and navigational robustness are distinct and often trade off against each other.
QRAFTI: An Agentic Framework for Empirical Research in Quantitative Finance cs.MA · 2026-04-20 · unverdicted · none · ref 28 · internal anchor
QRAFTI is a multi-agent framework using tool-calling and reflection-based planning to emulate quant research tasks like factor replication and signal testing on financial data.
Towards Verifiable and Self-Correcting AI Physicists for Quantum Many-Body Simulations physics.comp-ph · 2026-03-31 · unverdicted · none · ref 25 · internal anchor
QMP-Bench supplies a realistic test set for AI on quantum many-body problems while PhysVEC uses integrated verifiers to turn unreliable LLM generations into code that passes both syntax and physics checks, outperforming baselines.
Beyond Individual Mimicry: Constructing Human-Like Social network with Graph-Augmented LLM Agents cs.SI · 2026-03-31 · unverdicted · none · ref 24 · internal anchor
GraphMind equips LLM agents with graph awareness to construct human-like social networks, producing botnets that substantially degrade performance of both text-based and graph-based detectors.
Bridging Protocol and Production: Design Patterns for Deploying AI Agents with Model Context Protocol cs.SE · 2026-03-12 · unverdicted · none · ref 10 · internal anchor
The paper proposes Context-Aware Broker Protocol, Adaptive Timeout Budget Allocation, and Structured Error Recovery Framework to address gaps in identity, budgeting, and error handling for production AI agent deployments using MCP.
Semantic Attacks on Tool-Augmented LLMs: Securing the Model Context Protocol Against Descriptor-Level Manipulation cs.CR · 2025-12-06 · unverdicted · none · ref 19 · internal anchor
Descriptor-level manipulation in the Model Context Protocol can drive LLMs to unsafe tool selections in up to 36% of cases; a layered defense of integrity checks, auxiliary-LLM vetting, and runtime guardrails reduces this to 15% and raises blocking to 74%.
Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems cs.AI · 2025-10-15 · unverdicted · none · ref 42 · internal anchor
Introduces host agent and task lifecycle models plus 30 temporal logic properties to enable formal verification of liveness, safety, completeness, and fairness in agentic AI systems.
"Your AI, My Shell": Demystifying Prompt Injection Attacks on Agentic AI Coding Editors cs.CR · 2025-09-26 · unverdicted · none · ref 18 · internal anchor
Prompt injection attacks on agentic AI coding editors like Cursor and GitHub Copilot reach up to 84% success in executing malicious commands by poisoning external development resources.
AI Failures in the Eyes of the Downstream Developer: A First Look at Concerns, Practices, and Challenges cs.SE · 2025-03-25 · unverdicted · none · ref 47 · internal anchor
Mixed-methods study maps downstream developers' concerns, practices, and challenges with AI failures in PTM-based software.
VIPER-MCP: Detecting and Exploiting Taint-Style Vulnerabilities in Model Context Protocol Servers cs.CR · 2026-05-20 · unverdicted · none · ref 12 · internal anchor
VIPER-MCP detects and exploits taint-style vulnerabilities in Model Context Protocol servers via anchor-query static analysis and feedback-driven prompt evolution, uncovering 106 zero-day vulnerabilities across 39,884 repositories with 67 CVEs assigned.
A Prompt-Aware Structuring Framework for Reliable Reuse of AI-Generated Content in the Agentic Web cs.AI · 2026-05-10 · unverdicted · none · ref 10 · internal anchor
A framework structures AI-generated content with prompt-aware metadata and verifiable credentials to support reliable assessment and reuse by agents.
Think Before You Act -- A Neurocognitive Governance Model for Autonomous AI Agents cs.AI · 2026-04-28 · unverdicted · none · ref 41 · internal anchor
A neurocognitive governance model formalizes a Pre-Action Governance Reasoning Loop that consults global, workflow, agent, and situational rules before each action, yielding 95% compliance accuracy with zero false escalations in a retail supply-chain implementation.
Train the Trainers -- An Agentic AI Framework for Peer-Based Mental Health Support in Battlefield Environments cs.HC · 2026-03-31 · unverdicted · none · ref 42 · internal anchor
The paper introduces an agentic AI platform to train and support recovered soldiers as peer facilitators providing mental health triage and interventions in austere battlefield environments.
Security Threat Modeling for Emerging AI-Agent Protocols: A Comparative Analysis of MCP, A2A, Agora, and ANP cs.CR · 2026-02-11 · unverdicted · none · ref 21 · internal anchor
The paper identifies twelve protocol-level security risks across MCP, A2A, Agora, and ANP and quantifies wrong-provider tool execution risk in MCP via a measurement-driven case study on multi-server composition.
Toward a Safe Internet of Agents cs.MA · 2025-11-29 · unverdicted · none · ref 11 · internal anchor
The paper proposes a bottom-up framework for safe agentic AI systems that treats each component as a dual-use interface where added capabilities also expand attack surfaces across single agents, multi-agent systems, and interoperable ecosystems.
From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review cs.AI · 2025-04-28 · accept · none · ref 225 · internal anchor
A survey consolidating benchmarks, agent frameworks, real-world applications, and protocols for LLM-based autonomous agents into a proposed taxonomy with recommendations for future research.
Flowr -- Scaling Up Retail Supply Chain Operations Through Agentic AI in Large Scale Supermarket Chains cs.AI · 2026-04-07 · unverdicted · none · ref 13 · internal anchor
Flowr is an agentic AI framework that decomposes retail supply chain workflows into coordinated LLM-based agents with human-in-the-loop oversight to automate operations in large supermarket chains.
