Agent Skills has structural security weaknesses from missing data-instruction boundaries, single-approval persistent trust, and absent marketplace reviews that require fundamental redesign.
hub Canonical reference
Navigating the risks: A survey of security, privacy, and ethics threats in llm-based agents
Canonical reference. 80% of citing Pith papers cite this work as background.
hub tools
citation-role summary
citation-polarity summary
representative citing papers
MCP lifecycle is defined with four phases and 16 activities; a threat taxonomy of 16 scenarios is constructed, validated via case studies, and paired with phase-specific safeguards.
Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, termed temporal memory contamination, detected via a new trigger-probe protocol.
MCP-BiFlow detects 93.8% of known bidirectional data-flow vulnerabilities in MCP servers and identifies 118 confirmed issues across 87 real-world servers from a scan of 15,452 repositories.
BlindGuard introduces an unsupervised hierarchical agent encoder plus corruption-guided contrastive detector that identifies malicious agents in LLM-based multi-agent systems without any attack labels or prior knowledge of malicious behaviors.
A three-layer probabilistic assume-guarantee architecture is structurally required for safe LLM agent deployment.
Privacy attacks on LLMs show strong signals for membership inference and backdoors but weaker performance for attribute inference and data extraction, with risks highly dependent on system configuration.
LARA-HPC introduces a validation-first agentic system with dry-run verification and multi-phase refinement that improves robustness of AI-generated DFT workflows on HPC systems.
A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.
citing papers explorer
-
Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis
Agent Skills has structural security weaknesses from missing data-instruction boundaries, single-approval persistent trust, and absent marketplace reviews that require fundamental redesign.
-
Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions
MCP lifecycle is defined with four phases and 16 activities; a threat taxonomy of 16 scenarios is constructed, validated via case studies, and paired with phase-specific safeguards.
-
Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents
Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, termed temporal memory contamination, detected via a new trigger-probe protocol.
-
Unsafe by Flow: Uncovering Bidirectional Data-Flow Risks in MCP Ecosystem
MCP-BiFlow detects 93.8% of known bidirectional data-flow vulnerabilities in MCP servers and identifies 118 confirmed issues across 87 real-world servers from a scan of 15,452 repositories.
-
BlindGuard: Safeguarding LLM-based Multi-Agent Systems under Unknown Attacks
BlindGuard introduces an unsupervised hierarchical agent encoder plus corruption-guided contrastive detector that identifies malicious agents in LLM-based multi-agent systems without any attack labels or prior knowledge of malicious behaviors.
-
Position: A Three-Layer Probabilistic Assume-Guarantee Architecture Is Structurally Required for Safe LLM Agent Deployment
A three-layer probabilistic assume-guarantee architecture is structurally required for safe LLM agent deployment.
-
On the Privacy of LLMs: An Ablation Study
Privacy attacks on LLMs show strong signals for membership inference and backdoors but weaker performance for attribute inference and data extraction, with risks highly dependent on system configuration.
-
LARA: Validation-Driven Agentic Supercomputer Workflows for Atomistic Modeling
LARA-HPC introduces a validation-first agentic system with dry-run verification and multi-phase refinement that improves robustness of AI-generated DFT workflows on HPC systems.
-
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.
- Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses