{"total":16,"items":[{"citing_arxiv_id":"2606.30755","ref_index":33,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Understanding and Evaluating Claw-like Agent Security Through a Computer-Systems Lens","primary_cat":"cs.CR","submitted_at":"2026-06-29T18:00:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"The paper introduces SafeClawArena, a 406-task benchmark evaluating security failures in three Claw-like agent platforms across skill supply-chain, state exploitation, data flow, and prompt injection surfaces.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.29142","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agent Security Meets Regulatory Reality -- A Practitioner Systematization of Autonomous-Agent Threats and Controls in Regulated Financial Systems","primary_cat":"cs.CY","submitted_at":"2026-06-28T01:11:00+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Maps agent threats to ECOA, EU AI Act, GDPR, and FINRA rules, reports four production patterns from KYC automation that handled four in five cases same-day, and notes three negative results including audit failures.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.13385","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Who Pays the Price? 
Stakeholder-Centric Prompt Injection Benchmarking for Real-world Web Agents","primary_cat":"cs.CR","submitted_at":"2026-06-11T14:12:43+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Introduces a stakeholder-centric benchmark showing current web agents fail all tested prompt injection objectives, with failures falling into stealthy parasitism, misaligned disruption, or compounded failure modes.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03024","ref_index":34,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SkillGuard: A Permission Framework for Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-06-02T02:01:53+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SkillGuard presents a dual-plane permission framework for agent skills that achieves 99.76% taxonomy coverage and reduces attack success rates in evaluations on 315 skills.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02668","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"What You Approve Is What Executes: Consent Integrity for Black-Box LLM Agents","primary_cat":"cs.CR","submitted_at":"2026-06-01T11:08:17+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"The paper introduces Consent Integrity as the property that actions shown for approval must be rendered by a trusted mediator from the real boundary action over an unspoofable path and bound to execution, with uninspectable actions surfaced rather than silently 
approved.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.26497","ref_index":21,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Aligning Provenance with Authorization: A Dual-Graph Defense for LLM Agents","primary_cat":"cs.CR","submitted_at":"2026-05-26T03:20:23+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"AuthGraph aligns an execution provenance graph with a clean authorization graph to detect parameter-source deviations from user intent, reducing attack success rates to 1-2% on AgentDojo and AgentDyn while retaining most task utility.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.24309","ref_index":56,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Reframing LLM Agent Security as an Agent-Human Interaction Problem","primary_cat":"cs.CR","submitted_at":"2026-05-23T00:36:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"LLM agent security is reframed as an agent-human interaction issue, supported by a survey showing industry preference for human-centric mechanisms over academic favorites and proposing a new research agenda.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17830","ref_index":105,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents","primary_cat":"cs.AI","submitted_at":"2026-05-18T04:06:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, 
termed temporal memory contamination, detected via a new trigger-probe protocol.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13044","ref_index":2,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-05-13T05:57:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"MODERATE","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Sefz discovers specification violations in 29.9% of 402 real-world agent skills by translating guardrails into reachability goals and guiding LLM mutations with a multi-armed bandit.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11039","ref_index":27,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck","primary_cat":"cs.CR","submitted_at":"2026-05-11T04:09:11+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"PACT achieves perfect security and utility under oracle provenance by enforcing argument-level trust contracts based on semantic roles and cross-step provenance tracking, outperforming invocation-level monitors in AgentDojo evaluations.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Adaptools: Adaptive tool-based indirect prompt injection attacks on agentic llms.arXiv preprint arXiv:2602.20720, 2026. [26] Peiran Wang, Yang Liu, Yunfei Lu, Yifeng Cai, Hongbo Chen, Qingyou Yang, Jie Zhang, Jue Hong, and Ye Wu. Agentarmor: Enforcing program analysis on agent runtime trace to defend against prompt injection.arXiv preprint arXiv:2508.01249, 2025. 
[27] Peiran Wang, Xinfeng Li, Chong Xiang, Jinghuai Zhang, Ying Li, Lixia Zhang, Xiaofeng Wang, and Yuan Tian. The landscape of prompt injection threats in llm agents: From taxonomy to analysis.arXiv preprint arXiv:2602.10453, 2026. [28] Tong Wu, Shujian Zhang, Kaiqiang Song, Silei Xu, Sanqiang Zhao, Ravi Agrawal, Sathish Reddy Indurthi, Chong Xiang, Prateek Mittal, and Wenxuan Zhou."},{"citing_arxiv_id":"2605.06393","ref_index":32,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation","primary_cat":"cs.CR","submitted_at":"2026-05-07T15:08:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Todmal, S. Mallik, and S. Mishra, \"Memory poisoning attack and defense on memory based LLM-agents,\" Jan. 2026. [Online]. Available: https://arxiv.org/abs/2601.05504 [31] H. Wang, C. M. Poskitt, and J. Sun, \"Agentspec: Customizable runtime enforcement for safe and reliable LLM agents,\" Mar. 2025. [Online]. Available: https://arxiv.org/abs/2503.18666 [32] N. Palumbo, S. Choudhary, J. Choi, P. Chalasani, and S. Jha, \"Policy compiler for secure agentic systems,\" Feb. 2026. [Online]. Available: https://arxiv.org/abs/2602.16708 [33] C. L. Wang, T. Singhal, A. Kelkar, and J. Tuo, \"MI9: An integrated runtime governance framework for agentic AI,\" Aug. 2025. [Online]. 
Available: https://arxiv.org/abs/2508."},{"citing_arxiv_id":"2605.05846","ref_index":43,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LoopTrap: Termination Poisoning Attacks on LLM Agents","primary_cat":"cs.CR","submitted_at":"2026-05-07T08:21:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LoopTrap is an automated red-teaming framework that crafts termination-poisoning prompts to amplify LLM agent steps by 3.57x on average (up to 25x) across 8 agents.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"independently executed 5 times to account for stochastic variation, resulting in 3,000 experimental runs per model. Target Models. To capture behavioral diversity across both open-source and proprietary LLMs, we evaluate the following models as the reasoning engine of the target agent: Gemini-3-Pro [10], GPT-4o, GPT-4o-mini [29], DeepSeek-R1 [12], Kimi-K2-Thinking [19], GLM-5 [9], Grok-4 [43], and Claude Sonnet 4.5 [2]. Agent Framework. We implement a unified ReAct-style agent framework for our main experiments. The agent follows the standard Thought-Action-Observation loop described in §2 and is equipped with a consistent set of tools across all experiments. 
While our main evaluation is conducted in this ReAct setting, the attack can gen-"},{"citing_arxiv_id":"2605.03378","ref_index":16,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection","primary_cat":"cs.CR","submitted_at":"2026-05-05T05:37:00+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13860","ref_index":9,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"The Moltbook Observatory Archive: an incremental dataset of agent-only social network activity","primary_cat":"cs.SI","submitted_at":"2026-04-16T20:29:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"The Moltbook Observatory Archive is the first large-scale dataset from a social network populated exclusively by autonomous AI agents, covering 78 days with 2.6 million posts and 1.2 million comments.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.22819","ref_index":24,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A pragmatic approach to regulating AI agents","primary_cat":"cs.CY","submitted_at":"2026-04-16T13:04:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AI agents require distinct regulation as AI systems under the EU AI Act with orchestration-layer oversight and a risk-based traffic light authorization system in contract law to preserve 
human accountability.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.04604","ref_index":114,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AI Agents Under EU Law","primary_cat":"cs.CY","submitted_at":"2026-04-06T11:47:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AI agent providers face an exhaustive inventory requirement for actions and data flows, as high-risk systems with untraceable behavioral drift cannot meet the AI Act's essential requirements.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Third, the AEPD adopts the \"rule of 2\" heuristic for agentic risk assessment: an agent should not simultaneously combine all three of the following without human oversight-processing untrusted input, accessing sensitive data, and taking autonomous action affecting individuals. The concept originates with Simon Willison's \"lethal trifecta,\" who identified these three properties as the conditions enabling data exfiltration via prompt injection [114]. Meta's security framework operationalised this in agent-specific form on 31 October 2025, extending Willison's framing by replacing \"external communication\" with the broader \"changing state,\" and explicitly acknowledging that combining all three properties without human oversight constitutes an unacceptable prompt injection risk [87]. The AEPD's contribution is applying"}],"limit":50,"offset":0}