{"total":17,"items":[{"citing_arxiv_id":"2606.11007","ref_index":18,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Understanding and mitigating the risks of OpenClaw for non-technical users: A practical guide with Skill","primary_cat":"cs.CR","submitted_at":"2026-06-09T15:41:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"This work categorizes seven risks of OpenClaw for non-technical users, provides plain-language mitigations, and supplies a companion Skill to automate security configurations.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.10484","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AgentCanary: A Security Evaluation Framework for Autonomous AI Agents in Real Executable Environments","primary_cat":"cs.CR","submitted_at":"2026-06-09T06:55:37+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AgentCanary introduces an Entry × Impact risk taxonomy, high-fidelity real tool environments with persistent state, and multi-dimensional trajectory evaluation to assess AI agent security across models and attacks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.04329","ref_index":1,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents","primary_cat":"cs.CR","submitted_at":"2026-06-03T01:04:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"The study identifies four memory write channels and nine structural vulnerabilities in LLM agents, proposes a taxonomy of six attack classes, introduces MPBench, and finds that 
aggressive memory use increases exploitability while existing defenses fail.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03518","ref_index":33,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Overlaying Governance: A Compositional Authorization Framework for Delegation and Scope in Agentic AI","primary_cat":"cs.AI","submitted_at":"2026-06-02T11:39:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Introduces a compositional governance framework defining delegation types, resource scope attenuation, and an overlay operator for agentic AI authorization policies.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02302","ref_index":12,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"SeClaw: Spec-Driven Security Task Synthesis for Evaluating Autonomous Agents","primary_cat":"cs.CR","submitted_at":"2026-06-01T14:23:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SeClaw provides spec-driven synthesis of security tasks and an execution-based docker testbed for evaluating unsafe behaviors in autonomous LLM agents.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.01166","ref_index":31,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"BraveGuard: From Open-World Threats to Safer Computer-Use Agents","primary_cat":"cs.CR","submitted_at":"2026-05-31T11:16:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"BraveGuard trains guard models on realistic agent trajectories derived from open-world threats, raising detection accuracy on AgentHazard from 38.79% to 
82.38%.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.25435","ref_index":43,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Security of OpenClaw Agents: Fundamentals, Attacks, and Countermeasures","primary_cat":"cs.AI","submitted_at":"2026-05-25T05:25:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":2.0,"formal_verification":"none","one_line_summary":"A survey that categorizes threats to OpenClaw agents including skill poisoning and cognitive manipulation and reviews defense mechanisms.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.24309","ref_index":46,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Reframing LLM Agent Security as an Agent-Human Interaction Problem","primary_cat":"cs.CR","submitted_at":"2026-05-23T00:36:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"LLM agent security is reframed as an agent-human interaction issue, supported by a survey showing industry preference for human-centric mechanisms over academic favorites and proposing a new research agenda.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.22321","ref_index":24,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions","primary_cat":"cs.CR","submitted_at":"2026-05-21T11:07:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A3S-Bench evaluates LLM agents against temporal, spatial, and semantic evasions, raising average risk trigger rates from 28.3% to 52.6% across 2,254 trajectories and 20 
scenarios.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.18673","ref_index":69,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Generative AI Advertising as a Problem of Trustworthy Commercial Intervention","primary_cat":"cs.CY","submitted_at":"2026-05-18T17:15:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Generative AI advertising is reframed as a problem of trustworthy commercial intervention on the generative process, with a taxonomy of influence tiers from product mentions to long-term preference shaping.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.17986","ref_index":14,"ref_count":2,"confidence":0.9,"is_internal_anchor":false,"paper_title":"LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection","primary_cat":"cs.CR","submitted_at":"2026-05-18T07:41:35+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LivePI benchmark reports indirect prompt injection success rates of 10.7-29.6% across five models on seven input surfaces and shows a two-layer defense blocking all malicious completions while preserving utility.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.10365","ref_index":15,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values","primary_cat":"cs.AI","submitted_at":"2026-05-11T11:09:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Agent-ValueBench is the first dedicated benchmark for agent values, showing they diverge from LLM values, form a homogeneous 'Value Tide' across 
models, and bend under harnesses and skill steering.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Yet every gain in agentic capability brings commensurate risk, and whether such systems advance or undermine human interests now determines whether the technology benefits or threatens society [11, 12, 13, 14]. Crucially, the visible boundary of \"safe\" behavior is itself a downstream symptom of a deeper invariant, namely values, the trans-situational priorities that quietly steer how an agent acts [15, 16, 17, 18]. As we enter the agent era, we contend that the systematic study of agent values has emerged as a critical imperative for the field. Agent Values Are Not Identical to LLM Values. While a substantial body of work has charted the values embedded in LLMs (e.g., ValueBench [19] and ValueCompass [20]), we emphasize that the values exhibited by an agent can meaningfully diverge from those of its underlying LLM."},{"citing_arxiv_id":"2605.06393","ref_index":19,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation","primary_cat":"cs.CR","submitted_at":"2026-05-07T15:08:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"XX, MONTH 2026 15 TABLE III COMPARISON WITH REPRESENTATIVE EXISTING WORK. 
Work category Representative works Targets SHCUA host-level abuse Operation-level risk modeling Trusted classification / decision path Remote terminal verification SHCUA/OpenClaw security analysis [12], [13], [16], [25] Yes Partial No No Computer-use agent attack benchmarks [14], [19], [20], [22], [24] Mostly yes Partial No No Input/model-side defenses [27]-[30] Partial No No No Policy/runtime enforcement for agents [31]-[33], [35]-[39] Partial Partial Usually no No Sandboxing and constrained execution [11], [41], [42] Partial No / partial No No General system-level protection [43]-[46] No No Partial No This work - Yes Yes Yes Yes"},{"citing_arxiv_id":"2604.06296","ref_index":22,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent","primary_cat":"cs.LG","submitted_at":"2026-04-07T17:13:47+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AgentOpt introduces a framework-agnostic package that uses algorithms like UCB-E to find cost-effective model assignments in multi-step LLM agent pipelines, cutting evaluation budgets by 62-76% while maintaining near-optimal accuracy on benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.03131","ref_index":4,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"A Systematic Security Evaluation of OpenClaw and Its Variants","primary_cat":"cs.CR","submitted_at":"2026-04-03T15:52:36+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"All six evaluated OpenClaw agent frameworks exhibit substantial security vulnerabilities, with reconnaissance behaviors as the most common weakness and agent systems proving significantly riskier than isolated backbone 
models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.02947","ref_index":23,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents","primary_cat":"cs.AI","submitted_at":"2026-04-03T10:29:31+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AgentHazard benchmark shows computer-use agents remain highly vulnerable, with attack success rates reaching 73.63% on models like Qwen3-Coder powering Claude Code.","context_count":1,"top_context_role":"background","top_context_polarity":"unclear","context_text":"Benchmarking correctness and security in multi-turn code generation. arXiv preprint arXiv:2510.13859, 2025. [22] Paul Röttger, Fabio Pernisi, Bertie Vidgen, and Dirk Hovy. Safetyprompts: a systematic review of open datasets for evaluating and improving large language model safety. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 27617-27627, 2025. [23] Zhengyang Shan, Jiayun Xin, Yue Zhang, and Minghui Xu. Don't let the claw grip your hand: A security analysis and defense framework for openclaw. arXiv preprint arXiv:2603.10387, 2026. [24] Kimi Team, Yifan Bai, Yiping Bao, Y Charles, Cheng Chen, Guanduo Chen, Haiting Chen, Huarong Chen, Jiahao Chen, Ningxin Chen, et al. Kimi k2: Open agentic intelligence."},{"citing_arxiv_id":"2603.23064","ref_index":11,"ref_count":1,"confidence":0.9,"is_internal_anchor":false,"paper_title":"Mind Your HEARTBEAT! 
Claw Background Execution Inherently Enables Silent Memory Pollution","primary_cat":"cs.CR","submitted_at":"2026-03-24T11:01:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Claw AI agents' heartbeat background execution shares memory context with user sessions, allowing ordinary social misinformation to silently pollute long-term memory and shape behavior at rates up to 76% across sessions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}