arXiv preprint arXiv:2602.14364

Tianyu Chen, Dongrui Liu, Xia Hu, Jingyi Yu, Wenjie Wang · 2026 · arXiv 2602.14364

9 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

SafeClawBench: Separating Semantic, Audit-Evidence, and Sandbox Harm in Tool-Using LLM Agents

cs.CR · 2026-06-16 · accept · novelty 7.0

SafeClawBench supplies 600 staged adversarial tasks and three separate endpoints that show semantic acceptance, audit evidence, and sandbox-observed harm are distinct failure modes in tool-using LLM agents.

Understanding and Evaluating Claw-like Agent Security Through a Computer-Systems Lens

cs.CR · 2026-06-29 · unverdicted · novelty 6.0

The paper introduces SafeClawArena, a 406-task benchmark evaluating security failures in three Claw-like agent platforms across skill supply-chain, state exploitation, data flow, and prompt injection surfaces.

Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents

cs.AI · 2026-05-18 · unverdicted · novelty 6.0

Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, termed temporal memory contamination, detected via a new trigger-probe protocol.

Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

cs.AI · 2026-04-05 · unverdicted · novelty 6.0

The paper introduces the Agentic Risk Standard (ARS) as a payment settlement framework that delivers predefined compensation for AI agent execution failures, misalignment, or unintended outcomes.

Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

cs.CR · 2026-03-24 · unverdicted · novelty 6.0

Claw AI agents' heartbeat background execution shares memory context with user sessions, allowing ordinary social misinformation to silently pollute long-term memory and shape behavior at rates up to 76% across sessions.

BraveGuard: From Open-World Threats to Safer Computer-Use Agents

cs.CR · 2026-05-31 · unverdicted · novelty 5.0

BraveGuard trains guard models on realistic agent trajectories derived from open-world threats, raising detection accuracy on AgentHazard from 38.79% to 82.38%.

Security, Privacy, and Ethical Risks in OpenClaw

cs.CR · 2026-05-22 · unverdicted · novelty 3.0

The paper analyzes security, privacy, and ethical risks in the OpenClaw AI agent system arising from its architecture, storage, tool use, and integrations, arguing these form major barriers to trustworthy adoption.

Understanding and mitigating the risks of OpenClaw for non-technical users: A practical guide with Skill

cs.CR · 2026-06-09 · unverdicted · novelty 2.0

This work categorizes seven risks of OpenClaw for non-technical users, provides plain-language mitigations, and supplies a companion Skill to automate security configurations.

Security of OpenClaw Agents: Fundamentals, Attacks, and Countermeasures

cs.AI · 2026-05-25 · unverdicted · novelty 2.0

This survey categorizes threats to OpenClaw agents, including skill poisoning and cognitive manipulation, and reviews defense mechanisms.

