Don’t let the claw grip your hand: A security analysis and defense framework for OpenClaw

2026 · arXiv 2603.10387

9 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background: 2 · baseline: 1

citation-polarity summary

background: 1 · baseline: 1 · unclear: 1

representative citing papers

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

cs.AI · 2026-05-11 · unverdicted · novelty 8.0

Agent-ValueBench is the first dedicated benchmark for agent values, showing they diverge from LLM values, form a homogeneous 'Value Tide' across models, and bend under harnesses and skill steering.

AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

cs.AI · 2026-04-03 · unverdicted · novelty 7.0

The AgentHazard benchmark shows that computer-use agents remain highly vulnerable, with attack success rates reaching 73.63% when Claude Code is powered by Qwen3-Coder.

Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

cs.CR · 2026-05-21 · unverdicted · novelty 6.0

A3S-Bench evaluates LLM agents against temporal, spatial, and semantic evasions; these evasions raise the average risk trigger rate from 28.3% to 52.6% across 2,254 trajectories and 20 scenarios.

A Systematic Security Evaluation of OpenClaw and Its Variants

cs.CR · 2026-04-03 · unverdicted · novelty 6.0

All six evaluated OpenClaw agent frameworks exhibit substantial security vulnerabilities, with reconnaissance behaviors as the most common weakness and agent systems proving significantly riskier than isolated backbone models.

Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

cs.CR · 2026-03-24 · unverdicted · novelty 6.0

Claw AI agents' heartbeat background execution shares memory context with user sessions, allowing ordinary social misinformation to silently pollute long-term memory and shape behavior at rates up to 76% across sessions.

Generative AI Advertising as a Problem of Trustworthy Commercial Intervention

cs.CY · 2026-05-18 · unverdicted · novelty 5.0

Generative AI advertising is reframed as a problem of trustworthy commercial intervention on the generative process, with a taxonomy of influence tiers from product mentions to long-term preference shaping.

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

cs.CR · 2026-05-07 · unverdicted · novelty 5.0

A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.

AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent

cs.LG · 2026-04-07 · unverdicted · novelty 5.0

AgentOpt introduces a framework-agnostic package that uses algorithms like UCB-E to find cost-effective model assignments in multi-step LLM agent pipelines, cutting evaluation budgets by 62-76% while maintaining near-optimal accuracy on benchmarks.

LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection

cs.CR · 2026-05-18

citing papers explorer

Showing 9 of 9 citing papers.

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values
cs.AI · 2026-05-11 · unverdicted · none · ref 15

AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents
cs.AI · 2026-04-03 · unverdicted · none · ref 23

Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions
cs.CR · 2026-05-21 · unverdicted · none · ref 24

A Systematic Security Evaluation of OpenClaw and Its Variants
cs.CR · 2026-04-03 · unverdicted · none · ref 4

Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
cs.CR · 2026-03-24 · unverdicted · none · ref 11

Generative AI Advertising as a Problem of Trustworthy Commercial Intervention
cs.CY · 2026-05-18 · unverdicted · none · ref 69

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation
cs.CR · 2026-05-07 · unverdicted · none · ref 19

AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent
cs.LG · 2026-04-07 · unverdicted · none · ref 22

LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection
cs.CR · 2026-05-18 · unreviewed · ref 14
