Trojan’s Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CR (2)
Years: 2026 (2)
Representative citing papers:
The survey organizes security threats and defenses in autonomous LLM agents into four layers and identifies that risks can propagate across layers from inputs to ecosystem impacts.
citing papers explorer
- LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injectio
  LivePI benchmark shows indirect prompt injection attack success rates of 10.7% to 29.6% across five AI models in live test environments covering seven input surfaces and multiple malicious goals.
- Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study
  The survey organizes security threats and defenses in autonomous LLM agents into four layers and identifies that risks can propagate across layers from inputs to ecosystem impacts.