WAInjectBench: Benchmarking prompt injection detections for web agents
9 Pith papers cite this work, all from 2026. None has been assigned a verdict yet.
citing papers explorer
- Measuring Real-World Prompt Injection Attacks in LLM-based Resume Screening
  Roughly 1% of real resumes contain hidden prompt injections against LLM screeners, prevalence has risen over 1-2 years, and over 90% avoid explicit instructions.
- Formal Security Analysis of Agent Protocol Composition
  AgentThread analyzes five agent protocols with formal TLA+ invariants and SDK tests, reporting 35 specification findings, 80 implementation tests, 30 composition-only failures, and a cross-protocol responsibility gap in security enforcement.
- Same-Origin Policy for Agentic Browsers
  The paper builds SOPBench, which shows frequent SOP violations in agentic browsers, and introduces SOPGuard to enforce the policy with low overhead in BrowserOS.
- Domain-Conditioned Safety in Frontier Computer-Using Agents: A 793-Episode Browser Benchmark, a Coding-Domain Cross-Reference, and a Reproducibility Audit of Recent Red-Teaming
  Frontier browser agents show strong resistance to hand-crafted multi-step prompt injections (0/140 success), unlike coding agents (up to 100%), indicating domain-conditioned safety and that prior high ASR reports may not generalize.
- Send a SCOUT First: Pre-hoc Reasoning for Adaptive Detector Allocation in Prompt-Injection Defense
  SCOUT adaptively allocates heterogeneous prompt-injection detectors via pre-hoc reliability prediction, cutting attack success by 46% and wall-clock time by 40% versus always-on GPT-4o on the new SCOUT-450 benchmark at a modest utility cost, with transfer to other sets.
- Web Agents Should Adopt the Plan-Then-Execute Paradigm
  Web agents should default to planning a complete task program before observing live web content to reduce prompt injection exposure, since WebArena tasks are compatible and 80% need no runtime LLM calls.
- SnapGuard: Lightweight Prompt Injection Detection for Screenshot-Based Web Agents
  SnapGuard detects prompt injection attacks on screenshot-based web agents via visual stability indicators and contrast-polarity textual signals, reaching F1 0.75 while running 8x faster than GPT-4o with no added memory cost.
- Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability
  The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.
- Toward Secure LLM Agents: Threat Surfaces, Attacks, Defenses, and Evaluation
  A synthesis of 247 papers on LLM agent security identifies prompt injection and tool hijacking as dominant threats, notes weakly compositional defenses, and argues for trust boundaries and realistic evaluations.