Agentvigil: Generic black-box red-teaming for indirect prompt injection against LLM agents

Zhun Wang, Vincent Siu, Zhe Ye, Tianneng Shi, Yuzhou Nie, Xuandong Zhao, Chenguang Wang, Wenbo Guo, Dawn Song · 2025 · arXiv 2505.05849

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Hallucination as Exploit: Evidence-Carrying Multimodal Agents

cs.AI · 2026-05-18 · unverdicted · novelty 6.0 · 2 refs

Evidence-carrying multimodal agents decompose tool calls into predicates, obtain certificates from DOM/OCR/AX verifiers, and use a deterministic gate to authorize actions only when certificates support them, achieving zero unsafe executions in tested tasks.

Progent: Securing AI Agents with Privilege Control

cs.CR · 2025-04-16 · unverdicted · novelty 6.0

Progent introduces a privilege-control framework for AI agents that uses LLM-generated symbolic rules over tools, SMT-solver-enforced monotonic updates, and deterministic checks to reduce attack success rates on AgentDojo and ASB benchmarks.

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

cs.CR · 2026-05-07 · unverdicted · novelty 5.0

A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.

Agent Security is a Systems Problem

cs.CR · 2026-05-18 · unverdicted · novelty 4.0 · 2 refs

The paper argues that agent security is best addressed as a systems problem by applying principles from operating systems, networks, and formal methods rather than relying solely on model robustness improvements.

citing papers explorer

Showing 4 of 4 citing papers.

Hallucination as Exploit: Evidence-Carrying Multimodal Agents cs.AI · 2026-05-18 · unverdicted · none · ref 13 · 2 links
Evidence-carrying multimodal agents decompose tool calls into predicates, obtain certificates from DOM/OCR/AX verifiers, and use a deterministic gate to authorize actions only when certificates support them, achieving zero unsafe executions in tested tasks.
Progent: Securing AI Agents with Privilege Control cs.CR · 2025-04-16 · unverdicted · none · ref 64
Progent introduces a privilege-control framework for AI agents that uses LLM-generated symbolic rules over tools, SMT-solver-enforced monotonic updates, and deterministic checks to reduce attack success rates on AgentDojo and ASB benchmarks.
Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation cs.CR · 2026-05-07 · unverdicted · none · ref 52
A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.
Agent Security is a Systems Problem cs.CR · 2026-05-18 · unverdicted · none · ref 75 · 2 links
The paper argues that agent security is best addressed as a systems problem by applying principles from operating systems, networks, and formal methods rather than relying solely on model robustness improvements.

Agentvigil: Generic black-box red-teaming for indirect prompt injection against LLM agents

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer