The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.
Agentguardian: Learning access control policies to govern ai agent behavior
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 6verdicts
UNVERDICTED 6roles
background 3polarities
background 3representative citing papers
No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.
An LLM-based generator-critic loop autoformalizes natural language policies into Cedar policies that cover substantially more of the source specification than hand-coded symbolic enforcement on MedAgentBench.
LLM agent security is reframed as an agent-human interaction issue, supported by a survey showing industry preference for human-centric mechanisms over academic favorites and proposing a new research agenda.
LiSA improves AI guardrails lifelong by inducing conservative policies from sparse noisy failure reports via structured memory, conflict-aware rules, and posterior lower-bound gating.
Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.
citing papers explorer
-
Causality Laundering: Denial-Feedback Leakage in Tool-Calling LLM Agents
The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.
-
Security Considerations for Multi-agent Systems
No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.
-
Autoformalization of Agent Instructions into Policy-as-Code
An LLM-based generator-critic loop autoformalizes natural language policies into Cedar policies that cover substantially more of the source specification than hand-coded symbolic enforcement on MedAgentBench.
-
Reframing LLM Agent Security as an Agent-Human Interaction Problem
LLM agent security is reframed as an agent-human interaction issue, supported by a survey showing industry preference for human-centric mechanisms over academic favorites and proposing a new research agenda.
-
LiSA: Lifelong Safety Adaptation via Conservative Policy Induction
LiSA improves AI guardrails lifelong by inducing conservative policies from sparse noisy failure reports via structured memory, conflict-aware rules, and posterior lower-bound gating.
-
Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility
Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.