NeuroTaint is the first taint tracking framework for LLM agents that uses offline auditing of semantic, causal, and persistent context to detect flows from untrusted sources to privileged sinks.
hub
Titzer, Heather Miller, and Phillip B
10 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 4polarities
background 4representative citing papers
PIIGuard uses optimized hidden HTML fragments on webpages to block LLMs from leaking contact PII via indirect prompt injection, achieving at least 97% defense success across tested models while preserving benign QA utility.
The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.
AgentDyn benchmark demonstrates that current AI agent defenses against prompt injection fail to handle dynamic real-world conditions.
Semia synthesizes Datalog representations of agent skills via constraint-guided loops to enable reachability queries for semantic risks, finding critical issues in over half of 13,728 real skills with 97.7% recall on expert-labeled samples.
GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.
The method prompts LLMs to output both answers and references to the executed instructions, then filters out any answers not linked to the original input instructions, reducing attack success rates to zero in tested scenarios while preserving utility.
A survey providing a taxonomy of TEE platforms, an agent-centric threat model, and open challenges for applying confidential computing to secure agentic AI systems.
Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.
A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.
citing papers explorer
-
Ghost in the Agent: Redefining Information Flow Tracking for LLM Agents
NeuroTaint is the first taint tracking framework for LLM agents that uses offline auditing of semantic, causal, and persistent context to detect flows from untrusted sources to privileged sinks.
-
PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization
PIIGuard uses optimized hidden HTML fragments on webpages to block LLMs from leaking contact PII via indirect prompt injection, achieving at least 97% defense success across tested models while preserving benign QA utility.
-
Causality Laundering: Denial-Feedback Leakage in Tool-Calling LLM Agents
The paper defines causality laundering as an attack leaking information from denial outcomes in LLM tool calls and proposes the Agentic Reference Monitor to block it using denial-aware provenance graphs.
-
AgentDyn: Are Your Agent Security Defenses Deployable in Real-World Dynamic Environments?
AgentDyn benchmark demonstrates that current AI agent defenses against prompt injection fail to handle dynamic real-world conditions.
-
Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis
Semia synthesizes Datalog representations of agent skills via constraint-guided loops to enable reachability queries for semantic risks, finding critical issues in over half of 13,728 real skills with 97.7% recall on expert-labeled samples.
-
An AI Agent Execution Environment to Safeguard User Data
GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.
-
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction
The method prompts LLMs to output both answers and references to the executed instructions, then filters out any answers not linked to the original input instructions, reducing attack success rates to zero in tested scenarios while preserving utility.
-
When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI
A survey providing a taxonomy of TEE platforms, an agent-centric threat model, and open challenges for applying confidential computing to secure agentic AI systems.
-
Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility
Symbolic guardrails enforce 74% of specified safety policies in agent benchmarks and boost safety without hurting utility.
-
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.