ShadowMerge exploits relation-channel conflicts to poison graph-based agent memory, achieving 93.8% average attack success rate on Mem0 and real-world datasets while bypassing existing defenses.
Fath: Authentication-based test-time defense against indirect prompt injection attacks
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.CR 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
The method prompts LLMs to output both answers and references to the executed instructions, then filters out any answers not linked to the original input instructions, reducing attack success rates to zero in tested scenarios while preserving utility.
LLM agent security is reframed as an agent-human interaction issue, supported by a survey showing industry preference for human-centric mechanisms over academic favorites and proposing a new research agenda.
citing papers explorer
-
ShadowMerge: A Novel Poisoning Attack on Graph-Based Agent Memory via Relation-Channel Conflicts
ShadowMerge exploits relation-channel conflicts to poison graph-based agent memory, achieving 93.8% average attack success rate on Mem0 and real-world datasets while bypassing existing defenses.
-
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction
The method prompts LLMs to output both answers and references to the executed instructions, then filters out any answers not linked to the original input instructions, reducing attack success rates to zero in tested scenarios while preserving utility.
-
Reframing LLM Agent Security as an Agent-Human Interaction Problem
LLM agent security is reframed as an agent-human interaction issue, supported by a survey showing industry preference for human-centric mechanisms over academic favorites and proposing a new research agenda.