MemAudit combines counterfactual causal influence scores with memory consistency graphs to identify poisoned records in LLM agent memory, reducing MINJA attack success from 70% to 0% in QA and 83.3% to 0% in reasoning tasks.
Advances in Neural Information Processing Systems , volume=
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4representative citing papers
Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, termed temporal memory contamination, detected via a new trigger-probe protocol.
ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.
citing papers explorer
-
MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection
MemAudit combines counterfactual causal influence scores with memory consistency graphs to identify poisoned records in LLM agent memory, reducing MINJA attack success from 70% to 0% in QA and 83.3% to 0% in reasoning tasks.
-
Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents
Memory-equipped LLM agents exhibit increasing safety violation rates as memory accumulates across independent tasks, termed temporal memory contamination, detected via a new trigger-probe protocol.
-
ARGUS: Defending LLM Agents Against Context-Aware Prompt Injection
ARGUS defends LLM agents from context-aware prompt injections by tracking information provenance and verifying decisions against trustworthy evidence, reducing attack success to 3.8% while retaining 87.5% task utility.
- SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces