
Memory poisoning attack and defense on memory based LLM-agents

2026 · arXiv 2601.05504

10 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

ShadowMerge: A Novel Poisoning Attack on Graph-Based Agent Memory via Relation-Channel Conflicts

cs.CR · 2026-05-09 · unverdicted · novelty 8.0 · 3 refs

ShadowMerge exploits relation-channel conflicts to poison graph-based agent memory, achieving 93.8% average attack success rate on Mem0 and real-world datasets while bypassing existing defenses.

MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

MemAudit combines counterfactual causal influence scores with memory consistency graphs to identify poisoned records in LLM agent memory, reducing MINJA attack success from 70% to 0% in QA and 83.3% to 0% in reasoning tasks.

OEP: Poisoning Self-Evolving LLM Agents via Locally Correct but Non-Transferable Experiences

cs.CR · 2026-05-18 · unverdicted · novelty 6.0

OEP poisons self-evolving LLM agents by constructing clean edge-case experiences that appear locally valid yet cause harmful over-generalization during reflection, achieving over 50% attack success rate on GPT-4o agents across three domains.

The Trap of Trajectory: Towards Understanding and Mitigating Spurious Correlations in Agentic Memory

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Agentic memory improves clean reasoning but worsens performance when spurious patterns are present in stored trajectories; CAMEL calibration reduces this reliance while preserving clean performance.

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

cs.AI · 2026-04-27 · unverdicted · novelty 6.0

SSRP separates planning from execution in LLM agents to overcome the Attention Latch, delivering 715X resilience gains over ReAct baselines on MultiWOZ tasks.

A Systematic Security Evaluation of OpenClaw and Its Variants

cs.CR · 2026-04-03 · unverdicted · novelty 6.0

All six evaluated OpenClaw agent frameworks exhibit substantial security vulnerabilities, with reconnaissance behaviors as the most common weakness and agent systems proving significantly riskier than isolated backbone models.

PYTHALAB-MERA: Validation-Grounded Memory, Retrieval, and Acceptance Control for Frozen-LLM Coding Agents

cs.CL · 2026-05-08 · unverdicted · novelty 5.0

An external controller for frozen LLMs raises strict validation success on three RL coding tasks from 0/9 to 8/9 by selecting memory records and skills, running fail-fast checks, and propagating credit via eligibility traces.

SoK: Security of Autonomous LLM Agents in Agentic Commerce

cs.CR · 2026-04-15 · unverdicted · novelty 5.0

The paper systematizes security for LLM agents in agentic commerce into five threat dimensions, identifies 12 cross-layer attack vectors, and proposes a layered defense architecture.

Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study

cs.CR · 2026-04-30 · conditional · novelty 4.0

The survey organizes security threats and defenses in autonomous LLM agents into four layers and identifies that risks can propagate across layers from inputs to ecosystem impacts.

Security, Privacy, and Ethical Risks in OpenClaw

cs.CR · 2026-05-22 · unverdicted · novelty 3.0

The paper analyzes security, privacy, and ethical risks in the OpenClaw AI agent system arising from its architecture, storage, tool use, and integrations, arguing these form major barriers to trustworthy adoption.

citing papers explorer

Showing 10 of 10 citing papers.

ShadowMerge: A Novel Poisoning Attack on Graph-Based Agent Memory via Relation-Channel Conflicts
cs.CR · 2026-05-09 · unverdicted · none · ref 49 · 3 links

MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly Detection
cs.AI · 2026-05-22 · unverdicted · none · ref 28

OEP: Poisoning Self-Evolving LLM Agents via Locally Correct but Non-Transferable Experiences
cs.CR · 2026-05-18 · unverdicted · none · ref 30

The Trap of Trajectory: Towards Understanding and Mitigating Spurious Correlations in Agentic Memory
cs.LG · 2026-05-10 · unverdicted · none · ref 43

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols
cs.AI · 2026-04-27 · unverdicted · none · ref 9

A Systematic Security Evaluation of OpenClaw and Its Variants
cs.CR · 2026-04-03 · unverdicted · none · ref 5

PYTHALAB-MERA: Validation-Grounded Memory, Retrieval, and Acceptance Control for Frozen-LLM Coding Agents
cs.CL · 2026-05-08 · unverdicted · none · ref 32

SoK: Security of Autonomous LLM Agents in Agentic Commerce
cs.CR · 2026-04-15 · unverdicted · none · ref 107

Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study
cs.CR · 2026-04-30 · conditional · none · ref 41

Security, Privacy, and Ethical Risks in OpenClaw
cs.CR · 2026-05-22 · unverdicted · none · ref 37
