ShadowMerge exploits relation-channel conflicts to poison graph-based agent memory, achieving a 93.8% average attack success rate on Mem0 and real-world datasets while bypassing existing defenses.
arXiv preprint arXiv:2512.16962
Mixed citation behavior. Most common role is background (57%).
Years: 2026 (26)
Representative citing papers
Presents TMA-NM, a non-malleable origin-bound authority system for LLM-agent memory with TLA+ machine-checked separation theorems and benchmarks showing 0% attack success against direct and laundering poisoning while preserving utility.
SMSR is the first defense with a certified robustness bound against multi-session memory poisoning in persistent LLM agents, combining HMAC provenance signing with randomized ablation and verdict-based voting.
AgentCanary introduces an Entry × Impact risk taxonomy, high-fidelity real tool environments with persistent state, and multi-dimensional trajectory evaluation to assess AI agent security across models and attacks.
MemMark enables snapshot-only attribution for agent long-term memory by embedding signals via keyed distribution-preserving sampling at memory-write decisions, recovering 40-bit payloads with near-baseline utility.
Introduces CSTM-Bench with 26 cross-session attack taxonomies, demonstrates recall loss in session-bound and full-log detectors, and proposes a bounded-memory coreset reader with the CSTM metric balancing detection and serving stability.
Provenance checks in graph memory are blind to structural attacks that reallocate top-k membership; authselect prevents this by enforcing selection on the authenticated subgraph only.
MemAudit combines counterfactual causal influence scores with memory consistency graphs to identify poisoned records in LLM agent memory, reducing MINJA attack success from 70% to 0% in QA and 83.3% to 0% in reasoning tasks.
A3S-Bench evaluates LLM agents against temporal, spatial, and semantic evasions, raising average risk trigger rates from 28.3% to 52.6% across 2,254 trajectories and 20 scenarios.
OEP poisons self-evolving LLM agents by constructing clean edge-case experiences that appear locally valid yet cause harmful over-generalization during reflection, achieving over 50% attack success rate on GPT-4o agents across three domains.
MemLineage enforces untrusted-path persistence in LLM agent memory through Merkle logs, per-principal signatures, and max-of-strong-edges lineage propagation, achieving zero ASR on three poisoning workloads with sub-millisecond overhead.
Sleeper channels enable persistent prompt injection in always-on AI agents via persistence substrate and firing separation, countered by provenance gates using action digests and owner attestations with a soundness theorem.
AgentWard organizes stage-specific security controls with cross-layer coordination to intercept threats across the full lifecycle of autonomous AI agents.
GAAP guarantees confidentiality of private user data for AI agents by enforcing user-specified permissions deterministically through persistent information flow tracking, without trusting the agent or requiring attack-free models.
Claw AI agents' heartbeat background execution shares memory context with user sessions, allowing ordinary social misinformation to silently pollute long-term memory and shape behavior at rates up to 76% across sessions.
No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.
PMD extracts and distills cross-episode procedural knowledge from RL rollouts into LLM policies at three abstraction levels, yielding 3.8-13.6% gains over SDPO on SCIKNOWEVAL and LIVECODEBENCH via co-evolution.
A net-value-per-byte curator governs memory lifecycle in on-device LLM agents, cutting memory 2.7x and uplink 2.4x while driving injection success to zero on task-drift benchmarks and Jetson hardware.
Injecting one poisoned playbook into security knowledge sources causes systematic behavior changes in RAG AI agents across multiple LLMs and tasks, explained by a Verification Boundary classification.
Self-evolving LLM agents introduce persistent, amplifying security threats that static defenses cannot address, as shown by analysis of 25 attack surface cells and case studies.
Causal Memory Intervention selects memories based on estimated causal impact on LLM answers rather than semantic similarity, with a new benchmark showing improved robustness to irrelevant or harmful memories.
Memory poisoning via lost-provenance documents in agent memory stores creates agent misconduct that safety systems misattribute to model failure; the paper defines Semantic Norm Drift, releases a benchmark, and proposes a new testing method plus a defense.
Protected policy placements in LLM agents maintain integrity under replay pressure on AutoGen and OpenHands traces, unlike task-local placements which show eviction or weakening.
The paper systematizes security for LLM agents in agentic commerce into five threat dimensions, identifies 12 cross-layer attack vectors, and proposes a layered defense architecture.