and Teglia, Y

· 2024 · arXiv 2410.14479

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Defense effectiveness across architectural layers: a mechanistic evaluation of persistent memory attacks on stateful LLM agents

cs.CR · 2026-05-08 · unverdicted · novelty 7.0

Memory Sandbox at the memory layer reduces persistent memory attack success rate to 0% for eight of nine models with no utility cost, while input-level and retrieval-level defenses achieve near-baseline attack success rates of 88-89%.

Needle-in-RAG: Prompt-Conditioned Character-Level Traceback of Poisoned Spans in Retrieved Evidence

cs.CR · 2026-05-03 · unverdicted · novelty 7.0

RAGCharacter localizes poisoned character spans in RAG evidence via prompt-conditioned counterfactual masking and achieves the best accuracy-over-attribution trade-off across tested attacks and models.

Conflict-Aware Retriever Editing for Knowledge Injection Attacks on LLM-Based RAG Systems

cs.CR · 2026-06-16 · unverdicted · novelty 6.0

CAREATTACK adapts closed-form parameter editing with graph-based conflict resolution and lightweight anchor repair to promote malicious passages in RAG retrieval while limiting side effects on non-target queries.

RADAR: Defending RAG Dynamically against Retrieval Corruption

cs.CR · 2026-05-21 · unverdicted · novelty 6.0

RADAR defends RAG systems in dynamic settings by framing reliable context selection as a Max-Flow Min-Cut graph problem with Bayesian memory updates, claiming superior robustness, response quality, and low storage on a new dynamic dataset.

Personalization Meets Safety: Mechanisms, Risks, and Mitigations in Personalized LLMs

cs.AI · 2026-06-08 · unverdicted · novelty 5.0

A survey that maps safety risks in personalized LLMs, introduces a unified taxonomy, and highlights three structural inadequacies in existing research on user-invariant safety, isolated techniques, and short-term evaluations.

Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

cs.AI · 2025-10-27 · unverdicted · novelty 4.0

A survey that taxonomizes threats to agentic AI, reviews benchmarks and evaluation methods, discusses technical and governance defenses, and identifies open challenges.

citing papers explorer

Personalization Meets Safety: Mechanisms, Risks, and Mitigations in Personalized LLMs

cs.AI · 2026-06-08 · unverdicted · none · ref 29

A survey that maps safety risks in personalized LLMs, introduces a unified taxonomy, and highlights three structural inadequacies in existing research on user-invariant safety, isolated techniques, and short-term evaluations.
