Gradient-based and instruction-override prompt injections largely fail to survive retrieval and reranking in realistic RAG systems, while only LLM-driven injections remain effective end-to-end, and all attacks are detectable by a lightweight guard.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings