RAGShield detects all numerical manipulations in government RAG systems via pattern-based value extraction and cross-source verification, achieving 0% attack success rate on 430 real IRS-derived attacks where embedding defenses miss 79-90%.
PoisonedRAG: Knowledge corruption attacks to retrieval- augmented generation of large language models
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CR 1years
2026 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
RAGShield: Detecting Numerical Claim Manipulation in Government RAG Systems
RAGShield detects all numerical manipulations in government RAG systems via pattern-based value extraction and cross-source verification, achieving 0% attack success rate on 430 real IRS-derived attacks where embedding defenses miss 79-90%.