Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents

Avital Shafran, Roei Schuster, Vitaly Shmatikov · 2024 · arXiv 2406.05870

4 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs

cs.AI · 2026-05-26 · unverdicted · novelty 6.0

RAG models exhibit a monitoring-control gap: they acknowledge epistemic conflicts in accumulating documents yet fail to constrain unsafe recommendations, with single-turn tests overestimating safety.

One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems

cs.CR · 2025-05-15 · unverdicted · novelty 6.0

AuthChain poisons a single document to achieve high-success attacks on RAG systems for multi-hop queries across six LLMs while evading defenses.

Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction

cs.CR · 2025-04-29 · unverdicted · novelty 6.0

The method prompts LLMs to output both answers and references to the executed instructions, then filters out any answers not linked to the original input instructions, reducing attack success rates to zero in tested scenarios while preserving utility.

PRA-RAG: Provably Robust Aggregation in Retrieval-Augmented Generation against Retrieval Corruption

cs.IR · 2026-05-08 · unverdicted · novelty 5.0

PRA-RAG is a new aggregation algorithm for RAG that claims provable robustness bounds against poisoned retrieved texts and reduces attack success rate to 1% while keeping 71% accuracy.

citing papers explorer

Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs

cs.AI · 2026-05-26 · unverdicted · none · ref 29

One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems

cs.CR · 2025-05-15 · unverdicted · none · ref 6

Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction

cs.CR · 2025-04-29 · unverdicted · none · ref 33

PRA-RAG: Provably Robust Aggregation in Retrieval-Augmented Generation against Retrieval Corruption

cs.IR · 2026-05-08 · unverdicted · none · ref 121
