MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks

Daniel Kang; Dimitrios Bralios; Heng Ji; Hyeonjeong Ha; Jeonghwan Kim; Kai-Wei Chang; Nanyun Peng; Qiusi Zhan; Saikrishna Sanniboina

arxiv: 2502.17832 · v4 · pith:7WMM5URAnew · submitted 2025-02-25 · 💻 cs.LG · cs.AI · cs.CR · cs.CV

Hyeonjeong Ha, Qiusi Zhan, Jeonghwan Kim, Dimitrios Bralios, Saikrishna Sanniboina, Nanyun Peng, Kai-Wei Chang, Daniel Kang, Heng Ji

classification 💻 cs.LG · cs.AI · cs.CR · cs.CV

keywords multimodal · poisoning · attack · knowledge · generation · mm-poisonrag · access · across

Retrieval-augmented generation (RAG) has become common practice in multimodal large language models (MLLMs) to enhance factual grounding and reduce hallucination. Yet its reliance on retrieval exposes MLLMs to knowledge poisoning attacks, in which adversaries deliberately inject malicious multimodal content into external knowledge bases to steer models toward generating incorrect or even harmful responses. We present MM-PoisonRAG, a framework to systematically study the vulnerability of multimodal RAG under knowledge poisoning. Specifically, we design two novel attack strategies: Localized Poisoning Attack (LPA), which implants targeted, query-specific multimodal misinformation to manipulate outputs toward attacker-controlled responses, and Globalized Poisoning Attack (GPA), which uses a single, untargeted adversarial injection to broadly corrupt reasoning and collapse generation quality across all queries. Extensive experiments on diverse tasks, multimodal RAG components, and attacker access levels reveal severe vulnerabilities: LPA achieves up to a 56% attack success rate even under restricted access, and transfers effectively across four different retrievers without re-optimizing the adversarial inputs. GPA drives model accuracy to 0% with a single poisoned entry. Moreover, both LPA and GPA bypass existing defenses, underscoring the fragility of multimodal RAG and establishing MM-PoisonRAG as a foundation for future research on securing RAG frameworks against multimodal knowledge poisoning.
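
The retrieval exposure described in the abstract can be made concrete with a toy measurement. The sketch below is not the paper's method: it uses made-up 2-D vectors in place of real multimodal embeddings, a plain cosine-similarity retriever, and a hypothetical helper `poisoned_retrieval_rate` that reports the fraction of queries whose top-k results contain a flagged entry. All names and values are assumptions for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, kb, k=1):
    """Return the ids of the k knowledge-base entries most similar to the query."""
    ranked = sorted(kb, key=lambda e: cosine(query, e["vec"]), reverse=True)
    return [e["id"] for e in ranked[:k]]


def poisoned_retrieval_rate(queries, kb, flagged_ids, k=1):
    """Fraction of queries whose top-k retrieval contains a flagged entry."""
    hits = sum(
        any(i in flagged_ids for i in retrieve(q, kb, k)) for q in queries
    )
    return hits / len(queries)


# Toy embeddings (hypothetical values, not taken from the paper).
kb = [
    {"id": "clean_a", "vec": [1.0, 0.0]},
    {"id": "clean_b", "vec": [0.0, 1.0]},
]
queries = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.5]]

# Baseline: no flagged entry exists in the clean knowledge base.
before = poisoned_retrieval_rate(queries, kb, {"poison"}, k=1)

# One extra entry placed between the two clean directions.
kb_poisoned = kb + [{"id": "poison", "vec": [0.6, 0.5]}]
after_k1 = poisoned_retrieval_rate(queries, kb_poisoned, {"poison"}, k=1)
after_k2 = poisoned_retrieval_rate(queries, kb_poisoned, {"poison"}, k=2)

print(before, after_k1, after_k2)
```

In this toy setup the rate depends strongly on `k`: a single added entry reaches more queries as the retriever returns more candidates, which is one reason a per-query retrieval audit like this can be a useful first diagnostic when evaluating defenses.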

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation
cs.CR 2026-05 unverdicted novelty 8.0

M³Att poisons medical multimodal RAG by pairing covert textual misinformation with query-agnostic visual perturbations that increase retrieval of the bad content, causing LLMs to generate clinically plausible but inco...

Security Considerations for Multi-agent Systems
cs.CR 2026-03 unverdicted novelty 6.0

No existing AI security framework covers a majority of the 193 identified multi-agent system threats in any category, with OWASP Agentic Security Initiative achieving the highest overall coverage at 65.3%.