PRISM decomposes harmful instructions into benign visual gadgets and directs LVLMs via prompts to compose them through reasoning into harmful outputs, achieving ASR over 0.90 on SafeBench.
arXiv preprint arXiv:2412.05934
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CR (2)
Years: 2025 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
AuthChain poisons a single document to achieve high-success attacks on RAG systems for multi-hop queries across six LLMs while evading defenses.
citing papers explorer
- PRISM: Programmatic Reasoning with Image Sequence Manipulation for LVLM Jailbreaking
PRISM decomposes harmful instructions into benign visual gadgets and directs LVLMs via prompts to compose them through reasoning into harmful outputs, achieving ASR over 0.90 on SafeBench.
- One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems
AuthChain poisons a single document to achieve high-success attacks on RAG systems for multi-hop queries across six LLMs while evading defenses.