Jailbreak large vision-language models through multi-modal linkage
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.AI (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
PHANTOM is a consolidated open-source dataset of 47,524 multimodal adversarial samples for VLMs, extending prior benchmarks across 10 high-level categories and 55 subcategories of harmful intents.
citing papers explorer
- Every Picture Tells a Dangerous Story: Memory-Augmented Multi-Agent Jailbreak Attacks on VLMs
MemJack achieves 71.48% attack success rate on unmodified COCO val2017 images against Qwen3-VL-Plus by coordinating agents to map visual entities to malicious intents, apply multi-angle camouflage, and filter refusals via iterative nullspace projection while transferring strategies through a shared
- PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models
PHANTOM is a consolidated open-source dataset of 47,524 multimodal adversarial samples for VLMs, extending prior benchmarks across 10 high-level categories and 55 subcategories of harmful intents.