GAMBIT constructs gamified instructional traps that decompose harmful visuals and drive MLLMs to reconstruct and answer malicious queries as part of winning a game, achieving over 85% attack success on models including GPT-4o and Gemini 2.5 Flash.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
GAMBIT: A Gamified Jailbreak Framework for Multimodal Large Language Models