Large reasoning models are autonomous jailbreak agents
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Same Model, Different Weakness: How Language and Modality Reshape the Jailbreak Attack Surface in Frontier MLLMs
Jailbreak vulnerability in MLLMs is language- and modality-dependent, producing rank reversals in model safety between English and Spanish conditions.