Jailbreak vulnerability in MLLMs is language- and modality-dependent, producing rank reversals in model safety between English and Spanish conditions.
Generative language models exhibit social identity biases. Nature Computational Science, 5(1):65–75
1 Pith paper cites this work. Polarity classification is still being indexed.
fields: cs.CL (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Same Model, Different Weakness: How Language and Modality Reshape the Jailbreak Attack Surface in Frontier MLLMs