FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts

Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang · 2025 · DOI 10.1609/aaai.v39i22.34568

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Hard to Read, Easy to Jailbreak: How Visual Degradation Bypasses MLLM Safety Alignment

cs.CV · 2026-05-08 · conditional · novelty 6.0

Degraded image resolution in MLLMs bypasses safety alignments via cognitive overload, raising jailbreak rates across perturbations.

Structured Visual Narratives Undermine Safety Alignment in Multimodal Large Language Models

cs.CR · 2026-03-23 · unverdicted · novelty 6.0

Comic-based visual narratives achieve over 90% ensemble success rates on multiple MLLMs, outperforming text and random-image baselines while breaking existing safety methods and evaluators.

Adaptive Probe-based Steering for Robust LLM Jailbreaking

cs.CR · 2026-05-19 · unverdicted · novelty 5.0

Adaptive probe-based steering guided by model extraction and activation statistics improves LLM jailbreak success rates from 6% to 70% average harmfulness without extra contrastive prompts or manual tuning.

Break the Brake, Not the Wheel: Untargeted Jailbreak via Entropy Maximization

cs.CV · 2026-05-11

citing papers explorer

Showing 4 of 4 citing papers.

Hard to Read, Easy to Jailbreak: How Visual Degradation Bypasses MLLM Safety Alignment
cs.CV · 2026-05-08 · conditional · none · ref 42
Degraded image resolution in MLLMs bypasses safety alignments via cognitive overload, raising jailbreak rates across perturbations.

Structured Visual Narratives Undermine Safety Alignment in Multimodal Large Language Models
cs.CR · 2026-03-23 · unverdicted · none · ref 14
Comic-based visual narratives achieve over 90% ensemble success rates on multiple MLLMs, outperforming text and random-image baselines while breaking existing safety methods and evaluators.

Adaptive Probe-based Steering for Robust LLM Jailbreaking
cs.CR · 2026-05-19 · unverdicted · none · ref 36
Adaptive probe-based steering guided by model extraction and activation statistics improves LLM jailbreak success rates from 6% to 70% average harmfulness without extra contrastive prompts or manual tuning.

Break the Brake, Not the Wheel: Untargeted Jailbreak via Entropy Maximization
cs.CV · 2026-05-11 · unreviewed · ref 6
