An image is worth 1000 lies: Adversarial transferability across prompts on vision-language models

Haochen Luo, Jindong Gu, Fengyuan Liu, Philip Torr · 2024 · arXiv 2403.09766

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation

cs.CR · 2026-05-15 · unverdicted · novelty 7.0

CrossMPI steers both visual and textual interpretations in LVLMs through image-only perturbations by optimizing in hidden-state space at selected middle layers with distance-based budget allocation.

Jailbreaking Frontier Foundation Models Through Intention Deception

cs.CR · 2026-04-27 · unverdicted · novelty 7.0

A multi-turn intention-deception jailbreak achieves high success on GPT-5 and Claude models while exposing para-jailbreaking where models leak harmful information without direct refusal.

citing papers explorer

A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation
cs.CR · 2026-05-15 · unverdicted · none · ref 29
Jailbreaking Frontier Foundation Models Through Intention Deception
cs.CR · 2026-04-27 · unverdicted · none · ref 10
