CrossMPI steers both visual and textual interpretations in LVLMs through image-only perturbations by optimizing in hidden-state space at selected middle layers with distance-based budget allocation.
An image is worth 1000 lies: Adversarial transferability across prompts on vision-language models
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CR 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
A multi-turn intention-deception jailbreak achieves high success on GPT-5 and Claude models while exposing para-jailbreaking where models leak harmful information without direct refusal.
citing papers explorer
-
A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation
CrossMPI steers both visual and textual interpretations in LVLMs through image-only perturbations by optimizing in hidden-state space at selected middle layers with distance-based budget allocation.
-
Jailbreaking Frontier Foundation Models Through Intention Deception
A multi-turn intention-deception jailbreak achieves high success on GPT-5 and Claude models while exposing para-jailbreaking where models leak harmful information without direct refusal.