An image is worth 1000 lies: Adversarial transferability across prompts on vision-language models
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
verdicts: UNVERDICTED (4)
citing papers explorer
- Improving Adversarial Transferability on Vision-Language Pre-training Models via Surrogate-Specific Bias Correction
  DeBias-Attack corrects surrogate-specific bias in adversarial gradients for VLP models by subtracting the projection from a reference branch optimized on weak-semantic images.
- A Cross-Modal Prompt Injection Attack against Large Vision-Language Models with Image-Only Perturbation
  CrossMPI steers both visual and textual interpretations in LVLMs through image-only perturbations by optimizing in hidden-state space at selected middle layers with distance-based budget allocation.
- Jailbreaking Frontier Foundation Models Through Intention Deception
  A multi-turn intention-deception jailbreak achieves high success on GPT-5 and Claude models while exposing para-jailbreaking where models leak harmful information without direct refusal.
- JECA^2: Judgment-Explanation Consistent Adversarial Attack against Forensic Vision-Language Models
  JECA^2 is a new white-box attack method using Grad-CAM-guided perturbations and prompt embedding optimization to achieve judgment-explanation consistent adversarial attacks on forensic VLMs.