Mp-nav: Enhancing data poisoning attacks against multimodal learning
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Right Predictions, Misleading Explanations: On the Vulnerability of Vision-Language Model Explanations
The paper shows that explanation heatmaps in vision-language models can be redirected to irrelevant image regions via imperceptible patch perturbations without changing model predictions, using a new attack called X-Shift.