A CLIP-powered framework for robust and generalizable data selection. arXiv preprint arXiv:2410.11215
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Data Agent learns a co-evolving sample selection policy end-to-end that accelerates training by over 50% on ImageNet-1k and MMLU with no performance loss.
citing papers explorer
- Right Predictions, Misleading Explanations: On the Vulnerability of Vision-Language Model Explanations
  The paper shows that explanation heatmaps in vision-language models can be redirected to irrelevant image regions via imperceptible patch perturbations without changing model predictions, using a new attack called X-Shift.
- Data Agent: Learning to Select Data via End-to-End Dynamic Optimization
  Data Agent learns a co-evolving sample selection policy end-to-end that accelerates training by over 50% on ImageNet-1k and MMLU with no performance loss.