HQA-VLAttack creates high-quality adversarial attacks on vision-language models via counter-fitting text substitutions and layer-guided contrastive image optimization that decreases positive pair similarity while increasing negative pair similarity, outperforming baselines on three benchmarks.
Tsang, and Qing Guo
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- HQA-VLAttack: Towards High Quality Adversarial Attack on Vision-Language Pre-Trained Models
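As a rough illustration of the contrastive idea in the summary (lower the similarity of matched image-text pairs while raising the similarity of mismatched ones), here is a minimal sketch in plain Python. It is not the authors' formulation: the function names `cosine` and `contrastive_attack_objective`, the use of cosine similarity over toy embedding vectors, and the simple "mean positive minus mean negative" form are all assumptions made for illustration. The layer-guided component, the text substitution step, and the perturbation loop itself are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_attack_objective(image_emb, positive_embs, negative_embs):
    """Hypothetical objective an attacker would try to MINIMISE.

    It is the mean similarity to matched (positive) text embeddings minus
    the mean similarity to mismatched (negative) text embeddings, so a
    lower value means the image has drifted away from its true captions
    and towards unrelated ones.
    """
    pos = sum(cosine(image_emb, p) for p in positive_embs) / len(positive_embs)
    neg = sum(cosine(image_emb, n) for n in negative_embs) / len(negative_embs)
    return pos - neg


# Toy 2-D embeddings (assumed values, not from the paper).
positive_texts = [[1.0, 0.0]]   # caption that matches the clean image
negative_texts = [[0.0, 1.0]]   # unrelated caption

clean_image = [1.0, 0.0]        # aligned with its own caption
perturbed_image = [0.0, 1.0]    # pushed towards the unrelated caption

clean_score = contrastive_attack_objective(clean_image, positive_texts, negative_texts)
perturbed_score = contrastive_attack_objective(perturbed_image, positive_texts, negative_texts)
```

In this toy setup the perturbed embedding scores lower than the clean one, which is the direction such an attack would push. A real attack would obtain the embeddings from a vision-language model and update the image pixels under a perturbation budget rather than swapping vectors by hand.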