As firm as their foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
verdicts
UNVERDICTED 2
roles
background 1
polarities
background 1
representative citing papers
A comprehensive survey that taxonomizes safety threats to large models and agents, reviews defenses and benchmarks, and outlines open challenges.
citing papers explorer
Beyond False Stability: High-Noise Drift Gating for Test-Time Adversarial Defenses in Vision-Language Models
High-noise feature drift distinguishes adversarial from clean inputs in CLIP, allowing a plug-in gating mechanism to selectively trigger existing test-time defenses and raise mean clean+adversarial accuracy across 13 datasets.