Adversarial Attacks on Foundational Vision Models

Gwendolyn McDonald; Nathan Inkawhich; Ryan Luley

arxiv: 2308.14597 · v1 · pith:JJFG6GLFnew · submitted 2023-08-28 · 💻 cs.CV · cs.CR · cs.LG

Nathan Inkawhich, Gwendolyn McDonald, Ryan Luley

classification 💻 cs.CV cs.CR cs.LG

keywords models foundational attacks downstream vision adversarial clip dinov2

Rapid progress is being made in developing large, pretrained, task-agnostic foundational vision models such as CLIP, ALIGN, DINOv2, etc. In fact, we are approaching the point where these models do not have to be finetuned downstream, and can simply be used in zero-shot or with a lightweight probing head. Critically, given the complexity of working at this scale, there is a bottleneck where relatively few organizations in the world are executing the training then sharing the models on centralized platforms such as HuggingFace and torch.hub. The goal of this work is to identify several key adversarial vulnerabilities of these models in an effort to make future designs more robust. Intuitively, our attacks manipulate deep feature representations to fool an out-of-distribution (OOD) detector which will be required when using these open-world-aware models to solve closed-set downstream tasks. Our methods reliably make in-distribution (ID) images (w.r.t. a downstream task) be predicted as OOD and vice versa while existing in extremely low-knowledge-assumption threat models. We show our attacks to be potent in whitebox and blackbox settings, as well as when transferred across foundational model types (e.g., attack DINOv2 with CLIP)! This work is only just the beginning of a long journey towards adversarially robust foundational vision models.
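
The core idea in the abstract, perturbing an input so that its deep features move away from the in-distribution region until an OOD detector flags it, can be sketched on a toy problem. The following is a minimal illustration and not the authors' method: a small fixed linear map stands in for a foundation-model backbone, the OOD score is assumed to be the squared distance to the nearest class prototype, and every name, matrix, and threshold here is hypothetical.

```python
# Toy sketch only: a linear map stands in for a real backbone such as CLIP or DINOv2.
# All names, weights, prototypes, and the threshold below are illustrative assumptions.

def features(W, x):
    """Hypothetical 'backbone': a fixed linear map from input pixels to features."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def ood_score(W, x, prototypes):
    """Assumed OOD score: squared feature distance to the nearest ID class prototype."""
    f = features(W, x)
    return min(sum((fi - pi) ** 2 for fi, pi in zip(f, p)) for p in prototypes)

def id_to_ood_attack(W, x, prototypes, eps=0.1, step=0.02, iters=20):
    """Signed-gradient ascent on the OOD score inside an L-inf ball of radius eps."""
    x_adv = list(x)
    for _ in range(iters):
        f = features(W, x_adv)
        # The nearest prototype determines the current score.
        p = min(prototypes, key=lambda q: sum((fi - qi) ** 2 for fi, qi in zip(f, q)))
        diff = [fi - pi for fi, pi in zip(f, p)]
        # Gradient of ||W x - p||^2 with respect to x is 2 * W^T (W x - p).
        grad = [2 * sum(W[r][c] * diff[r] for r in range(len(W))) for c in range(len(x))]
        for c, g in enumerate(grad):
            moved = x_adv[c] + step * ((g > 0) - (g < 0))  # step along sign of gradient
            # Project back into the eps-ball around the clean input, then into [0, 1].
            moved = max(x[c] - eps, min(x[c] + eps, moved))
            x_adv[c] = max(0.0, min(1.0, moved))
    return x_adv

W = [[1.0, 0.5, -0.5], [-0.5, 1.0, 0.5]]   # toy backbone weights
prototypes = [[0.5, 0.5], [-0.5, 0.8]]     # toy ID class prototypes in feature space
threshold = 0.1                            # inputs scoring above this are flagged OOD

x_clean = [0.5, 0.4, 0.3]
x_adv = id_to_ood_attack(W, x_clean, prototypes)

print("clean score:", ood_score(W, x_clean, prototypes))
print("adversarial score:", ood_score(W, x_adv, prototypes))
```

In this toy setting the clean input scores below the threshold and the perturbed input scores above it, while no pixel moves by more than `eps`. The reverse direction (making an OOD input look ID) would descend rather than ascend on the same score.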

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Robotics-Inspired Guardrails for Foundation Models in Socially Sensitive Domains
cs.AI 2026-05 unverdicted novelty 7.0

Introduces the Grounded Observer framework that applies robotics-inspired formal constructs for runtime constraint enforcement on foundation model interaction trajectories in socially sensitive domains.

Hierarchically Robust Zero-shot Vision-language Models
cs.CV 2026-04 unverdicted novelty 7.0

A hierarchical adversarial fine-tuning method for VLMs aligns image and text embeddings at multiple hierarchy depths with theoretical margin connections to boost robustness to leaf and superclass attacks while using m...