Robust CLIP: Unsupervised adversarial fine-tuning of vision embeddings for robust large vision-language models

7 Pith papers cite this work. Polarity classification is still indexing.

citation summary

years: 2026 (4), 2025 (3)

roles: background (1)

polarities: unclear (1)

representative citing papers

Hierarchically Robust Zero-shot Vision-language Models

cs.CV · 2026-04-20 · unverdicted · novelty 7.0

A hierarchical adversarial fine-tuning method for VLMs that aligns image and text embeddings at multiple hierarchy depths, with theoretical connections to margins, to boost robustness to both leaf-level and superclass-level attacks. It uses multiple trees for semantic variety.

citing papers explorer

Showing 7 of 7 citing papers.