The paper proposes information scope as a new interpretability axis for SAE features in CLIP and introduces the Contextual Dependency Score to separate local from global scope features, showing they influence model predictions differently.
Causal in- terpretation of sparse autoencoder features in vision
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
SAE-FT uses a sparse autoencoder on pre-trained CLIP visual representations to regularize fine-tuning by penalizing changes to semantically meaningful features, aiming for robust performance on ImageNet and distribution shifts.
citing papers explorer
-
Beyond Semantics: Disentangling Information Scope in Sparse Autoencoders for CLIP
The paper proposes information scope as a new interpretability axis for SAE features in CLIP and introduces the Contextual Dependency Score to separate local from global scope features, showing they influence model predictions differently.
-
Sparse Autoencoders enable Robust and Interpretable Fine-tuning of CLIP models
SAE-FT uses a sparse autoencoder on pre-trained CLIP visual representations to regularize fine-tuning by penalizing changes to semantically meaningful features, aiming for robust performance on ImageNet and distribution shifts.