CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks

Tsui-Wei Weng; Tuomas Oikarinen

arxiv: 2204.10965 · v5 · pith:3UV22FLQnew · submitted 2022-04-23 · 💻 cs.CV · cs.AI · cs.LG

classification 💻 cs.CV · cs.AI · cs.LG

keywords clip-dissect · neurons · available · vision · concepts · descriptions · existing · finally

In this paper, we propose CLIP-Dissect, a new technique to automatically describe the function of individual hidden neurons inside vision networks. CLIP-Dissect leverages recent advances in multimodal vision/language models to label internal neurons with open-ended concepts without the need for any labeled data or human examples. We show that CLIP-Dissect provides more accurate descriptions than existing methods for last-layer neurons, where ground truth is available, as well as qualitatively good descriptions for hidden-layer neurons. In addition, our method is very flexible: it is model-agnostic, can easily handle new concepts, and can be extended to take advantage of better multimodal models in the future. Finally, CLIP-Dissect is computationally efficient and can label all neurons from five layers of ResNet-50 in just 4 minutes, which is more than 10 times faster than existing methods. Our code is available at https://github.com/Trustworthy-ML-Lab/CLIP-dissect. Crowdsourced user study results in Appendix B further support the effectiveness of our method.
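
The abstract describes labeling neurons with concepts via a multimodal model but does not spell out the matching rule. The following is a minimal sketch of one way such labeling could work, not the authors' implementation. It assumes that per-image neuron activations and per-image concept scores (for example, image-text similarities from a CLIP-like model) have already been computed, and it uses mean-centered cosine similarity as a stand-in scoring function. The function name `label_neurons` and all toy numbers are hypothetical.

```python
import math


def _centered(values):
    """Subtract the mean so similarity reflects co-variation, not offset."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_neurons(neuron_acts, concept_scores, concepts):
    """Assign each neuron the concept whose per-image scores best track its activations.

    neuron_acts:    one list per neuron, one activation per probe image
    concept_scores: one list per concept, one image-text score per probe image
                    (assumed to come from a multimodal model; toy values here)
    concepts:       concept names, aligned with concept_scores
    """
    labels = []
    for acts in neuron_acts:
        a = _centered(acts)
        sims = [_cosine(a, _centered(scores)) for scores in concept_scores]
        best = max(range(len(concepts)), key=sims.__getitem__)
        labels.append((concepts[best], sims[best]))
    return labels


# Toy data: 4 probe images, 3 candidate concepts, 2 neurons (all hypothetical).
concepts = ["dog", "stripes", "water"]
concept_scores = [
    [0.9, 0.8, 0.1, 0.2],  # "dog" scores high on images 0 and 1
    [0.1, 0.2, 0.9, 0.1],  # "stripes" scores high on image 2
    [0.2, 0.1, 0.1, 0.9],  # "water" scores high on image 3
]
neuron_acts = [
    [5.0, 4.0, 0.5, 0.3],  # neuron 0 fires on images 0 and 1
    [0.1, 0.2, 0.1, 3.0],  # neuron 1 fires on image 3
]

labels = label_neurons(neuron_acts, concept_scores, concepts)
for idx, (name, score) in enumerate(labels):
    print(f"neuron {idx}: {name} (similarity {score:.2f})")
```

Because the concept list is just a set of strings, adding a new concept in this sketch only means adding one more row of scores; no labeled examples per concept are needed, which is the property the abstract highlights.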

Forward citations

Cited by 11 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Can Cross-Layer Transcoders Replace Vision Transformer Activations? An Interpretable Perspective on Vision
cs.CV 2026-04 unverdicted novelty 7.0

Cross-Layer Transcoders decompose ViT activations into sparse, depth-aware layer contributions that maintain zero-shot accuracy and enable faithful attribution of the final representation.
Objects Before Words: Object-First Inductive Biases for Grounding Language in Child-View Video
cs.CV 2026-06 unverdicted novelty 6.0

BabyMind improves forced-choice word grounding accuracy by 2.6 points over CVCL on SAYCam-S by using offline object masks, short-term tracking into object files, and prototype-space multiple-instance contrastive learning.
Measuring What Matters: Synthetic Benchmarks for Concept Bottleneck Models
cs.LG 2026-06 unverdicted novelty 6.0

Introduces synthetic benchmarks for concept bottleneck models that control data modality, concept choice, annotation quality, and completeness to evaluate performance in decision support and automation.
Mechanistically Interpretable Neural Encoding Reveals Fine-Grained Functional Selectivity in Human Visual Cortex
cs.CV 2026-05 unverdicted novelty 6.0

MINE uses mechanistic interpretability on language-aligned image representations to generate per-voxel feature descriptions, validated via image generation and counterfactual edits that causally shift brain activation.
Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces
cs.LG 2026-05 unverdicted novelty 6.0

A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
Letting the neural code speak: Automated characterization of monkey visual neurons through human language
q-bio.NC 2026-05 unverdicted novelty 6.0

Natural language descriptions generated via a closed-loop pipeline with digital twins capture the selectivity of most neurons in macaque V1 and V4, with synthesized images driving 96% of V4 neurons into the top or bot...
Letting the neural code speak: Automated characterization of monkey visual neurons through human language
q-bio.NC 2026-05 unverdicted novelty 6.0

Natural-language descriptions generated and verified through generative models and digital twins capture the selectivity of most neurons in macaque V1 and V4.
Hierarchical, Interpretable, Label-Free Concept Bottleneck Model
cs.CV 2026-04 unverdicted novelty 6.0

HIL-CBM is a hierarchical label-free concept bottleneck model that improves classification accuracy and explanation quality over prior single-level CBMs using a visual consistency loss and dual heads.
Beyond Interpretability: When, Why, and How Sparse Autoencoders Enable Label-Free Visual Steering
cs.CV 2025-06 unverdicted novelty 6.0

VS2 constructs steering vectors from sparse SAE features on unlabeled in-domain activations to improve zero-shot accuracy of CLIP models by 0.93-4.12% on CIFAR-100, CUB-200, and Tiny-ImageNet while remaining forward-p...
Act on What You See: Unlocking Safe Social Navigation in Vision-Language-Action Models
cs.RO 2026-06 unverdicted novelty 5.0

SALSA aligns social features and adds future-risk signals in VLA models to cut near-collisions by 86.4% and raise social accuracy from 53% to 93% on SCAND and real robots.
Beyond Explainable AI (XAI): An Overdue Paradigm Shift and Post-XAI Research Directions
cs.CY 2026-02 unverdicted novelty 4.0

Current XAI methods for DNNs and LLMs rest on paradoxes and false assumptions that demand a paradigm shift to verification protocols, scientific foundations, context-aware design, and faithful model analysis rather th...