Using contrastive examples with vision-language models and a new CLIP-based scoring method called CSP produces more faithful and granular neuron labels than prior activation-only approaches.
Transformer Circuits Thread2(2023)
2 Pith papers cite this work. Polarity classification is still indexing.
years
2026 2verdicts
UNVERDICTED 2representative citing papers
A computational framework identifies more coherent themes in free-text survey data on race, gender, and sexual orientation than previous methods, with applications for survey design, explaining variation, and detecting identity discordance.
citing papers explorer
-
Contrastive Semantic Projection: Faithful Neuron Labeling with Contrastive Examples
Using contrastive examples with vision-language models and a new CLIP-based scoring method called CSP produces more faithful and granular neuron labels than prior activation-only approaches.
-
In your own words: computationally identifying interpretable themes in free-text survey data
A computational framework identifies more coherent themes in free-text survey data on race, gender, and sexual orientation than previous methods, with applications for survey design, explaining variation, and detecting identity discordance.