Natural language descriptions of deep visual features

Evan Hernandez, Sarah Schwettmann, David Bau, Teona Bagashvili, Antonio Torralba, Jacob Andreas · 2021 · arXiv 2201.11114

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

cs.AI · 2026-05-28 · unverdicted · novelty 6.0

Sparse autoencoders scaled to 34 million features on Claude 3 Sonnet yield interpretable, steerable representations of concrete and abstract concepts that generalize across languages and modalities.

Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.

A Single Rewrite Suffices: Empirical Lessons from Production Skill Description Optimization

cs.CL · 2026-06-29 · unverdicted · novelty 5.0

A single LLM rewrite of skill descriptions using false positive and negative cases matches manual optimization performance in production, with most other pipeline components adding little value.

citing papers explorer

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

cs.AI · 2026-05-28 · unverdicted · none · ref 34

Sparse autoencoders scaled to 34 million features on Claude 3 Sonnet yield interpretable, steerable representations of concrete and abstract concepts that generalize across languages and modalities.

Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces

cs.LG · 2026-05-12 · unverdicted · none · ref 275

A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.

A Single Rewrite Suffices: Empirical Lessons from Production Skill Description Optimization

cs.CL · 2026-06-29 · unverdicted · none · ref 26

A single LLM rewrite of skill descriptions using false positive and negative cases matches manual optimization performance in production, with most other pipeline components adding little value.