LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs

Benno Krojer, Shravan Nayak, Oscar Mañas, Vaibhav Adlakha, Desmond Elliott, Siva Reddy, Marius Mosbach · arXiv 2602.00462

1 Pith paper cites this work. Polarity classification is still indexing.



representative citing papers

VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors

cs.CV · 2026-04-02 · unverdicted · novelty 6.0

VLMs bypass visual comparison by recovering semantic labels for nameable entities and hallucinate on unnamable ones, as shown by performance gaps and Logit Lens analysis.

citing papers explorer

Showing 1 of 1 citing paper.

VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors

cs.CV · 2026-04-02 · unverdicted · none · ref 8

VLMs bypass visual comparison by recovering semantic labels for nameable entities and hallucinate on unnamable ones, as shown by performance gaps and Logit Lens analysis.
