Lost in embeddings: Information loss in vision-language models.arXiv preprint arXiv:2509.11986, 2

Wenyan Li, Raphael Tang, Chengzu Li, Caiqi Zhang, Ivan Vulić, Anders Søgaard · 2025 · arXiv 2509.11986

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Reflection Anchors for Propagation-Aware Visual Retention in Long-Chain Multimodal Reasoning

cs.CV · 2026-05-10 · unverdicted · novelty 7.0

RAPO uses an information-theoretic lower bound on visual gain to select high-entropy reflection anchors and optimizes a chain-masked KL surrogate, delivering gains over baselines on reasoning benchmarks across LVLM backbones.

Large Vision-Language Models Get Lost in Attention

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

In LVLMs, attention can be replaced by random Gaussian weights with little or no performance loss, indicating that current models get lost in attention rather than efficiently using visual context.

Mull-Tokens: Modality-Agnostic Latent Thinking

cs.CV · 2025-12-11 · unverdicted · novelty 6.0

Mull-Tokens are modality-agnostic latent tokens that enable free-form multimodal thinking and deliver up to 16% gains on spatial reasoning benchmarks.

Do Activation Verbalization Methods Convey Privileged Information?

cs.CL · 2025-09-16 · unverdicted · novelty 5.0

Activation verbalization methods for LLMs largely reflect the verbalizer model's parametric knowledge rather than privileged information from the target model's activations.

citing papers explorer

Showing 4 of 4 citing papers.

Reflection Anchors for Propagation-Aware Visual Retention in Long-Chain Multimodal Reasoning cs.CV · 2026-05-10 · unverdicted · none · ref 10
RAPO uses an information-theoretic lower bound on visual gain to select high-entropy reflection anchors and optimizes a chain-masked KL surrogate, delivering gains over baselines on reasoning benchmarks across LVLM backbones.
Large Vision-Language Models Get Lost in Attention cs.AI · 2026-05-07 · unverdicted · none · ref 80
In LVLMs, attention can be replaced by random Gaussian weights with little or no performance loss, indicating that current models get lost in attention rather than efficiently using visual context.
Mull-Tokens: Modality-Agnostic Latent Thinking cs.CV · 2025-12-11 · unverdicted · none · ref 34
Mull-Tokens are modality-agnostic latent tokens that enable free-form multimodal thinking and deliver up to 16% gains on spatial reasoning benchmarks.
Do Activation Verbalization Methods Convey Privileged Information? cs.CL · 2025-09-16 · unverdicted · none · ref 31
Activation verbalization methods for LLMs largely reflect the verbalizer model's parametric knowledge rather than privileged information from the target model's activations.

Lost in embeddings: Information loss in vision-language models.arXiv preprint arXiv:2509.11986, 2

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer