Devils in middle layers of large vision-language models: Interpreting, detecting and mitigating object hallucinations via attention lens

Zhangqi Jiang, Junkai Chen, Beier Zhu, Tingjin Luo, Yankun Shen, Xu Yang · 2025

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Responses Fall Short of Understanding: Revealing the Gap between Internal Representations and Responses in Visual Document Understanding

cs.CL · 2026-04-06 · unverdicted · novelty 5.0

Linear probing reveals a gap between internal representations and responses in LVLMs for visual document understanding, with task information encoded more linearly in intermediate layers than the final layer, and fine-tuning those layers narrows the gap.

citing papers explorer

Showing 1 of 1 citing paper.

Responses Fall Short of Understanding: Revealing the Gap between Internal Representations and Responses in Visual Document Understanding
cs.CL · 2026-04-06 · unverdicted · none · ref 25
