Detecting and Preventing Hallucinations in Large Vision Language Models

Anisha Gunjal, Jihan Yin, Erhan Bas · 2024 · DOI 10.1609/aaai.v38i16.29771

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

DetailVerifyBench: A Benchmark for Dense Hallucination Localization in Long Image Captions

cs.CV · 2026-04-07 · unverdicted · novelty 7.0

DetailVerifyBench supplies 1,000 images and densely annotated long captions to evaluate precise hallucination localization in multimodal large language models.

Deep Pre-Alignment for VLMs

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

Deep Pre-Alignment uses a small VLM perceiver instead of ViT to pre-align visual features with LLM text space, yielding 1.9-3.0 point gains on multimodal benchmarks and 32.9% less language forgetting.

MHSA: A Lightweight Framework for Mitigating Hallucinations via Steered Attention in LVLMs

cs.CV · 2026-05-14 · unverdicted · novelty 5.0

MHSA mitigates hallucinations in LVLMs by training an MLP to steer cross-modal attention, extending detection work to mitigation via attention replacement at inference.

citing papers explorer

Showing 3 of 3 citing papers.

DetailVerifyBench: A Benchmark for Dense Hallucination Localization in Long Image Captions

cs.CV · 2026-04-07 · unverdicted · none · ref 15

DetailVerifyBench supplies 1,000 images and densely annotated long captions to evaluate precise hallucination localization in multimodal large language models.

Deep Pre-Alignment for VLMs

cs.CV · 2026-05-14 · unverdicted · none · ref 119

Deep Pre-Alignment uses a small VLM perceiver instead of ViT to pre-align visual features with LLM text space, yielding 1.9-3.0 point gains on multimodal benchmarks and 32.9% less language forgetting.

MHSA: A Lightweight Framework for Mitigating Hallucinations via Steered Attention in LVLMs

cs.CV · 2026-05-14 · unverdicted · none · ref 11

MHSA mitigates hallucinations in LVLMs by training an MLP to steer cross-modal attention, extending detection work to mitigation via attention replacement at inference.
