Mitigating object hallucination in MLLMs via data-augmented phrase-level alignment

Pritam Sarkar, Sayna Ebrahimi, Ali Etemad, Ahmad Beirami, Sercan Ö. Arık, Tomas Pfister · 2024 · arXiv 2405.18654

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

CAST: Mitigating Object Hallucination in Large Vision-Language Models via Caption-Guided Visual Attention Steering

cs.CV · 2026-05-06 · unverdicted · novelty 6.0

CAST reduces object hallucination in LVLMs by 6.03% on average across five models and five benchmarks by identifying caption-sensitive attention heads and applying optimized steering directions to their outputs, with negligible added inference cost.

HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

cs.AI · 2026-04-22 · unverdicted · novelty 6.0

HypEHR is a hyperbolic embedding model for EHR data that uses Lorentzian geometry and hierarchy-aware pretraining to answer clinical questions nearly as well as large language models but with much smaller size.

Uncertainty-Aware Exploratory Direct Preference Optimization for Multimodal Large Language Models

cs.LG · 2026-05-06 · unverdicted · novelty 5.0

UE-DPO quantifies epistemic uncertainty from grounding failures to direct more learning pressure on hard visual tokens in preferred samples while easing penalties on dispreferred ones.

citing papers explorer

Showing 3 of 3 citing papers.

CAST: Mitigating Object Hallucination in Large Vision-Language Models via Caption-Guided Visual Attention Steering · cs.CV · 2026-05-06 · unverdicted · none · ref 97
HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering · cs.AI · 2026-04-22 · unverdicted · none · ref 253
Uncertainty-Aware Exploratory Direct Preference Optimization for Multimodal Large Language Models · cs.LG · 2026-05-06 · unverdicted · none · ref 28
