YARD is a training-free method using Y-shaped decoder architecture and register tokens to improve contrastive decoding for hallucination reduction in LVLMs with lower latency.
arXiv preprint arXiv:2509.25177
4 Pith papers cite this work.
fields: cs.CV (4)
years: 2026 (4)
representative citing papers
ADAPT reduces MLLM hallucinations 40-60% by aligning cross-attention dynamics via visual anchors, supervised inference, and preference tuning while preserving general capabilities.
Fox detects risky attention heads in LVLMs using visual attention entropy and severs hallucination shortcuts via numerical logit saturation and conflict-gated decoding, outperforming prior methods by 29.1%.
citing papers explorer
- YARD: Y-Architecture Register Decoding for Efficient Hallucination Mitigation in Large Vision-Language Models
  YARD is a training-free method using Y-shaped decoder architecture and register tokens to improve contrastive decoding for hallucination reduction in LVLMs with lower latency.
- ADAPT: Attention Dynamics Alignment with Preference Tuning for Faithful MLLMs
  ADAPT reduces MLLM hallucinations 40-60% by aligning cross-attention dynamics via visual anchors, supervised inference, and preference tuning while preserving general capabilities.
- Dismantling Pathological Shortcuts: A Causal Framework for Faithful LVLM Decoding
  Fox detects risky attention heads in LVLMs using visual attention entropy and severs hallucination shortcuts via numerical logit saturation and conflict-gated decoding, outperforming prior methods by 29.1%.