VidHal is a new benchmark that evaluates temporal hallucinations in vision LLMs (VLLMs) through a caption ordering task, in which each video is paired with captions of varying hallucination levels.
URL https://arxiv
4 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 4
verdicts
UNVERDICTED 4
representative citing papers
Hallucinations in LVLMs largely arise from textual priors in prompts, and can be reduced by fine-tuning with preference optimization on grounded vs. hallucinated response pairs.
RUDDER creates a persistent visual anchor by extracting CARD from prefill residuals and modulating its injection via an adaptive Beta Gate, cutting CHAIR_S by 24.4% and CHAIR_I by 23.6% on average across LLaVA, Idefics2, InstructBLIP, and Qwen2.5-VL while retaining >96% throughput.
CAAC mitigates hallucinations in LVLMs via Visual-Token Calibration and Adaptive Attention Re-Scaling guided by model confidence, showing gains on CHAIR, AMBER, and POPE, especially in long-form generation.
citing papers explorer
- VidHal: Benchmarking Temporal Hallucinations in Vision LLMs
VidHal is a new benchmark that evaluates temporal hallucinations in vision LLMs (VLLMs) through a caption ordering task, in which each video is paired with captions of varying hallucination levels.
- When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs
Hallucinations in LVLMs largely arise from textual priors in prompts, and can be reduced by fine-tuning with preference optimization on grounded vs. hallucinated response pairs.
- Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models
RUDDER creates a persistent visual anchor by extracting CARD from prefill residuals and modulating its injection via an adaptive Beta Gate, cutting CHAIR_S by 24.4% and CHAIR_I by 23.6% on average across LLaVA, Idefics2, InstructBLIP, and Qwen2.5-VL while retaining >96% throughput.
- Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration
CAAC mitigates hallucinations in LVLMs via Visual-Token Calibration and Adaptive Attention Re-Scaling guided by model confidence, showing gains on CHAIR, AMBER, and POPE, especially in long-form generation.