Locate-then-Sparsify: Attribution Guided Sparse Strategy for Visual Hallucination Mitigation
Pith reviewed 2026-05-21 10:27 UTC · model grok-4.3
The pith
By scoring each layer's role in hallucinations and steering only the relevant ones, vision-language models reduce errors while keeping general performance intact.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Locate-Then-Sparsify for Feature Steering framework first constructs token-level and sentence-level hallucination datasets, applies causal-intervention attribution to produce per-layer relevance scores, and then converts those scores into individualized steering intensities that apply stronger corrections only to hallucination-relevant layers while leaving other layers largely untouched.
What carries the argument
Layerwise attribution scores derived from causal interventions on a constructed hallucination dataset, which are then mapped to per-layer feature-steering intensities.
If this is right
- Uniform steering across all layers can be replaced by sparse, score-driven intensities that preserve capability on non-hallucination tasks.
- The same attribution pipeline can be applied to new LVLMs without retraining the base model.
- Both token-level and sentence-level hallucination cases are handled by the single layerwise intensity map.
- Inference cost stays the same because only the steering strengths change, not the number of operations.
Where Pith is reading between the lines
- The method could be extended to other error modes such as factual inconsistency by building analogous attribution datasets.
- Dynamic, input-dependent attribution might further improve results if computed on the fly for each query.
- The approach suggests that many mitigation techniques now applied globally could benefit from first locating the responsible components inside the model.
Load-bearing premise
The causal-intervention method accurately measures how much each layer contributes to hallucinations, and the token- and sentence-level dataset captures the main patterns that need correction.
What would settle it
A controlled comparison in which the same steering strengths are reassigned to random layers instead of the attributed high-relevance layers, with hallucination metrics then checked to see whether mitigation drops sharply.
Figures
read the original abstract
Despite the significant advancements in Large Vision-Language Models (LVLMs), their tendency to generate hallucinations undermines reliability and restricts broader practical deployment. Among the hallucination mitigation methods, feature steering emerges as a promising approach that reduces erroneous outputs in LVLMs without increasing inference costs. However, current methods apply uniform feature steering across all layers. This heuristic strategy ignores inter-layer differences, potentially disrupting layers unrelated to hallucinations and ultimately leading to performance degradation on general tasks. In this paper, we propose Locate-Then-Sparsify for Feature Steering (LTS-FS), a plug-and-play framework which controls the steering intensity according to the hallucination relevance of each layer. We first construct a dataset comprising token-level and sentence-level hallucination cases. Based on this dataset, we introduce an attribution method based on causal interventions to quantify the hallucination relevance of each layer. With the attribution scores across layers, we propose a layerwise strategy that converts these scores into feature steering intensities for individual layers, enabling more precise adjustments specifically on hallucination-relevant layers. Extensive experiments across multiple LVLMs and benchmarks demonstrate that LTS-FS effectively mitigates hallucination while preserving strong performance. Codes are available at https://github.com/huttersadan/LTS-FS.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Locate-Then-Sparsify for Feature Steering (LTS-FS), a plug-and-play framework for mitigating visual hallucinations in LVLMs. It constructs a dataset of token-level and sentence-level hallucination cases, applies causal-intervention attribution to quantify hallucination relevance per layer, converts the resulting scores into layer-specific steering intensities via a sparsification rule, and reports that this targeted approach reduces hallucinations more effectively than uniform steering while preserving performance across multiple LVLMs and benchmarks. Code is released at a public GitHub repository.
Significance. If the attribution method can be shown to isolate hallucination-driving layers rather than generic sensitivity, the work would offer a principled improvement over uniform feature steering by enabling sparse, less disruptive interventions. The public code release supports reproducibility and is a clear strength.
major comments (2)
- [Attribution method] The central claim that LTS-FS enables precise adjustments on hallucination-relevant layers rests on the causal-intervention attribution scores accurately isolating hallucination relevance. However, the manuscript provides no controls comparing attribution-guided layer selection against (a) random subsets of equal cardinality or (b) layers ranked by impact on non-hallucination metrics. In transformer LVLMs, interventions propagate through residual streams, so measured changes may reflect correlated downstream or task-general effects rather than specific hallucination drivers. This directly threatens the justification for the layerwise sparsification strategy (Abstract and attribution-method description).
- [Layerwise strategy] The conversion of attribution scores into per-layer steering intensities is described only at a high level. Without explicit details on the sparsification rule, any free parameters in the score-to-intensity mapping, or ablation of the conversion choices, it is unclear whether the reported gains are robust or depend on dataset-specific tuning. This is load-bearing for the claim of improved performance preservation (Abstract).
minor comments (2)
- The abstract states that experiments demonstrate effectiveness but omits reporting of statistical significance, exact implementation of the attribution (e.g., activation patching vs. scaling), and any ablation on the constructed hallucination dataset.
- Figure and table captions should explicitly state the number of runs, random seeds, and whether error bars reflect standard deviation or standard error.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript accordingly to strengthen the presentation and justification of our method.
read point-by-point responses
-
Referee: [Attribution method] The central claim that LTS-FS enables precise adjustments on hallucination-relevant layers rests on the causal-intervention attribution scores accurately isolating hallucination relevance. However, the manuscript provides no controls comparing attribution-guided layer selection against (a) random subsets of equal cardinality or (b) layers ranked by impact on non-hallucination metrics. In transformer LVLMs, interventions propagate through residual streams, so measured changes may reflect correlated downstream or task-general effects rather than specific hallucination drivers. This directly threatens the justification for the layerwise sparsification strategy (Abstract and attribution-method description).
Authors: We thank the referee for this important observation. We agree that additional controls are needed to demonstrate that the attribution scores capture hallucination-specific relevance rather than generic layer sensitivity. In the revised manuscript we add experiments comparing attribution-guided layer selection against (a) random subsets of equal cardinality and (b) layers ranked by impact on non-hallucination metrics such as standard VQA accuracy. These controls show that our method yields a superior hallucination-mitigation versus performance-preservation trade-off. We also expand the discussion of the causal-intervention procedure to clarify how layer-specific interventions, with other layers held fixed, measure the direct causal contribution to output hallucination rate and thereby mitigate concerns about residual-stream propagation. revision: yes
-
Referee: [Layerwise strategy] The conversion of attribution scores into per-layer steering intensities is described only at a high level. Without explicit details on the sparsification rule, any free parameters in the score-to-intensity mapping, or ablation of the conversion choices, it is unclear whether the reported gains are robust or depend on dataset-specific tuning. This is load-bearing for the claim of improved performance preservation (Abstract).
Authors: We agree that the layerwise strategy requires a more explicit description. In the revised manuscript we provide the precise sparsification rule, the normalization procedure, the free parameters (threshold and scaling factor), and how they are selected on a held-out validation split. We also add an ablation study that varies the sparsification threshold and the functional form of the score-to-intensity mapping. The results confirm that the reported gains remain stable across reasonable parameter choices and are not artifacts of dataset-specific tuning, thereby supporting the claim of improved performance preservation. revision: yes
Circularity Check
No significant circularity in derivation chain
full rationale
The paper's chain proceeds by constructing a hallucination dataset, applying causal interventions to derive per-layer attribution scores, and then using a deterministic layerwise conversion rule to set steering intensities; none of these steps reduce by construction to the final benchmark performance numbers or to a self-referential definition of the target quantity. The attribution scores are computed via interventions on the constructed cases rather than by optimizing against the reported mitigation metrics, and the subsequent sparsification rule is presented as a fixed function of those scores. Evaluation on separate benchmarks across multiple LVLMs supplies an independent test, with no load-bearing self-citations, uniqueness theorems, or ansatzes imported from prior author work. The derivation therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- score-to-intensity conversion parameters
axioms (1)
- domain assumption Causal interventions on the constructed dataset reveal true per-layer hallucination relevance
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
we introduce an attribution method based on causal interventions to quantify the hallucination relevance of each layer... s_l_tok = sum log(P(y|h,a_l)/P(y|h,a_l ⊙ M_h))
-
IndisputableMonolith/Foundation/BranchSelection.leanbranch_selection unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
layerwise strategy that converts these scores into feature steering intensities... m_l = 1[s_l ≥ τ] ... λ_l = λ * m_l + λ * s̃_l
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Flamingo: a visual language model for few-shot learning
Jean-Baptiste Alayrac and et al. Flamingo: a visual language model for few-shot learning. InAdvances in Neural Informa- tion Processing Systems (NeurIPS), 2022. 1
work page 2022
-
[2]
Wenbin An, Feng Tian, Sicong Leng, Jiahao Nie, Haonan Lin, Qianying Wang, Ping Chen, Xiaoqin Zhang, and Shijian Lu. Mitigating object hallucinations in large vision-language models with assembly of global and local attention. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 29915–29926, 2025. 2, 6
work page 2025
-
[3]
Fluctuation-based adaptive structured pruning for large lan- guage models
Yongqi An, Xu Zhao, Tao Yu, Ming Tang, and Jinqiao Wang. Fluctuation-based adaptive structured pruning for large lan- guage models. InProceedings of the AAAI Conference on Artificial Intelligence, 2024. 3
work page 2024
-
[4]
Shuai Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, Sibo Song, Kai Dang, Peng Wang, Shijie Wang, Jun Tang, et al. Qwen2. 5-vl technical report.arXiv preprint arXiv:2502.13923, 2025. 6
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[5]
Chao Bi, Tiantian Dang, Shuhui Wang, Feng Cao, and Qing- ming Huang. Asking questions to alleviate object hallucina- tion in large vision-language models.IEEE Transactions on Circuits and Systems for Video Technology, 2025. 2
work page 2025
-
[6]
David M. Chan, Suzanne Petryk, et al. Clair: Evaluating im- age captions with large language models. InEMNLP 2023,
work page 2023
-
[7]
Prompt-prompted adaptive structured pruning for efficient llm generation
Harry Dong, Beidi Chen, and Yuejie Chi. Prompt-prompted adaptive structured pruning for efficient llm generation. arXiv preprint arXiv:2404.01365, 2024. 3
-
[8]
Learning to prune deep neural networks via layer-wise optimal brain surgeon
Xin Dong, Shangyu Chen, and Sinno Jialin Pan. Learning to prune deep neural networks via layer-wise optimal brain surgeon. InNeurIPS, 2017. 3
work page 2017
-
[9]
Qianyu Feng, Yu Wu, Hehe Fan, Chenggang Yan, Mingliang Xu, and Yi Yang. Cascaded revision network for novel object captioning.IEEE Transactions on Circuits and Systems for Video Technology, 30(10):3413–3421, 2020. 2
work page 2020
-
[10]
Mme: A comprehensive evaluation bench- mark for multimodal large language models
Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, et al. Mme: A comprehensive evaluation bench- mark for multimodal large language models. InThe Thirty- ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2025. 5, 6
work page 2025
-
[11]
Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures
Hengyuan Hu, Rui Peng, Yu-Wing Tai, and Chi-Keung Tang. Network trimming: A data-driven neuron pruning ap- proach towards efficient deep architectures.ArXiv preprint, abs/1607.03250, 2016. 3
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[12]
A survey on evaluation of multimodal large language models.arXiv preprint arXiv:2408.15769, 2024
Jing Huang et al. A survey on evaluation of multimodal large language models.arXiv preprint arXiv:2408.15769, 2024. 1
-
[13]
Lei Huang, Weijiang Yu, Weitao Ma, Weihong Zhong, Zhangyin Feng, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, et al. A survey on hallucination in large language models: Principles, tax- onomy, challenges, and open questions.arXiv preprint arXiv:2311.05232, 2023. 2
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[14]
Qidong Huang, Xiaoyi Dong, Pan Zhang, Bin Wang, Con- ghui He, Jiaqi Wang, Dahua Lin, Weiming Zhang, and Nenghai Yu. Opera: Alleviating hallucination in multi- modal large language models via over-trust penalty and retrospection-allocation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 13418–13427, 2024. 2, 3, 4
work page 2024
-
[15]
Gqa: A new dataset for real-world visual reasoning and compositional question answering
Drew A Hudson and Christopher D Manning. Gqa: A new dataset for real-world visual reasoning and compositional question answering. InProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, pages 6700–6709, 2019. 6
work page 2019
-
[16]
Survey of hallucination in natural language generation.ACM Computing Surveys, 55(12):1–38, 2023
Ziwei Ji, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. Survey of hallucination in natural language generation.ACM Computing Surveys, 55(12):1–38, 2023. 2
work page 2023
-
[17]
Survey of hal- lucination in natural language generation.ACM Computing Surveys, 2023
Ziwei Ji, Nayeon Lee, Rita Frieske, et al. Survey of hal- lucination in natural language generation.ACM Computing Surveys, 2023. 1
work page 2023
-
[18]
Model sparsity can simplify machine unlearning, 2024
Jinghan Jia, Jiancheng Liu, Parikshit Ram, Yuguang Yao, Gaowen Liu, Yang Liu, Pranay Sharma, and Sijia Liu. Model sparsity can simplify machine unlearning, 2024. 3
work page 2024
-
[19]
Sicong Leng, Hang Zhang, Guanzheng Chen, Xin Li, Shijian Lu, Chunyan Miao, and Lidong Bing. Mitigating object hal- lucinations in large vision-language models through visual contrastive decoding. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 13872–13882, 2024. 2, 6
work page 2024
-
[20]
Kenneth Li, Oam Patel, Fernanda Vi ´egas, Hanspeter Pfister, and Martin Wattenberg. Inference-time intervention: Elicit- ing truthful answers from a language model.Advances in Neural Information Processing Systems, 36:41451–41530,
-
[21]
Evaluating Object Hallucination in Large Vision-Language Models
Y . Li and et al. Evaluating object hallucination in large vision-language models.arXiv preprint arXiv:2305.10355,
work page internal anchor Pith review Pith/arXiv arXiv
-
[22]
Evaluating object hallucination in large vision-language models
Yifan Li, Yifan Du, Kun Zhou, Jinpeng Wang, Wayne Xin Zhao, and Ji-Rong Wen. Evaluating object hallucination in large vision-language models. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Pro- cessing, pages 292–305, 2023. 3, 5, 6
work page 2023
-
[23]
Continual learning via sparse memory finetuning, 2025
Jessy Lin, Luke Zettlemoyer, Gargi Ghosh, Wen-Tau Yih, Aram Markosyan, Vincent-Pierre Berges, and Barlas O ˘guz. Continual learning via sparse memory finetuning, 2025. 3
work page 2025
-
[24]
Microsoft coco: Common objects in context
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll´ar, and C Lawrence Zitnick. Microsoft coco: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13, pages 740–755. Springer, 2014. 6
work page 2014
-
[25]
Mitigating hallucination in large multi-modal models via robust instruction tuning
Fuxiao Liu, Kevin Lin, Linjie Li, Jianfeng Wang, Yaser Ya- coob, and Lijuan Wang. Mitigating hallucination in large multi-modal models via robust instruction tuning. InThe Twelfth International Conference on Learning Representa- tions, pages 1–12, 2023. 2
work page 2023
-
[26]
Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. Visual instruction tuning.arXiv preprint arXiv:2304.08485,
work page internal anchor Pith review Pith/arXiv arXiv
-
[27]
Improved baselines with visual instruction tuning
Haotian Liu, Chunyuan Li, Yuheng Li, and Yong Jae Lee. Improved baselines with visual instruction tuning. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 26296–26306, 2024. 5, 6
work page 2024
-
[28]
Visual instruction tuning.Advances in neural information processing systems, 36, 2024
Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. Visual instruction tuning.Advances in neural information processing systems, 36, 2024. 6
work page 2024
-
[29]
A Survey on Hallucination in Large Vision-Language Models
Hanchao Liu, Wenyuan Xue, Yifei Chen, Dapeng Chen, Xiu- tian Zhao, Ke Wang, Liping Hou, Rongjun Li, and Wei Peng. A survey on hallucination in large vision-language models. arXiv preprint arXiv:2402.00253, 2024. 2
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[30]
Jinzhe Liu, Junshu Sun, Shufan Shen, Chenxue Yang, and Shuhui Wang. Edit less, achieve more: Dynamic sparse neu- ron masking for lifelong knowledge editing in llms, 2025. 3
work page 2025
-
[31]
Reducing hallucina- tions in large vision-language models via latent space steer- ing
Sheng Liu, Haotian Ye, and James Zou. Reducing hallucina- tions in large vision-language models via latent space steer- ing. InThe Thirteenth International Conference on Learning Representations, 2025. 2, 5, 6
work page 2025
-
[32]
Yan Liu, Yu Liu, Xiaokang Chen, Pin-Yu Chen, Daoguang Zan, Min-Yen Kan, and Tsung-Yi Ho. The devil is in the neu- rons: Interpreting and mitigating social biases in pre-trained language models, 2024. 3
work page 2024
-
[33]
Object hallucination in image cap- tioning
Anna Rohrbach, Lisa Anne Hendricks, Kaylee Burns, Trevor Darrell, and Kate Saenko. Object hallucination in image cap- tioning. InProceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4035–4045,
work page 2018
-
[34]
Prasanta Sahoo et al. A comprehensive survey of hallucina- tion mitigation techniques in large language models.Find- ings of EMNLP, 2024. 1
work page 2024
-
[35]
A-okvqa: A benchmark for visual question answering using world knowl- edge
Dustin Schwenk, Apoorv Khandelwal, Christopher Clark, Kenneth Marino, and Roozbeh Mottaghi. A-okvqa: A benchmark for visual question answering using world knowl- edge. InEuropean conference on computer vision, pages 146–162. Springer, 2022. 6
work page 2022
-
[36]
Expanding sparse tuning for low memory usage
Shufan Shen, Junshu Sun, Xiangyang Ji, Qingming Huang, and Shuhui Wang. Expanding sparse tuning for low memory usage. InNeurIPS, 2024. 3
work page 2024
-
[37]
Suraj Srinivas and R. Venkatesh Babu. Data-free parameter pruning for deep neural networks. InBMVC, 2015. 3
work page 2015
-
[38]
Boyao Wang and V olodymyr Kindratenko. Rl-pruner: Struc- tured pruning using reinforcement learning for cnn com- pression and acceleration.arXiv preprint arXiv:2411.06463,
-
[39]
Vigc: Visual instruction generation and correction
Bin Wang, Fan Wu, Xiao Han, Jiahui Peng, Huaping Zhong, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, et al. Vigc: Visual instruction generation and correction. InProceedings of the AAAI Conference on Artificial Intel- ligence, pages 5309–5317, 2024. 2
work page 2024
-
[40]
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
Peng Wang, Shuai Bai, et al. Qwen2-vl: Enhancing vision- language model’s understanding of the open world.arXiv preprint arXiv:2409.12191, 2024. 1
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[41]
Yuanchen Wu, Lu Zhang, Hang Yao, Junlong Du, Ke Yan, Shouhong Ding, Yunsheng Wu, and Xiaoqiang Li. Anti- dote: A unified framework for mitigating lvlm hallucinations in counterfactual presupposition and object perception. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 14646–14656, 2025. 3, 5
work page 2025
-
[42]
Nullu: Mitigating object hallucinations in large vision-language models via halluspace projection
Le Yang, Ziwei Zheng, Boxu Chen, Zhengyu Zhao, Chenhao Lin, and Chao Shen. Nullu: Mitigating object hallucinations in large vision-language models via halluspace projection. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 14635–14645, 2025. 1, 2, 5, 6, 8
work page 2025
-
[43]
Designing energy-efficient convolutional neural networks using energy- aware pruning
Tien-Ju Yang, Yu-Hsin Chen, and Vivienne Sze. Designing energy-efficient convolutional neural networks using energy- aware pruning. InCVPR, 2017. 3
work page 2017
-
[44]
A Survey on Multimodal Large Language Models
Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, Tong Xu, and Enhong Chen. A survey on multimodal large language models.National Science Review, 2024. Earlier arXiv:2306.13549. 1
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[45]
Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xing Sun, and Enhong Chen. Woodpecker: Hallucination correction for multimodal large language models.Science China Information Sciences, 67(12):220105, 2024. 2
work page 2024
-
[46]
Neuron-level knowl- edge attribution in large language models
Zeping Yu and Sophia Ananiadou. Neuron-level knowl- edge attribution in large language models. InProceedings of the 2024 Conference on Empirical Methods in Natural Lan- guage Processing, pages 3267–3280, 2024. 4
work page 2024
-
[47]
Zeping Yu and Sophia Ananiadou. Understanding multi- modal llms: the mechanistic interpretability of llava in visual question answering.arXiv preprint arXiv:2411.10950, 2024. 4
-
[48]
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yu- long Chen, et al. Siren’s song in the ai ocean: a survey on hallucination in large language models.arXiv preprint arXiv:2309.01219, 2023. 2
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[49]
Analyzing and mitigating object hallucination in large vision-language models
Yiyang Zhou, Chenhang Cui, Jaehong Yoon, Linjun Zhang, Zhun Deng, Chelsea Finn, Mohit Bansal, and Huaxiu Yao. Analyzing and mitigating object hallucination in large vision-language models. InThe Twelfth International Con- ference on Learning Representations, 2024. 4 Locate-then-Sparsify: Attribution Guided Sparse Strategy for Visual Hallucination Mitigati...
work page 2024
-
[50]
Details of the construction of the dataset In this section, we introduce the details of how to construct the Bi-granularity Dataset. At first, to preserve generalization, the data used for dataset construction and the data used for experiments are strictly disjoint. Particularly, for data selected based on CHAIR and POPE, we use data from train spilt of M...
-
[51]
The mask threshold rs is selected to be 0.5, as shown in Tab.5 of main text
Implementation details of LTS-FS Hyper-parameters.The strength control parameters of sl tok:λ cue, λpos, λhall is set to be 1. The mask threshold rs is selected to be 0.5, as shown in Tab.5 of main text. Environment.All the experiments are conducted on one A100 80G. For 7B model, two RTX3090 24G can replace A100. For detailed python requirements, please r...
-
[52]
Compared methods.We employ the default parameters and settings as reported in the original papers
Implementation Settings of CHAIR Results Generation Setting.Here we set the generation config as follows:Max New Tokens=128,num beams=1, and sampling=False. Compared methods.We employ the default parameters and settings as reported in the original papers
-
[53]
Generation Capability. To evaluate general capability more comprehensively, we perform an evaluation using a broader benchmark called CLAIR [6]. This result in Tab. 7 shows that LTS-FS achieves a better trade-off between hallucination mitigation and general capability preservation. Table 7. Trade-off between hallucination mitigation and general capability...
-
[54]
Compared methods.We employ the default parameters and settings as reported in original papers
More details of POPE results Generation Setting.Here we set the generation config as follows:Max New Tokens=16,num beams=1, andsam- pling=False. Compared methods.We employ the default parameters and settings as reported in original papers. Total Results.The total results is shown in Tab. 13. Across all settings, our LTS-FS framework achieves the best accu...
-
[55]
More details of MME results We report the MME numerical results in Tab. 8. The nu- merical results demonstrate that LTS-FS can strongly in- crease the mitigation abilitity of feature steering methods. Specifically, across the subset most related to hallucina- tion: Count, and Position, LTS-FS achieves great improve- ments, highlighting its effectiveness i...
-
[56]
Time Analysis There are two time cost analysis, the time to apply methods and the time for inference. The time to apply methods is the time to employ a hallucination mitigation method into a specific LVLMs. As an example, in order to apply VTI to LVLMs, the direction vector needs to be computed and the Table 8. Results on all MME perception-related tasks....
-
[57]
Ablation Study about Indicators In this section, we discuss the effect of the three indicator in sentence level hallucination attribution. The result is shown in Tab. 11. We investigate the effect of removing each indi- cator in turn and find thatw/o cue indicatorandw/o position indicatoryield only small changes, whereasw/o hallucina- tioncauses a much la...
-
[58]
Discussion about Generalization To assess generalization beyond the construction sources, we evaluate on datasets whose distributions differ from those used to build our bi-granularity labels. Although the construction leverages CHAIR, POPE, and Antidote, we additionally report results on MME and LLaV A-Bench, which serve as out of- istribution dataset of...
-
[59]
7, which demonstrates our the effectiveness of our framework in hallucination mitigation
More cases in LLaV A-bench More case studies on the LLaV A-bench are presented in Fig. 7, which demonstrates our the effectiveness of our framework in hallucination mitigation. In particular, color and count attributes are given greater emphasis, thereby avoiding hallucinations in these aspects
-
[60]
GPT4v-Evaluation prompt Following VCD, the prompt for GPT4v-aided evaluation is shown in Fig. 8. The GPT4v receive three type of LVLM’s responses and then generate output. Then we collect the output from GPT4v and finally report the average accuracy and detailedness
-
[61]
Limitation and future work Although our approach can be effectively ported to feature- steering methods and achieves strong hallucination mitiga- tion, there is still room for development. Since existing fea- ture steering techniques have not been evaluated on larger 70B-scale models, extending our method to 70B models re- mains a challenge. We aim to ext...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.