Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning
Pith reviewed 2026-05-08 11:01 UTC · model grok-4.3
The pith
Freezing the LLM backbone and constraining visual module updates to the null space of retained knowledge lets MLLMs forget target visual concepts while keeping all other knowledge intact.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce Null Space Constrained Contrastive Visual Forgetting, which achieves unlearning by fine-tuning only the visual module while the LLM backbone remains frozen. Contrastive visual forgetting guides target visual representations toward appropriate feature-space regions, and constraining the updates to the null space of retained knowledge removes target visual knowledge without degrading non-target visual knowledge or any textual knowledge; the method also extends to continual unlearning.
What carries the argument
Null Space Constrained Contrastive Visual Forgetting, which uses contrastive separation of target visual representations together with projection of all updates into the null space associated with retained knowledge.
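The review describes the contrastive component only verbally. As a concrete reading, here is a minimal PyTorch sketch of what an InfoNCE-style separation (in the spirit of contrastive predictive coding [24], which the paper cites) could look like; the tensor names (`target_feats`, `anchor_feats`, `retain_feats`), the choice of retained features as the "appropriate region," and the temperature are all assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_visual_forgetting_loss(
    target_feats: torch.Tensor,   # (B, D) current visual features of target concepts
    anchor_feats: torch.Tensor,   # (B, D) frozen pre-unlearning features of the same inputs
    retain_feats: torch.Tensor,   # (N, D) features of retained (non-target) concepts
    temperature: float = 0.07,    # assumed hyperparameter, not from the paper
) -> torch.Tensor:
    """Hypothetical CVF objective: pull target features toward retained regions
    (positives) and push them away from their original anchors (negatives)."""
    t = F.normalize(target_feats, dim=-1)
    a = F.normalize(anchor_feats, dim=-1)
    r = F.normalize(retain_feats, dim=-1)

    # Similarity to the nearest retained feature: the region to move toward.
    pos_sim = (t @ r.T).max(dim=-1).values / temperature   # (B,)
    # Similarity to the representation being forgotten: the region to leave.
    neg_sim = (t * a).sum(dim=-1) / temperature            # (B,)

    # Two-way softmax cross-entropy; label 0 marks the positive logit.
    logits = torch.stack([pos_sim, neg_sim], dim=-1)       # (B, 2)
    labels = torch.zeros(t.size(0), dtype=torch.long, device=t.device)
    return F.cross_entropy(logits, labels)
```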
Load-bearing premise
Fine-tuning only the visual module while freezing the LLM backbone is sufficient to remove target visual knowledge without degrading non-target visual knowledge or any textual knowledge.
What would settle it
The central claim would be falsified by a controlled test in which the unlearned model either still produces accurate outputs for queries about the target visual concepts, or shows measurable accuracy drops on non-target visual or textual benchmarks.
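Operationally, that test is a simple before/after comparison. The sketch below assumes a hypothetical `eval_fn` accuracy helper and threshold values, none of which come from the paper.

```python
def claim_survives(model_before, model_after, forget_set, retain_visual, retain_text,
                   eval_fn, forget_floor=0.05, retain_drop_tol=0.01):
    """Returns False (claim falsified) if forgetting failed or retention degraded."""
    # 1) Forgetting: accuracy on target visual concepts should collapse.
    forget_acc = eval_fn(model_after, forget_set)
    # 2) Retention: non-target visual and textual accuracy should be preserved.
    vis_drop = eval_fn(model_before, retain_visual) - eval_fn(model_after, retain_visual)
    txt_drop = eval_fn(model_before, retain_text) - eval_fn(model_after, retain_text)
    return (forget_acc <= forget_floor
            and vis_drop <= retain_drop_tol
            and txt_drop <= retain_drop_tol)
```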
Original abstract
The core challenge of machine unlearning is to strike a balance between target knowledge removal and non-target knowledge retention. In the context of Multimodal Large Language Models (MLLMs), this challenge becomes even more pronounced, as knowledge is further divided into visual and textual modalities that are tightly intertwined. In this paper, we introduce an MLLM unlearning approach that aims to forget target visual knowledge while preserving non-target visual knowledge and all textual knowledge. Specifically, we freeze the LLM backbone and achieve unlearning by fine-tuning the visual module. First, we propose a Contrastive Visual Forgetting (CVF) mechanism to separate target visual knowledge from retained visual knowledge, guiding the representations of target visual concepts toward appropriate regions in the feature space. Second, we identify the null space associated with retained knowledge and constrain the unlearning process within this space, thereby significantly mitigating degradation in knowledge retention. Third, beyond static unlearning scenarios, we extend our approach to continual unlearning, where forgetting requests arrive sequentially. Extensive experiments across diverse benchmarks demonstrate that our approach achieves a strong balance between effective forgetting and robust knowledge retention.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Null Space Constrained Contrastive Visual Forgetting for unlearning target visual knowledge in Multimodal Large Language Models (MLLMs). It freezes the LLM backbone and fine-tunes only the visual module: a Contrastive Visual Forgetting (CVF) mechanism separates target from retained visual representations via contrastive guidance in feature space, while updates are constrained to the null space of retained-knowledge gradients to protect non-target visual and all textual knowledge. The method is extended to continual unlearning with sequential forgetting requests, and the abstract claims extensive experiments on diverse benchmarks demonstrate an effective balance between forgetting and retention.
Significance. If the empirical claims hold, the work offers a computationally efficient, modality-aware unlearning technique for MLLMs that avoids full-model retraining. The null-space projection idea provides a principled way to mitigate retention degradation during targeted forgetting and could generalize to other continual or selective unlearning settings in large multimodal models.
Major comments (3)
- [§3 (Method)] The load-bearing assumption (stated in the abstract and §3) that freezing the LLM while fine-tuning only the visual module isolates target visual forgetting without side-effects is not obviously true. Because the LLM was trained on the original visual feature distribution, any shift induced by CVF—even when projected into a null space—can alter input statistics to the frozen weights and degrade non-target visual or textual performance; this requires explicit ablation on retention metrics for both modalities.
- [§3.2] Null-space identification: The claim that the null space of retained-knowledge gradients can be accurately recovered from limited unlearning data is central to preventing leakage, yet high-dimensional visual feature spaces make exact recovery of the orthogonal complement unlikely. Without a concrete procedure (e.g., how many samples, how gradients are aggregated, or error bounds), it is unclear whether the constraint actually blocks updates into retained directions; a generic version of such a projection is sketched after this list.
- [Experiments section] The abstract asserts 'extensive experiments across diverse benchmarks' and a 'strong balance,' but supplies no quantitative metrics, baselines, ablation tables, or implementation details. Without these, the central empirical claim cannot be evaluated and the soundness of the method remains unverifiable.
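The abstract does not specify how the null space is identified or enforced. For reference, a standard realization (in the style of AlphaEdit's null-space constrained editing [18]) projects each visual-module update onto the orthogonal complement of the dominant row space of a retained-knowledge matrix; the sketch below is a generic reconstruction under that assumption, with `K`, `energy`, and the projector construction all illustrative rather than the paper's procedure.

```python
import torch

def null_space_projector(K: torch.Tensor, energy: float = 0.99) -> torch.Tensor:
    """Projector onto the approximate null space of retained matrix K.

    K: (N, D) rows are retained-knowledge features (or per-sample gradients).
    Keeps the top right-singular vectors explaining `energy` of the spectrum,
    then projects onto their orthogonal complement.
    """
    _, S, Vh = torch.linalg.svd(K, full_matrices=False)
    cum = torch.cumsum(S**2, dim=0) / torch.sum(S**2)
    rank = int(torch.searchsorted(cum, energy)) + 1
    V_r = Vh[:rank].T                                   # (D, rank) retained directions
    eye = torch.eye(K.shape[1], device=K.device, dtype=K.dtype)
    return eye - V_r @ V_r.T                            # (D, D) projector

# Constrained update for a weight W acting on D-dimensional features: since
# (grad @ P) x = 0 for any x in the retained span, retained inputs pass
# through W unchanged after the update.
#   P = null_space_projector(K)
#   W = W - lr * (grad @ P)
```

Whether this actually blocks leakage then hinges on how well the top singular vectors of a finite sample of K cover the true retained subspace, which is exactly the referee's concern.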
Minor comments (2)
- [Abstract / §3.1] The description of 'appropriate regions in the feature space' for target concepts is vague; a figure or equation showing the target vs. retained separation would clarify the CVF objective.
- [§3] Notation for the null-space projection operator and the contrastive loss should be defined consistently before use in equations.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, clarifying aspects of the method and committing to revisions that strengthen the presentation and empirical support.
Point-by-point responses
Referee: [§3 (Method)] The load-bearing assumption (stated in the abstract and §3) that freezing the LLM while fine-tuning only the visual module isolates target visual forgetting without side-effects is not obviously true. Because the LLM was trained on the original visual feature distribution, any shift induced by CVF—even when projected into a null space—can alter input statistics to the frozen weights and degrade non-target visual or textual performance; this requires explicit ablation on retention metrics for both modalities.
Authors: We appreciate the referee's emphasis on this potential issue. The design of CVF with null-space projection is intended to avoid updates that would degrade retained directions, and the manuscript already reports retention results on both non-target visual tasks and textual benchmarks to support that side-effects are limited. To make the isolation explicit, we will add a dedicated ablation subsection and table in the revised §3 and Experiments that directly compares retention metrics (for both modalities) with and without the null-space constraint. revision: yes
Referee: [§3.2] Null-space identification: The claim that the null space of retained-knowledge gradients can be accurately recovered from limited unlearning data is central to preventing leakage, yet high-dimensional visual feature spaces make exact recovery of the orthogonal complement unlikely. Without a concrete procedure (e.g., how many samples, how gradients are aggregated, or error bounds), it is unclear whether the constraint actually blocks updates into retained directions.
Authors: This is a valid request for implementation details. We will expand §3.2 in the revision to specify the exact procedure: the number of retained samples used to compute gradients, the aggregation method (averaging), the numerical technique for recovering the orthogonal complement, and a short analysis of approximation quality via singular-value thresholds; a sketch of such a leakage diagnostic follows these responses. This will allow readers to assess the reliability of the constraint. revision: yes
Referee: [Experiments section] The abstract asserts 'extensive experiments across diverse benchmarks' and a 'strong balance,' but supplies no quantitative metrics, baselines, ablation tables, or implementation details. Without these, the central empirical claim cannot be evaluated and the soundness of the method remains unverifiable.
Authors: We acknowledge that the experimental presentation must be fully self-contained. The manuscript's Experiments section contains quantitative results, baseline comparisons, and ablation studies across the reported benchmarks; however, we will reorganize and expand this section in the revision to include all requested elements—complete metric tables, baseline descriptions, hyperparameter details, and data splits—so that the claims in the abstract are directly verifiable from the text. revision: yes
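The promised singular-value-threshold analysis could be as simple as a held-out leakage measurement: how much retained-knowledge energy passes through the projector as the sample budget and threshold vary. The sketch below reuses the hypothetical `null_space_projector` from the referee-report section; `K_holdout`, `K_train`, and the sweep values are likewise illustrative, not the authors' protocol.

```python
import torch

def retained_leakage(K_holdout: torch.Tensor, P: torch.Tensor) -> float:
    """Fraction of retained-knowledge energy the projector fails to block.

    K_holdout: (M, D) retained features NOT used to build P. Values near 0
    mean projected updates barely move retained directions; values near 1
    mean the estimated null space is unreliable (e.g., too few samples).
    """
    return (torch.linalg.norm(K_holdout @ P) / torch.linalg.norm(K_holdout)).item()

# Sweep the sample budget and energy threshold, reporting leakage:
# for n in (64, 256, 1024):
#     for energy in (0.95, 0.99, 0.999):
#         P = null_space_projector(K_train[:n], energy)
#         print(n, energy, retained_leakage(K_holdout, P))
```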
Circularity Check
Empirical method with no derivation chain or self-referential reductions
Full rationale
The paper describes an empirical unlearning technique: freezing the LLM backbone, fine-tuning only the visual module via Contrastive Visual Forgetting (CVF) to separate target from retained visual knowledge, plus a null-space constraint on retained-knowledge directions. No equations, closed-form derivations, or first-principles predictions appear in the provided abstract or method summary. Claims of balance between forgetting and retention are supported solely by experimental results on benchmarks rather than any reduction of outputs to fitted inputs or self-citations by construction. The approach is presented as a practical algorithm validated externally, with no load-bearing steps that equate to their own inputs.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
- [1] Peng Wang, Shuai Bai, Sinan Tan, Shijie Wang, Zhihao Fan, Jinze Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, et al. Qwen2-VL: Enhancing vision-language model's perception of the world at any resolution. arXiv preprint arXiv:2409.12191, 2024.
- [2] Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. Visual instruction tuning. Advances in Neural Information Processing Systems, 36:34892–34916, 2023.
- [3] Chenxiao Yang, Nathan Srebro, David McAllester, and Zhiyuan Li. Pencil: Long thoughts with short memory. arXiv preprint arXiv:2503.14337, 2025.
- [4] Hao Fei, Yuan Zhou, Juncheng Li, Xiangtai Li, Qingshan Xu, Bobo Li, Shengqiong Wu, Yaoting Wang, Junbao Zhou, Jiahao Meng, et al. On path to multimodal generalist: General-Level and General-Bench. In Forty-second International Conference on Machine Learning, 2025.
- [5] Jiahao Huo, Yibo Yan, Xu Zheng, Yuanhuiyi Lyu, Xin Zou, Zhihua Wei, and Xuming Hu. MMUnlearner: Reformulating multimodal machine unlearning in the era of multimodal large language models. arXiv preprint arXiv:2502.11051, 2025.
- [6] Zheyuan Liu, Guangyao Dou, Zhaoxuan Tan, Yijun Tian, and Meng Jiang. Towards safer large language models through machine unlearning. arXiv preprint arXiv:2402.10058, 2024.
- [7] Alexey Dontsov, Dmitrii Korzh, Alexey Zhavoronkin, Boris Mikheev, Denis Bobkov, Aibek Alanov, Oleg Rogov, Ivan Oseledets, and Elena Tutubalina. CLEAR: Character unlearning in textual and visual modalities. In Findings of the Association for Computational Linguistics: ACL 2025, pages 20582–20603, 2025.
- [8] Wenjie Li, Chi-hua Wang, Guang Cheng, and Qifan Song. International conference on machine learning. Transactions on Machine Learning Research, 2023.
- [9] Yinzhi Cao and Junfeng Yang. Towards making systems forget with machine unlearning. In 2015 IEEE Symposium on Security and Privacy, pages 463–480. IEEE, 2015.
- [10] Varun Gupta, Christopher Jung, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Chris Waites. Adaptive machine unlearning. Advances in Neural Information Processing Systems, 34:16319–16330, 2021.
- [11] Ayush Sekhari, Jayadev Acharya, Gautam Kamath, and Ananda Theertha Suresh. Remember what you want to forget: Algorithms for machine unlearning. Advances in Neural Information Processing Systems, 34:18075–18086, 2021.
- [12] Enayat Ullah, Tung Mai, Anup Rao, Ryan A. Rossi, and Raman Arora. Machine unlearning via algorithmic stability. In Conference on Learning Theory, pages 4126–4142. PMLR, 2021.
- [13] Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, and J. Zico Kolter. TOFU: A task of fictitious unlearning for LLMs. arXiv preprint arXiv:2401.06121, 2024.
- [14] Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. Advances in Neural Information Processing Systems, 36:53728–53741, 2023.
- [15] Ruiqi Zhang, Licong Lin, Yu Bai, and Song Mei. Negative preference optimization: From catastrophic collapse to effective unlearning. arXiv preprint arXiv:2404.05868, 2024.
- [16] Yue Wang, Qizhou Wang, Feng Liu, Wei Huang, Yali Du, Xiaojiang Du, and Bo Han. GRU: Mitigating the trade-off between unlearning and retention for LLMs. arXiv preprint arXiv:2503.09117, 2025.
- [17] Yuhang Wang, Zhenxing Niu, Haoxuan Ji, Guangyu He, Haichang Gao, and Gang Hua. MLLM machine unlearning via visual knowledge distillation. arXiv preprint arXiv:2512.11325, 2025.
- [18] Junfeng Fang, Houcheng Jiang, Kun Wang, Yunshan Ma, Shi Jie, Xiang Wang, Xiangnan He, and Tat-Seng Chua. AlphaEdit: Null-space constrained knowledge editing for language models. arXiv preprint arXiv:2410.02355, 2024.
- [19] Haonan Li, Yixuan Zhang, Fajri Koto, Yifei Yang, Hai Zhao, Yeyun Gong, Nan Duan, and Timothy Baldwin. CMMLU: Measuring massive multitask language understanding in Chinese. In Findings of the Association for Computational Linguistics: ACL 2024, pages 11260–11285, 2024.
- [20] Xueying Du, Mingwei Liu, Kaixin Wang, Hanlin Wang, Junwei Liu, Yixuan Chen, Jiayi Feng, Chaofeng Sha, Xin Peng, and Yiling Lou. Evaluating large language models in class-level code generation. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, pages 1–13, 2024.
- [21] Anvith Thudi, Gabriel Deza, Varun Chandrasekaran, and Nicolas Papernot. Unrolling SGD: Understanding factors influencing machine unlearning. In 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), pages 303–319. IEEE, 2022.
- [22] Xiusheng Huang, Yequan Wang, Jun Zhao, and Kang Liu. Commonsense knowledge editing based on free-text in LLMs. arXiv preprint arXiv:2410.23844, 2024.
- [23] Zeping Yu and Sophia Ananiadou. Understanding multimodal LLMs: The mechanistic interpretability of LLaVA in visual question answering. arXiv preprint arXiv:2411.10950, 2024.
- [24] Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018.
- [25] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, et al. LoRA: Low-rank adaptation of large language models. ICLR, 1(2):3, 2022.
- [26] Zheyuan Liu, Guangyao Dou, Mengzhao Jia, Zhaoxuan Tan, Qingkai Zeng, Yongle Yuan, and Meng Jiang. Protecting privacy in multimodal large language models with MLLMU-Bench. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages ..., 2025.
- [27] Chengye Wang, Yuyuan Li, XiaoHua Feng, Chaochao Chen, Xiaolin Zheng, and Jianwei Yin. UMU-Bench: Closing the modality gap in multimodal unlearning evaluation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2025.
- [28] Bo Liu, Qiang Liu, and Peter Stone. Continual learning and private unlearning. In Conference on Lifelong Learning Agents, pages 243–254. PMLR, 2022.
- [29] Quoc Phong Nguyen, Bryan Kian Hsiang Low, and Patrick Jaillet. Variational Bayesian unlearning. Advances in Neural Information Processing Systems, 33:16025–16036, 2020.
- [30] Zheyuan Liu, Guangyao Dou, Xiangchi Yuan, Chunhui Zhang, Zhaoxuan Tan, and Meng Jiang. Modality-aware neuron pruning for unlearning in multimodal large language models. arXiv preprint arXiv:2502.15910, 2025.
- [31] Chin-Yew Lin. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pages 74–81, 2004.
- [32] Jiahao Huo, Yibo Yan, Boren Hu, Yutao Yue, and Xuming Hu. MMNeuron: Discovering neuron-level domain-specific interpretation in multimodal large language model. arXiv preprint arXiv:2406.11193, 2024.
- [33] Andrei Ioan Muresanu, Anvith Thudi, Michael R. Zhang, and Nicolas Papernot. Fast exact unlearning for in-context learning data for LLMs. In Forty-second International Conference on Machine Learning, 2025.
- [34] Chris Liu, Yaxuan Wang, Jeffrey Flanigan, and Yang Liu. Large language model unlearning via embedding-corrupted prompts. Advances in Neural Information Processing Systems, 37:118198–118266, 2024.
- [35] Ido Cohen, Daniela Gottesman, Mor Geva, and Raja Giryes. Performance gap in entity knowledge extraction across modalities in vision language models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 29095–29108, 2025.
- [36] Jiaqi Li, Qianshan Wei, Chuanyi Zhang, Guilin Qi, Miaozeng Du, Yongrui Chen, Sheng Bi, and Fan Liu. Single image unlearning: Efficient machine unlearning in multimodal large language models. Advances in Neural Information Processing Systems, 37:35414–35453, 2024.
- [37] Jiahui Geng and Qing Li. SAUCE: Selective concept unlearning in vision-language models with sparse autoencoders. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3023–3033, 2025.
- [38] Chongyu Fan, Jiancheng Liu, Yihua Zhang, Eric Wong, Dennis Wei, and Sijia Liu. SalUn: Empowering machine unlearning via gradient-based weight saliency in both image classification and generation. arXiv preprint arXiv:2310.12508, 2023.