Machine Unlearning on Pre-trained Models by Residual Feature Alignment Using LoRA

Laiqiao Qin; Linlin Wang; Tianqing Zhu; Wanlei Zhou

arxiv: 2411.08443 · v2 · submitted 2024-11-13 · 💻 cs.LG · cs.CV


Pith reviewed 2026-05-23 17:28 UTC · model grok-4.3

classification 💻 cs.LG cs.CV

keywords machine unlearning · pre-trained models · LoRA · residual features · feature alignment · fine-tuning · privacy

The pith

LoRA lets pre-trained models unlearn specific data by aligning residual features at intermediate layers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Residual Feature Alignment Unlearning, which applies LoRA to split a model's intermediate features into their original pre-trained components and adjustable residuals. Residuals are driven to zero on data the model should keep and shifted on data to forget, so the unlearned model stays aligned with the original at the feature level. This targets the problems of high cost from full-parameter fine-tuning and unwanted feature drift that hurt utility on retained data. A reader would care because it offers an efficient route to remove private or harmful examples from large models while trying to preserve overall behavior.

Core claim

The central claim is that leveraging LoRA to decompose the model's intermediate features into pre-trained features and residual features, then adjusting those residuals, aligns the unlearned model with the pre-trained model at the intermediate feature level, thereby achieving both the unlearning target on the specified subset and retention of performance on the remaining data.

What carries the argument

LoRA decomposition of intermediate features into pre-trained and residual parts, with residuals set to zero on the retained set and shifted on the unlearning set.
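
As a rough illustration of this decomposition, the sketch below writes one LoRA-adapted linear layer so that its output is the sum of a frozen pre-trained feature and a low-rank residual. It is a minimal NumPy sketch, not the authors' implementation: the layer sizes, the names `W0`, `A`, `B` and the helper `decompose` are assumptions made for illustration, and the zero initialisation of `B` follows the standard LoRA recipe rather than anything stated in the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumed, not taken from the paper).
d_in, d_out, rank = 8, 6, 2

W0 = rng.normal(size=(d_out, d_in))  # frozen pre-trained weight
A = rng.normal(size=(rank, d_in))    # LoRA down-projection (trainable)
B = np.zeros((d_out, rank))          # LoRA up-projection, zero-initialised


def decompose(x, W0, A, B):
    """Split the adapted layer output for a batch x of shape (n, d_in).

    Returns (pre-trained feature, residual feature), each of shape (n, d_out).
    """
    pretrained = x @ W0.T
    residual = x @ A.T @ B.T
    return pretrained, residual


x = rng.normal(size=(4, d_in))
pre, res = decompose(x, W0, A, B)
adapted = pre + res  # what the adapted layer actually outputs

# With B = 0 the residual vanishes, so the adapted layer matches the pre-trained one.
assert np.allclose(res, 0.0)
assert np.allclose(adapted, pre)

# Once B is non-zero the residual is non-zero, but its rank is bounded by `rank`.
B_trained = rng.normal(size=(d_out, rank))
_, res_trained = decompose(x, W0, A, B_trained)
print("residual rank:", np.linalg.matrix_rank(res_trained))
```

Because `W0` is never updated, any drift in the layer's output is carried entirely by the residual term, which is why "residual equals zero" is a convenient way to state agreement with the pre-trained model at that layer.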

If this is right

Unlearning becomes feasible on pre-trained models without retraining or fine-tuning all parameters.
Intermediate-layer feature distributions remain close to the original model on retained data.
The same LoRA-based alignment can be applied across multiple datasets and model architectures.
Unlearning and retention objectives are met simultaneously through the choice of residual targets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approach may extend to other low-rank adaptation methods if they can isolate similar residual components.
It could reduce the need for full model retraining in privacy-regulated settings where only a few examples must be forgotten.
If the residual shifts prove stable, the method might support incremental unlearning of additional batches over time.

Load-bearing premise

Driving residuals to zero on the retained set and to shifted values on the unlearning set via LoRA removes the influence of the unlearning data without unintended shifts in behavior on retained data or downstream tasks.
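
The premise can be written down as a two-term objective on the residual. The sketch below is one plausible form, assuming a squared-error penalty: the exact loss used in the paper is not shown in the text available here, and `target_shift`, `forget_weight` and the toy data are hypothetical. The toy inputs are built so that the forget samples carry a direction the retained samples lack, which lets a hand-built rank-1 adapter meet both targets exactly.

```python
import numpy as np


def residual(x, A, B):
    """Low-rank residual r(x) = B A x for a batch x of shape (n, d_in)."""
    return x @ A.T @ B.T


def retain_loss(x_retain, A, B):
    """Mean squared norm of the residual on retained samples (target: zero)."""
    r = residual(x_retain, A, B)
    return float(np.mean(np.sum(r ** 2, axis=1)))


def forget_loss(x_forget, A, B, target_shift):
    """Mean squared distance between the residual and a chosen shift on forget samples."""
    r = residual(x_forget, A, B)
    return float(np.mean(np.sum((r - target_shift) ** 2, axis=1)))


def total_loss(x_retain, x_forget, A, B, target_shift, forget_weight=1.0):
    """Assumed combined objective: retain term plus a weighted forget term."""
    return retain_loss(x_retain, A, B) + forget_weight * forget_loss(
        x_forget, A, B, target_shift
    )


rng = np.random.default_rng(0)
d_in, d_out = 4, 3

# Toy data: retained inputs have last coordinate 0, forget inputs have it set to 1.
x_retain = np.hstack([rng.normal(size=(5, d_in - 1)), np.zeros((5, 1))])
x_forget = np.hstack([rng.normal(size=(3, d_in - 1)), np.ones((3, 1))])
target_shift = np.array([1.0, -2.0, 0.5])  # hypothetical shift for forget samples

# Untrained adapter: B = 0, so the residual is zero everywhere.
A0 = rng.normal(size=(1, d_in))
B0 = np.zeros((d_out, 1))
loss_init = total_loss(x_retain, x_forget, A0, B0, target_shift)

# Hand-built rank-1 adapter: A reads only the last coordinate, B writes the shift.
A_star = np.array([[0.0, 0.0, 0.0, 1.0]])
B_star = target_shift.reshape(d_out, 1)
loss_star = total_loss(x_retain, x_forget, A_star, B_star, target_shift)

print("initial loss:", loss_init)
print("loss with hand-built adapter:", loss_star)
```

In this toy case both terms reach zero only because the two sets differ along a direction the adapter can isolate. Whether real retained and forget features separate this cleanly is the open question the premise rests on.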

What would settle it

An experiment in which the unlearned model still shows high accuracy or memorization on the unlearning set after residual adjustment, or in which performance on the retained set drops below the pre-trained baseline, would falsify the alignment claim.

Figures

Figures reproduced from arXiv: 2411.08443 by Laiqiao Qin, Linlin Wang, Tianqing Zhu, Wanlei Zhou.

**Figure 2.** The unlearning process of residual feature alignment. During training, …

**Figure 3.** (a) and (b) illustrate the impact on features in …

**Figure 4.** To simplify code implementation, a teacher-student network architec…

**Figure 5.** The impact of γ on accuracy and feature distance. (a) and (b) show the effect of different γ values on accuracy, while (c) and (d) show the effect of different γ values on feature distance. Among these, (a) and (c) represent sample unlearning, and (b) and (d) represent class unlearning. However, both losses have a similar influence on model performance, likely because the intermediate layer’s loss reflect…

Original abstract

Machine unlearning is an emerging technology that removes a subset of the training data from a trained model without significantly affecting the model performance on the remaining data. This topic is becoming increasingly important in protecting user privacy and eliminating harmful or outdated data. The key challenge lies in effectively and efficiently unlearning specific information without compromising the model's utility on the retained data. For pre-trained models, fine-tuning is an important way to achieve the unlearning target. Previous work typically fine-tuned the entire model's parameters, which incurred significant computational costs. In addition, the fine-tuning process may cause shifts in the intermediate layer features, affecting the model's overall utility. In this work, we propose a novel and efficient machine unlearning method for pre-trained models. We term the method Residual Feature Alignment Unlearning. Specifically, we leverage LoRA (Low-Rank Adaptation) to decompose the model's intermediate features into pre-trained features and residual features. By adjusting the residual features, we align the unlearned model with the pre-trained model at the intermediate feature level to achieve both unlearning and remaining targets. The method aims to learn zero residuals on the retained set and shifted residuals on the unlearning set. Extensive experiments on numerous datasets validate the effectiveness of our approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper tries a LoRA-based residual alignment to unlearn by zeroing residual features on retained data and shifting them on the forget set, but the abstract gives no numbers or checks on whether this actually works without side effects.

The main point is that this work uses LoRA to split intermediate features into a fixed pre-trained part and a learnable residual, then trains the residual to hit zero on retained examples while hitting some target shift on the unlearning set. That is the claimed way to get unlearning without touching the whole model or losing utility on the rest of the data. It is new in framing the unlearning objective exactly this way around feature residuals rather than parameter or output changes.

The efficiency angle is reasonable: LoRA keeps the update cheap for large pre-trained models, and focusing on intermediate layers is a sensible attempt to limit unintended drift. The abstract says experiments on many datasets back it up, which at least shows the authors ran the usual checks. That is the credit due.

The soft spot is that nothing in the provided text shows the actual metrics, baselines, or ablations, so the central claim stays untested here. The stress-test concern lands: a low-rank residual has to be driven exactly to zero across the retained distribution while taking non-zero values on the unlearning distribution, and the paper gives no derivation or evidence that the chosen targets produce real forgetting (membership inference resistance, for example) or that the low-rank form does not force unwanted coupling between the two sets. If the optimization on finite samples fails to generalize or if the shift on the unlearning set leaks into retained behavior, the method collapses.

This is aimed at the machine unlearning subgroup working on pre-trained models. A reader already in that area might pick up the LoRA trick and try it, but the lack of concrete results makes it hard to know the practical value. It deserves a serious referee to look at the full experiments and the actual unlearning strength.

Referee Report

3 major / 2 minor

Summary. The paper proposes Residual Feature Alignment Unlearning (RFAU), a method for machine unlearning on pre-trained models. It employs LoRA to decompose intermediate-layer features into pre-trained components and low-rank residual components. The LoRA parameters are optimized so that residuals are driven to zero on the retained dataset (preserving alignment with the original pre-trained model) and to non-zero target shifts on the unlearning dataset, with the goal of removing the influence of the unlearning data while maintaining model utility on retained data and downstream tasks. The abstract states that extensive experiments on multiple datasets validate the approach.

Significance. If the central assumption holds—that a low-rank residual adapter can be optimized to exactly zero residuals on the full retained distribution while producing the required shifts on the unlearning set without collateral effects on generalization or downstream performance—the method would offer a computationally efficient alternative to full fine-tuning for unlearning in large pre-trained models. The feature-level alignment strategy is a distinct angle relative to existing parameter- or output-space unlearning techniques.

major comments (3)

[Abstract / Method description] The central claim rests on the existence of a low-rank residual function r(x) such that r(x) = 0 for all x drawn from the retained distribution while r(x) equals a chosen non-zero target for x in the unlearning set. No derivation, existence proof, or capacity analysis is supplied showing that the low-rank constraint permits this separation without forcing r(x) near zero on the unlearning set or introducing unintended shifts on retained data.
[Abstract] The abstract asserts that the method achieves both unlearning and retention targets, yet supplies no quantitative results (accuracy, membership-inference attack success rates, or downstream task metrics), no baselines, and no ablation on the choice of residual targets or LoRA rank. Without these, it is impossible to verify that the optimization on finite samples generalizes to the retained distribution as required.
[Abstract / implied in method] The weakest assumption—that driving residuals to zero on retained samples via LoRA will not alter the model's behavior on retained data or downstream tasks—is stated but not accompanied by any analysis of the optimization landscape or generalization gap induced by the low-rank adapter.
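
One way to make the capacity concern in these comments concrete is to look at a single LoRA branch in isolation. The derivation below assumes the residual is a purely linear function of that layer's input feature $h$, with $\mathcal{H}_r$ denoting the set of retained-sample inputs to the layer and $h_f$ one forget-sample input. These symbols, and the single-layer linear simplification, are assumptions introduced here for illustration; the paper's multi-layer setting is nonlinear in the raw input and may behave differently.

```latex
\begin{align*}
r(h) &= B A h, \qquad
  B \in \mathbb{R}^{d_{\text{out}} \times k},\;
  A \in \mathbb{R}^{k \times d_{\text{in}}},\;
  k \ll \min(d_{\text{in}}, d_{\text{out}}) \\[4pt]
r(h) = 0 \;\; \forall h \in \mathcal{H}_r
  &\iff \operatorname{span}(\mathcal{H}_r) \subseteq \ker(BA) \\[4pt]
h_f = h_{\parallel} + h_{\perp},\;\;
  h_{\parallel} \in \operatorname{span}(\mathcal{H}_r),\;\;
  h_{\perp} \perp \operatorname{span}(\mathcal{H}_r)
  &\implies r(h_f) = B A\, h_{\perp} \\[4pt]
\operatorname{rank}(BA) \le k
  &\implies \dim \ker(BA) \ge d_{\text{in}} - k
\end{align*}
```

Under this simplification, an exactly zero residual on the retained set and a non-zero shift on a forget sample are compatible only if that sample has a component $h_{\perp}$ outside the span of the retained inputs. If the retained inputs span all of $\mathbb{R}^{d_{\text{in}}}$, then $BA = 0$ and the residual vanishes everywhere. The last line shows the low-rank form is not the obstacle to zeroing: the kernel is large by construction. The obstacle is whether forget features leave the retained span at all, which is the kind of analysis the comments ask the authors to supply.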

minor comments (2)

Notation for the residual target values on the unlearning set and the precise loss terms used to enforce zero versus shifted residuals should be defined explicitly with equations.
The abstract refers to 'numerous datasets' but does not name them or indicate the model architectures used; this information belongs in the abstract or a dedicated experimental-setup paragraph.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below, agreeing where revisions are needed to strengthen the presentation while defending the empirical contributions of the work.

Referee: [Abstract / Method description] The central claim rests on the existence of a low-rank residual function r(x) such that r(x) = 0 for all x drawn from the retained distribution while r(x) equals a chosen non-zero target for x in the unlearning set. No derivation, existence proof, or capacity analysis is supplied showing that the low-rank constraint permits this separation without forcing r(x) near zero on the unlearning set or introducing unintended shifts on retained data.

Authors: We acknowledge the absence of a formal existence proof or capacity analysis for the low-rank residual function. The method is primarily empirical, relying on LoRA's demonstrated ability to capture task-specific adaptations in prior work. In revision we will add a dedicated discussion subsection citing LoRA approximation results from the literature and presenting empirical evidence from our optimization that the low-rank constraint achieves the desired separation on the evaluated distributions without forcing residuals near zero on the unlearning set.

revision: yes

Referee: [Abstract] The abstract asserts that the method achieves both unlearning and retention targets, yet supplies no quantitative results (accuracy, membership-inference attack success rates, or downstream task metrics), no baselines, and no ablation on the choice of residual targets or LoRA rank. Without these, it is impossible to verify that the optimization on finite samples generalizes to the retained distribution as required.

Authors: The abstract prioritizes conciseness while the full paper contains quantitative results, baselines, and ablations in the experiments section. To directly address the concern we will revise the abstract to include a small number of key metrics (e.g., unlearning effectiveness via MIA success rate reduction and retained-data accuracy) along with a brief mention of the LoRA rank used.

revision: yes

Referee: [Abstract / implied in method] The weakest assumption, that driving residuals to zero on retained samples via LoRA will not alter the model's behavior on retained data or downstream tasks, is stated but not accompanied by any analysis of the optimization landscape or generalization gap induced by the low-rank adapter.

Authors: We agree that an explicit analysis of the optimization landscape and generalization implications would improve clarity. In the revised manuscript we will expand the method section with a short discussion of the objective function, why the low-rank updates are localized, and supporting evidence from downstream-task performance that the generalization gap remains small.

revision: yes

Circularity Check

0 steps flagged

No circularity: method defined by explicit objectives and validated empirically, not by self-referential reduction.

full rationale

The paper proposes Residual Feature Alignment Unlearning via LoRA, with the explicit goal of learning zero residuals on retained data and shifted residuals on unlearning data to align intermediate features. No equations, derivations, or self-citations appear in the provided text that reduce the unlearning claim to a quantity fitted or defined by the method itself. The central premise is an engineering assumption about low-rank residuals, supported by experimental validation rather than mathematical self-definition or imported uniqueness results. This is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are stated in the provided text.


[1] [1]

Kga: A general machine unlearning framework based on knowledge gap alignment,

L. Wang, T. Chen, W. Yuan, X. Zeng, K.-F. Wong, and H. Yin, “Kga: A general machine unlearning framework based on knowledge gap alignment,” inProceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , 2023, pp. 13 264–13 276

work page 2023

[2] [2]

A survey of machine unlearning

T. T. Nguyen, T. T. Huynh, P. L. Nguyen, A. W.-C. Liew, H. Yin, and Q. V . H. Nguyen, “A survey of machine unlearning,” arXiv preprint arXiv:2209.02299, 2022

work page arXiv 2022

[3] [3]

Machine unlearning: A survey,

H. Xu, T. Zhu, L. Zhang, W. Zhou, and P. S. Yu, “Machine unlearning: A survey,” ACM Comput. Surv. , vol. 56, no. 1, pp. 9:1–9:36, 2024. [Online]. Available: https://doi.org/10.1145/3603620

work page doi:10.1145/3603620 2024

[4] [4]

Machine unlearning,

L. Bourtoule, V . Chandrasekaran, C. A. Choquette-Choo, H. Jia, A. Travers, B. Zhang, D. Lie, and N. Papernot, “Machine unlearning,” in 2021 IEEE Symposium on Security and Privacy (SP) . IEEE, 2021, pp. 141–159

work page 2021

[5] [5]

Erm-ktp: Knowledge-level machine unlearning via knowledge transfer,

S. Lin, X. Zhang, C. Chen, X. Chen, and W. Susilo, “Erm-ktp: Knowledge-level machine unlearning via knowledge transfer,” in Pro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 20 147–20 155

work page 2023

[6] [6]

Cer- tified data removal from machine learning models,

C. Guo, T. Goldstein, A. Hannun, and L. Van Der Maaten, “Cer- tified data removal from machine learning models,” arXiv preprint arXiv:1911.03030, 2019

work page arXiv 1911

[7] [7]

LoRA: Low-Rank Adaptation of Large Language Models

E. J. Hu, Y . Shen, P. Wallis, Z. Allen-Zhu, Y . Li, S. Wang, L. Wang, and W. Chen, “Lora: Low-rank adaptation of large language models,” arXiv preprint arXiv:2106.09685 , 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021

[8] [8]

Towards making systems forget with machine unlearning,

Y . Cao and J. Yang, “Towards making systems forget with machine unlearning,” in 2015 IEEE symposium on security and privacy . IEEE, 2015, pp. 463–480. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 15

work page 2015

[9] [9]

Remember what you want to forget: Algorithms for machine unlearning,

A. Sekhari, J. Acharya, G. Kamath, and A. T. Suresh, “Remember what you want to forget: Algorithms for machine unlearning,” Advances in Neural Information Processing Systems , vol. 34, pp. 18 075–18 086, 2021

work page 2021

[10] [10]

Differential privacy,

C. Dwork, “Differential privacy,” in International colloquium on au- tomata, languages, and programming . Springer, 2006, pp. 1–12

work page 2006

[11] [11]

Making ai forget you: Data deletion in machine learning,

A. Ginart, M. Guan, G. Valiant, and J. Y . Zou, “Making ai forget you: Data deletion in machine learning,” Advances in neural information processing systems, vol. 32, 2019

work page 2019

[12] [12]

Unrolling sgd: Understanding factors influencing machine unlearning,

A. Thudi, G. Deza, V . Chandrasekaran, and N. Papernot, “Unrolling sgd: Understanding factors influencing machine unlearning,” in 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P) . IEEE, 2022, pp. 303–319

work page 2022

[13] [13]

Approximate data deletion from machine learning models,

Z. Izzo, M. A. Smart, K. Chaudhuri, and J. Zou, “Approximate data deletion from machine learning models,” in International Conference on Artificial Intelligence and Statistics . PMLR, 2021, pp. 2008–2016

work page 2021

[14] [14]

Descent-to-delete: Gradient-based methods for machine unlearning,

S. Neel, A. Roth, and S. Sharifi-Malvajerdi, “Descent-to-delete: Gradient-based methods for machine unlearning,” in Algorithmic Learn- ing Theory. PMLR, 2021, pp. 931–962

work page 2021

[15] L. Graves, V. Nagisetty, and V. Ganesh, “Amnesiac machine learning,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 13, 2021, pp. 11516–11524.

[16] A. Golatkar, A. Achille, and S. Soatto, “Eternal sunshine of the spotless net: Selective forgetting in deep networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 9304–9312.

[17] A. Warnecke, L. Pirch, C. Wressnegger, and K. Rieck, “Machine unlearning of features and labels,” arXiv preprint arXiv:2108.11577, 2021.

[18] A. Thudi, H. Jia, I. Shumailov, and N. Papernot, “On the necessity of auditable algorithmic definitions for machine unlearning,” in 31st USENIX Security Symposium (USENIX Security 22), 2022, pp. 4007–4022.

[19] A. K. Tarun, V. S. Chundawat, M. Mandal, and M. Kankanhalli, “Fast yet effective machine unlearning,” IEEE Transactions on Neural Networks and Learning Systems, 2023.

[20] V. S. Chundawat, A. K. Tarun, M. Mandal, and M. Kankanhalli, “Zero-shot machine unlearning,” IEEE Transactions on Information Forensics and Security, 2023.

[21] Y. Yoon, J. Nam, H. Yun, J. Lee, D. Kim, and J. Ok, “Few-shot unlearning by model inversion,” arXiv preprint arXiv:2205.15567, 2022.

[22] S. Cha, S. Cho, D. Hwang, H. Lee, T. Moon, and M. Lee, “Learning to unlearn: Instance-wise unlearning for pre-trained classifiers,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 10, 2024, pp. 11186–11194.

[23] A. Golatkar, A. Achille, A. Ravichandran, M. Polito, and S. Soatto, “Mixed-privacy forgetting in deep networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 792–801.

[24] R. Mehta, S. Pal, V. Singh, and S. N. Ravi, “Deep unlearning via randomized conditionally independent Hessians,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10422–10431.

[25] J. Kim and S. S. Woo, “Efficient two-stage model retraining for machine unlearning,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 4361–4369.

[26] H. Yan, X. Li, Z. Guo, H. Li, F. Li, and X. Lin, “ARCANE: an efficient architecture for exact machine unlearning,” in Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23-29 July 2022, L. D. Raedt, Ed. ijcai.org, 2022, pp. 4006–4013. [Online]. Available: https://doi.org/10.24963/ijcai.2022/556

[27] P. W. Koh and P. Liang, “Understanding black-box predictions via influence functions,” in International Conference on Machine Learning. PMLR, 2017, pp. 1885–1894.

[28] V. S. Chundawat, A. K. Tarun, M. Mandal, and M. Kankanhalli, “Can bad teaching induce forgetting? Unlearning in deep networks using an incompetent teacher,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 6, 2023, pp. 7210–7217.

[29] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, “FitNets: Hints for thin deep nets,” Proc. ICLR, vol. 2, no. 3, p. 1, 2015.

[30] D. P. Kingma and M. Welling, “Auto-encoding variational Bayes,” arXiv preprint arXiv:1312.6114, 2013.

[31] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684–10695.

[32] M. Kurmanji, P. Triantafillou, J. Hayes, and E. Triantafillou, “Towards unbounded machine unlearning,” Advances in Neural Information Processing Systems, vol. 36, 2024.

[33] J. Gou, B. Yu, S. J. Maybank, and D. Tao, “Knowledge distillation: A survey,” International Journal of Computer Vision, vol. 129, no. 6, pp. 1789–1819, 2021.

[34] A. Krizhevsky, “Learning multiple layers of features from tiny images,” Master’s thesis, University of Toronto, 2009.

[35] H. Xiao, K. Rasul, and R. Vollgraf, “Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms,” arXiv preprint arXiv:1708.07747, 2017.

[36] A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon, USA: Association for Computational Linguistics, June 2011, pp. 142–150. [Online]. Available: http:...

[37] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.

[38] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.

[39] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, “ImageNet Large Scale Visual Recognition Challenge,” International Journal of Computer Vision (IJCV), vol. 115, no. 3, pp. 211–252, 2015.

[40] V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter,” ArXiv, vol. abs/1910.01108, 2019.

[41] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.

[42] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever et al., “Language models are unsupervised multitask learners,” OpenAI blog, vol. 1, no. 8, p. 9, 2019.

[43] M. Chen, W. Gao, G. Liu, K. Peng, and C. Wang, “Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 7766–7775.

[44] R. M. French, “Catastrophic forgetting in connectionist networks,” Trends in Cognitive Sciences, vol. 3, no. 4, pp. 128–135, 1999.

Laiqiao Qin is a Ph.D. candidate at City University of Macau, Macao SAR, China. He received his M.Eng. degree from the Faculty of Data Science, City University of Macau. His research interests include AI security and privacy...