pith. machine review for the scientific record.

arxiv: 2604.12686 · v1 · submitted 2026-04-14 · 💻 cs.LG · cs.AI

Recognition: unknown

BID-LoRA: A Parameter-Efficient Framework for Continual Learning and Unlearning

Pith reviewed 2026-05-10 15:03 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords continual learning · machine unlearning · parameter-efficient fine-tuning · LoRA adapters · knowledge leakage · CLU · adapter pathways · face recognition

The pith

BID-LoRA uses three separate low-rank adapter paths plus escape unlearning to add new knowledge and delete old knowledge while limiting leakage across cycles.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies that naively pairing continual learning methods with machine unlearning methods produces gradual knowledge leakage that degrades retained performance over repeated adaptation cycles. It formalizes a unified Continual Learning Unlearning paradigm whose three goals are precise deletion of unwanted data, efficient addition of new data without harming prior knowledge, and reduction of leakage. BID-LoRA realizes this by attaching three dedicated adapter pathways (retain, new, and unlearn) to attention layers and applying escape unlearning that pushes forget-class embeddings maximally far from retained knowledge. Only 5 percent of parameters are updated. Experiments on CIFAR-100 and a face-recognition subset show the approach maintains accuracy better than existing CLU baselines across multiple enrollment and withdrawal cycles.

Core claim

BID-LoRA is a parameter-efficient framework that applies three dedicated low-rank adapter pathways (retain, new, and unlearn) to attention layers together with escape unlearning that repositions forget-class embeddings at maximum distance from retained knowledge, thereby satisfying the three CLU goals of precise deletion, efficient integration, and minimal leakage while updating only 5 percent of parameters.

What carries the argument

Three dedicated adapter pathways (retain, new, unlearn) combined with escape unlearning that maximizes the distance of forget-class embeddings from retained knowledge.
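The pathway idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each pathway is a standard low-rank update (a product B·A added to a frozen weight W) and that merging at inference means adding all three products into W. The helper names `matmul`, `add`, `forward_pathways`, and `merge` are hypothetical, and plain Python lists stand in for tensors.

```python
# Minimal sketch of three low-rank pathways on one frozen projection.
# Assumption: each pathway is a plain low-rank update delta_W = B @ A.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    """Element-wise sum of two equally shaped matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def forward_pathways(W, pathways, x):
    """Frozen projection plus one low-rank correction per pathway."""
    out = matmul(W, x)
    for B, A in pathways.values():
        out = add(out, matmul(B, matmul(A, x)))
    return out

def merge(W, pathways):
    """Fold every pathway into a single weight matrix for inference."""
    merged = W
    for B, A in pathways.values():
        merged = add(merged, matmul(B, A))
    return merged

# Frozen 2x2 weight and three rank-1 pathways (B is 2x1, A is 1x2).
W = [[1.0, 0.0], [0.0, 1.0]]
pathways = {
    "retain":  ([[1.0], [0.0]], [[0.5, 0.0]]),
    "new":     ([[0.0], [1.0]], [[0.0, 0.25]]),
    "unlearn": ([[1.0], [1.0]], [[-0.25, 0.0]]),
}
x = [[2.0], [4.0]]  # one input as a column vector

separate = forward_pathways(W, pathways, x)
merged_out = matmul(merge(W, pathways), x)
print(separate, merged_out)
```

Under these assumptions the separate and merged forms give the same output, which is why keeping pathways apart during training need not add inference cost.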

If this is right

  • Multiple cycles of adding and removing knowledge can be performed without progressive degradation of retained performance.
  • Real-world identity systems can enroll new users and remove withdrawn users while updating only 5 percent of parameters.
  • Unified CLU training reduces reliance on separate continual-learning and machine-unlearning pipelines.
  • The same architecture applies to both image classification and face-recognition tasks without architecture changes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The escape-unlearning distance mechanism could be tested on language models to forget specific training examples while acquiring new capabilities.
  • Repeated-cycle stability might allow deployment in privacy-regulated environments where user data must be added and deleted on demand.
  • The three-pathway design could be combined with other low-rank methods to handle larger base models.

Load-bearing premise

The three separate adapter pathways and escape unlearning together will block knowledge leakage and preserve retained-task accuracy without new instabilities or dataset-specific retuning.

What would settle it

If accuracy on retained classes declines or forgotten data can be reconstructed after several adaptation cycles on CIFAR-100 or CASIA-Face100, the leakage-prevention claim is falsified.
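One way to make the accuracy half of this test concrete is to track retained-class accuracy after every cycle and flag a cumulative drop. The sketch below is a hypothetical check, not the paper's leakage protocol: the function names, the 0.02 tolerance, and the accuracy values are all illustrative assumptions, and it does not cover the reconstruction half of the criterion.

```python
def retain_drop(retain_acc):
    """Drop in retained-class accuracy from the first to the last cycle."""
    return retain_acc[0] - retain_acc[-1]

def leaks(retain_acc, tolerance=0.02):
    """Flag leakage when the cumulative drop exceeds the tolerance."""
    return retain_drop(retain_acc) > tolerance

# Illustrative numbers only; they are not results from the paper.
stable = [0.90, 0.90, 0.89, 0.89]
leaky = [0.90, 0.86, 0.81, 0.75]

print("stable run leaks:", leaks(stable))
print("leaky run leaks:", leaks(leaky))
```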

Figures

Figures reproduced from arXiv: 2604.12686 by Amit Shukla, Jagadeesh Rachapudi, Praful Hambarde, Ritali Vatsi.

Figure 1. Overview of CLU. The CLU system removes unwanted knowledge …
Figure 2. LoRA placement in BID-LoRA at Attention Modules
Figure 3. Pathway separation for BID-LoRA. This separation provides two benefits: (1) drift in one adapter does not corrupt others, and (2) each adapter trains on its specific objective without gradient interference. Adapters merge at inference, preserving efficiency. We empirically validate pathway specialization in Section V-D4.
Figure 4. Illustration of continual adapting evaluation protocol.
Figure 5. Radar plot comparison at Task-6. BID-LoRA consistently outperforms …
Figure 6. Geometric verification of unlearning. Top row: t-SNE visualization …
Figure 7. Knowledge leakage in CL+MU combinations. Retain accuracy …
Original abstract

Recent advances in deep learning underscore the need for systems that can not only acquire new knowledge through Continual Learning (CL) but also remove outdated, sensitive, or private information through Machine Unlearning (MU). However, while CL methods are well-developed, MU techniques remain in early stages, creating a critical gap for unified frameworks that depend on both capabilities. We find that naively combining existing CL and MU approaches results in knowledge leakage, a gradual degradation of foundational knowledge across repeated adaptation cycles. To address this, we formalize Continual Learning Unlearning (CLU) as a unified paradigm with three key goals: (i) precise deletion of unwanted knowledge, (ii) efficient integration of new knowledge while preserving prior information, and (iii) minimizing knowledge leakage across cycles. We propose Bi-Directional Low-Rank Adaptation (BID-LoRA), a novel framework featuring three dedicated adapter pathways (retain, new, and unlearn) applied to attention layers, combined with escape unlearning that pushes forget-class embeddings to positions maximally distant from retained knowledge, updating only 5% of parameters. Experiments on CIFAR-100 show that BID-LoRA outperforms CLU baselines across multiple adaptation cycles. We further evaluate on CASIA-Face100, a curated face recognition subset, demonstrating practical applicability to real-world identity management systems where new users must be enrolled and withdrawn users removed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes BID-LoRA as a parameter-efficient framework for the unified Continual Learning Unlearning (CLU) paradigm. It introduces three dedicated low-rank adapter pathways (retain, new, and unlearn) applied to attention layers, combined with an escape unlearning strategy that pushes forget-class embeddings maximally distant from retained knowledge while updating only 5% of parameters. The central claims are that this prevents knowledge leakage across repeated adaptation cycles and outperforms existing CLU baselines on CIFAR-100, with additional evaluation on CASIA-Face100 demonstrating applicability to real-world identity management.

Significance. If the empirical results and stability claims hold, the work could be significant for addressing the gap between continual learning and machine unlearning in a single efficient framework. The 5% parameter update budget and explicit handling of leakage via dedicated pathways and embedding distancing offer a practical advance for privacy-sensitive continual adaptation tasks, such as enrolling and removing users in face recognition systems. The architectural separation of pathways is a clear strength if it can be shown to avoid the instabilities noted in naive combinations.

major comments (2)
  1. [Abstract] The claim that 'Experiments on CIFAR-100 show that BID-LoRA outperforms CLU baselines across multiple adaptation cycles' supplies no quantitative metrics, baseline names, error bars, leakage measurement protocol, or cycle counts. Because this claim is load-bearing for the paper's contribution, the absence leaves the central empirical result unverifiable from the abstract.
  2. [Method (escape unlearning)] Escape unlearning description: The strategy of maximizing embedding distance for forget classes assumes the retain and new adapters maintain stable decision boundaries without interference or gradient conflicts in a continual regime. No analysis of cross-pathway interactions, retained-class embedding distances, or boundary shifts under the 5% update constraint (applied only to attention layers) is provided, which directly undermines the no-leakage and stability guarantees.
minor comments (1)
  1. [Introduction] The three CLU goals are listed but could be more explicitly tied to the architectural choices in the introduction for improved clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will incorporate revisions to improve clarity and completeness.

Point-by-point responses
  1. Referee: [Abstract] The claim that 'Experiments on CIFAR-100 show that BID-LoRA outperforms CLU baselines across multiple adaptation cycles' supplies no quantitative metrics, baseline names, error bars, leakage measurement protocol, or cycle counts. Because this claim is load-bearing for the paper's contribution, the absence leaves the central empirical result unverifiable from the abstract.

    Authors: We agree that the abstract would be strengthened by including more specific details. In the revised manuscript, we will update the abstract to reference key quantitative results from the CIFAR-100 experiments, including accuracy metrics with error bars, the specific CLU baselines used for comparison, the leakage measurement protocol, and the number of adaptation cycles evaluated. Given typical abstract length constraints, we will prioritize the most salient metrics while ensuring the full experimental protocols, tables, and figures remain detailed in the Experiments section. revision: yes

  2. Referee: [Method (escape unlearning)] Escape unlearning description: The strategy of maximizing embedding distance for forget classes assumes the retain and new adapters maintain stable decision boundaries without interference or gradient conflicts in a continual regime. No analysis of cross-pathway interactions, retained-class embedding distances, or boundary shifts under the 5% update constraint (applied only to attention layers) is provided, which directly undermines the no-leakage and stability guarantees.

    Authors: We acknowledge this as a valid observation regarding the depth of analysis in the current draft. Although the empirical results demonstrate reduced leakage across cycles, the manuscript does not explicitly analyze cross-pathway interactions or boundary dynamics. In the revision, we will add a new analysis subsection (likely in Experiments or an extended Methods discussion) that quantifies: (i) embedding distances for both forget and retain classes before and after escape unlearning, (ii) potential interference or gradient conflicts among the retain/new/unlearn pathways, and (iii) decision boundary stability under the 5% parameter budget restricted to attention layers. This will include additional metrics, visualizations, and discussion to support the stability claims. revision: yes
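The gradient-conflict measurement promised in item (ii) could take a form like the following. This is a hypothetical sketch, not the authors' planned analysis: it assumes each pathway's objective yields a gradient with respect to some shared quantity (for example a layer's output), represented here as a flat list, and it treats a negative cosine similarity as a conflict. The function names and example vectors are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two flat gradient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def conflicts(grads):
    """Return pathway pairs whose gradients point in opposing directions."""
    names = sorted(grads)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if cosine(grads[a], grads[b]) < 0.0]

# Illustrative gradients only; they are not measurements from the paper.
grads = {
    "retain": [1.0, 0.0],
    "new": [0.0, 1.0],
    "unlearn": [-1.0, 0.5],
}
print(conflicts(grads))
```

Logging this per cycle would show whether the unlearn objective increasingly opposes the retain objective as adaptation proceeds.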

Circularity Check

0 steps flagged

No circularity: BID-LoRA is an empirical architecture proposal with no self-referential derivations

full rationale

The paper defines BID-LoRA as a new three-pathway adapter architecture plus escape unlearning, then reports empirical outperformance on CIFAR-100 and CASIA-Face100. No equations, uniqueness theorems, or first-principles results are presented that reduce by construction to quantities defined in terms of the method's own fitted parameters or prior self-citations. The central claims rest on experimental comparisons rather than tautological redefinitions or load-bearing self-references.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 2 invented entities

The central claim rests on the unproven effectiveness of the newly introduced retain/new/unlearn pathways and escape unlearning mechanism; these are postulated to achieve the three CLU goals but lack independent justification beyond the summarized experiments.

free parameters (1)
  • adapter rank and scaling
    Low-rank dimension and scaling factors are design choices that control the 5% parameter budget and must be selected for each base model.
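The link between rank and parameter budget is simple arithmetic, sketched below. The summary does not state how the paper counts its 5 percent, so everything here is an assumption: a low-rank pair on a d_in × d_out matrix is taken to cost rank × (d_in + d_out) parameters, the three pathways are taken to have equal rank, and the backbone shapes and base parameter count are hypothetical.

```python
def lora_params(d_in, d_out, rank):
    """Parameters in one low-rank pair: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)

def adapter_total(shapes, rank, pathways=3):
    """Total adapter parameters across all adapted matrices and pathways."""
    return pathways * sum(lora_params(d_in, d_out, rank) for d_in, d_out in shapes)

def trainable_fraction(base_params, shapes, rank, pathways=3):
    """Adapter parameters as a share of base plus adapter parameters."""
    extra = adapter_total(shapes, rank, pathways)
    return extra / (base_params + extra)

# Hypothetical backbone: 12 blocks, 4 attention projections of 768 x 768 each.
shapes = [(768, 768)] * (12 * 4)
base_params = 86_000_000  # illustrative figure, not taken from the paper

for rank in (4, 8, 16, 32):
    print(rank, round(trainable_fraction(base_params, shapes, rank), 4))
```

The fraction grows roughly linearly with rank, so the stated budget effectively pins down the rank once the base model and the set of adapted matrices are fixed.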
axioms (2)
  • [domain assumption] Independent low-rank adapters applied to attention layers can be trained without mutual interference or base-model degradation.
    Invoked when stating that the three pathways are applied to attention layers while updating only 5% of parameters.
  • [ad hoc to paper] Maximizing embedding distance for forget classes achieves precise deletion without collateral damage to retained knowledge.
    This is the core of the escape unlearning step and is introduced without prior derivation or external validation in the abstract.
invented entities (2)
  • retain, new, and unlearn adapter pathways (no independent evidence)
    purpose: Separate handling of retention, acquisition, and deletion within a single parameter-efficient update.
    New components introduced by the framework; no independent evidence supplied.
  • escape unlearning (no independent evidence)
    purpose: Push forget-class embeddings maximally distant from retained knowledge to enforce deletion.
    Novel unlearning technique proposed in the paper; no independent evidence supplied.
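To show what "maximally distant from retained knowledge" might mean operationally, here is a minimal sketch of a distance-based escape objective. The paper's actual loss is not given in this summary, so this is an assumed stand-in: retained knowledge is represented by per-class centroids, and the score is the mean distance from each forget embedding to its nearest retained centroid. The function names and toy embeddings are hypothetical.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of embedding vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def escape_score(forget_embeddings, retained_by_class):
    """Mean distance from each forget embedding to its nearest retained centroid."""
    centroids = [centroid(pts) for pts in retained_by_class.values()]
    nearest = [min(distance(e, c) for c in centroids) for e in forget_embeddings]
    return sum(nearest) / len(nearest)

def escape_loss(forget_embeddings, retained_by_class):
    """Negated score, so minimizing the loss pushes forget embeddings away."""
    return -escape_score(forget_embeddings, retained_by_class)

# Toy 2-D embeddings; illustrative only.
retained = {"cat": [[0.0, 0.0], [2.0, 0.0]], "dog": [[0.0, 4.0], [2.0, 4.0]]}
before = [[1.0, 1.0]]
after = [[1.0, -3.0]]
print(escape_score(before, retained), escape_score(after, retained))
```

An unbounded distance objective like this one would likely need a cap or normalized embeddings in practice; how the paper handles that is not stated here.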

pith-pipeline@v0.9.0 · 5557 in / 1671 out tokens · 51471 ms · 2026-05-10T15:03:34.329415+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. BackFlush: Knowledge-Free Backdoor Detection and Elimination with Watermark Preservation in Large Language Models

    cs.CR 2026-04 unverdicted novelty 6.0

    BackFlush detects backdoors via susceptibility amplification and eliminates them with RoPE unlearning to reach 1% ASR and 99% clean accuracy while preserving watermarks.
