pith. machine review for the scientific record.

arxiv: 2310.12508 · v5 · submitted 2023-10-19 · 💻 cs.LG · cs.AI

Recognition: 2 theorem links · Lean Theorem

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Authors on Pith: no claims yet

Pith reviewed 2026-05-16 17:52 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords machine unlearning · weight saliency · gradient methods · image classification · diffusion models · data erasure · model safety · privacy compliance

The pith

Gradient-based weight saliency enables effective unlearning of data, classes, or concepts in both image classifiers and generators while approaching exact retraining performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Machine unlearning removes the influence of specific training data from a model to satisfy privacy rules or safety requirements. Prior methods often trade off accuracy, stability, or the ability to handle both classification and generation tasks. The paper introduces weight saliency computed from gradients to focus updates on only the parameters most tied to the data being forgotten. This produces the SalUn method that narrows the performance gap to full retraining from scratch. A reader would care because it offers a more practical route to compliance with data-deletion requests across common vision models.

Core claim

SalUn computes weight saliency by examining gradients of the forgetting data and then applies an optimization step that updates primarily those salient weights. The result erases targeted information in image classification models and prevents conditional diffusion models from generating specified concepts. Experiments show stability advantages on random data removal and near-100 percent unlearning accuracy on harmful-image prevention tasks, all while preserving accuracy on retained data.

What carries the argument

Gradient-based weight saliency, which ranks model parameters by their gradient magnitude or influence on the forgetting objective and thereby restricts unlearning updates to those parameters.
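As a rough illustration of this mechanism (not the paper's implementation; the function names, the plain-NumPy setting, and the 50% sparsity level are all assumptions here, following the referee's top-k reading of Eq. (3)), a hard-thresholded saliency mask restricts an unlearning step to the largest-gradient weights:

```python
import numpy as np

def saliency_mask(grad, sparsity=0.5):
    """Select the top fraction of weights by |gradient| on the forgetting loss."""
    k = max(1, int(round(sparsity * grad.size)))
    thresh = np.sort(np.abs(grad).ravel())[-k]
    return (np.abs(grad) >= thresh).astype(float)

def salun_step(weights, grad_forget, lr=1.0, sparsity=0.5):
    """One unlearning update restricted to salient weights (others untouched)."""
    mask = saliency_mask(grad_forget, sparsity)
    return weights - lr * mask * grad_forget, mask

w = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.1, -0.8, 0.05, 0.6])  # gradient of a forgetting objective
w_new, m = salun_step(w, g)
print(m)      # only the two largest-|gradient| positions are selected
print(w_new)  # non-salient weights are left exactly as they were
```

The point of the sketch is the masking: the update direction is whatever the forgetting objective dictates, but only salient coordinates move.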

Load-bearing premise

Gradient information reliably isolates the exact parameters responsible for the forgetting data without causing large unintended changes to retained knowledge.

What would settle it

If a model processed by SalUn still classifies forgotten classes or generates forbidden concepts at rates close to the original trained model on held-out test examples, the claim of effective unlearning would be refuted.
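That falsification criterion can be stated as a small decision rule; the function name and the tolerance value below are hypothetical, not thresholds from the paper:

```python
def forget_quality(acc_original, acc_unlearned, tolerance=0.05):
    """Hypothetical decision rule: the unlearning claim is refuted when the
    scrubbed model's accuracy on held-out forgotten examples stays within
    `tolerance` of the original model's accuracy on those same examples."""
    return "refuted" if acc_original - acc_unlearned <= tolerance else "supported"

print(forget_quality(0.94, 0.91))  # barely dropped -> "refuted"
print(forget_quality(0.94, 0.08))  # collapsed toward chance -> "supported"
```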

Original abstract

With evolving data regulations, machine unlearning (MU) has become an important tool for fostering trust and safety in today's AI models. However, existing MU methods focusing on data and/or weight perspectives often suffer limitations in unlearning accuracy, stability, and cross-domain applicability. To address these challenges, we introduce the concept of 'weight saliency' for MU, drawing parallels with input saliency in model explanation. This innovation directs MU's attention toward specific model weights rather than the entire model, improving effectiveness and efficiency. The resultant method that we call saliency unlearning (SalUn) narrows the performance gap with 'exact' unlearning (model retraining from scratch after removing the forgetting data points). To the best of our knowledge, SalUn is the first principled MU approach that can effectively erase the influence of forgetting data, classes, or concepts in both image classification and generation tasks. For example, SalUn yields a stability advantage in high-variance random data forgetting, e.g., with a 0.2% gap compared to exact unlearning on the CIFAR-10 dataset. Moreover, in preventing conditional diffusion models from generating harmful images, SalUn achieves nearly 100% unlearning accuracy, outperforming current state-of-the-art baselines like Erased Stable Diffusion and Forget-Me-Not. Codes are available at https://github.com/OPTML-Group/Unlearn-Saliency. (WARNING: This paper contains model outputs that may be offensive in nature.)

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes SalUn, a machine unlearning method that computes gradient-based weight saliency on forgetting data to identify and selectively update a subset of model parameters, thereby erasing the influence of specific data points, classes, or concepts. It reports empirical results on image classification (0.2% gap to exact unlearning on CIFAR-10) and conditional diffusion models (near-100% unlearning accuracy, outperforming Erased Stable Diffusion and Forget-Me-Not), claiming to be the first principled approach effective for both domains.

Significance. If the isolation of salient weights holds, the work offers a unified, efficient framework for machine unlearning that narrows the gap to exact retraining while extending to generative models; the open-sourced code at the provided GitHub link is a clear strength that supports reproducibility and further testing.

major comments (2)
  1. [§3.2] §3.2, Eq. (3): the top-k gradient saliency computed solely on forgetting samples assumes these weights encode the target influence in isolation; however, in shared-backbone networks (ResNet/VGG for classification, U-Net for diffusion), gradients may highlight parameters also used by retained classes/concepts, and the manuscript provides no direct measurement of salient-set overlap or degradation when thresholds vary.
  2. [Experiments] Experimental section: reported gaps (0.2% on CIFAR-10, near-100% on diffusion) are presented without error bars, seed-wise stability checks on the saliency computation, or ablations on the saliency threshold k; these omissions make it difficult to confirm that the small advantage over baselines is robust rather than sensitive to initialization or hyperparameter choice.
minor comments (2)
  1. [Abstract] Abstract: the 0.2% stability advantage is stated without naming the exact metric (accuracy? loss?) or reporting variability; add this detail for clarity.
  2. [§3] Notation: ensure consistent use of symbols for saliency scores and update rules across equations and text; a short table summarizing symbols would aid readability.
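The salient-set overlap measurement requested in major comment 1 could be probed with a sketch like the following. Jaccard overlap over top-k index sets is one reasonable choice, not a metric the paper prescribes, and the random gradients below stand in for real ones:

```python
import numpy as np

def topk_set(grad, sparsity=0.1):
    """Indices of the top fraction of weights by |gradient|."""
    k = max(1, int(round(sparsity * grad.size)))
    return set(np.argsort(np.abs(grad).ravel())[-k:].tolist())

def salient_overlap(grad_forget, grad_retain, sparsity=0.1):
    """Jaccard overlap between salient sets from forget vs. retain gradients.
    High overlap would suggest unlearning updates also touch retained knowledge."""
    f = topk_set(grad_forget, sparsity)
    r = topk_set(grad_retain, sparsity)
    return len(f & r) / len(f | r)

rng = np.random.default_rng(0)
gf = rng.normal(size=1000)  # placeholder forget-data gradient
gr = rng.normal(size=1000)  # placeholder retain-data gradient
print(salient_overlap(gf, gf))        # identical gradients: overlap 1.0
print(salient_overlap(gf, gr) < 0.3)  # independent gradients: near-zero overlap
```

In a shared-backbone network the interesting regime is in between: overlap well above the independent baseline would support the referee's concern.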

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and have revised the paper to incorporate additional analyses for greater clarity and robustness.

Point-by-point responses
  1. Referee: [§3.2] §3.2, Eq. (3): the top-k gradient saliency computed solely on forgetting samples assumes these weights encode the target influence in isolation; however, in shared-backbone networks (ResNet/VGG for classification, U-Net for diffusion), gradients may highlight parameters also used by retained classes/concepts, and the manuscript provides no direct measurement of salient-set overlap or degradation when thresholds vary.

    Authors: We appreciate this observation regarding the potential for parameter overlap in shared-backbone architectures. While the saliency computation focuses on forgetting samples to identify the most affected weights, we acknowledge that some shared parameters may exist. To directly address this, the revised manuscript now includes a quantitative analysis of the overlap between the top-k salient weight sets derived from forgetting data versus retained data (or concepts). This overlap is measured across the classification and diffusion experiments and shown to be limited, supporting the targeted nature of the updates. We have also added an ablation on varying the threshold k, reporting both unlearning effectiveness and any degradation in retained performance to demonstrate stability within the chosen operating range. revision: yes

  2. Referee: [Experiments] Experimental section: reported gaps (0.2% on CIFAR-10, near-100% on diffusion) are presented without error bars, seed-wise stability checks on the saliency computation, or ablations on the saliency threshold k; these omissions make it difficult to confirm that the small advantage over baselines is robust rather than sensitive to initialization or hyperparameter choice.

    Authors: We agree that the absence of error bars, multi-seed checks, and k ablations limits the ability to assess robustness. The revised experimental section now reports results averaged over multiple random seeds (with standard deviations shown as error bars) for the primary metrics on CIFAR-10 and the conditional diffusion models. We have also added a dedicated ablation study on the saliency threshold k, illustrating how performance varies with different k values and confirming that the reported gaps to exact unlearning remain stable and small within the selected range. These additions substantiate that the advantages are not artifacts of a single initialization or hyperparameter setting. revision: yes
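The seed-averaged reporting promised in this response can be sketched as follows; the per-seed gap values are invented placeholders, not numbers from the paper:

```python
import statistics

def seed_report(values):
    """Mean and sample standard deviation of a metric across random seeds."""
    mean = statistics.mean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std

# Hypothetical per-seed gaps to exact retraining (%), NOT the paper's results.
gaps = [0.21, 0.18, 0.24, 0.19, 0.22]
mean, std = seed_report(gaps)
print(f"gap to retrain: {mean:.2f} ± {std:.2f} %")
```

A claimed 0.2% advantage is only meaningful if the standard deviation across seeds is comfortably smaller than the gap itself.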

Circularity Check

0 steps flagged

No significant circularity: SalUn is a new algorithmic construction validated against external baselines.

Full rationale

The paper introduces gradient-based weight saliency (Eq. 3 in §3.2) as a novel MU procedure that computes top-k salient weights from forgetting-data gradients and applies targeted updates. Performance is measured directly against exact retraining from scratch and prior MU baselines (e.g., Erased Stable Diffusion) on CIFAR-10, ImageNet, and diffusion models, with reported gaps (0.2% stability) and unlearning accuracy (~100%). No equation reduces the claimed improvement to a fitted hyperparameter by definition, no self-citation chain justifies the core premise, and the uniqueness claim is presented as an empirical observation rather than a theorem derived from prior author work. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The method rests on the domain assumption that gradients with respect to forgetting data identify the weights whose modification will remove influence without global side effects; no new physical entities or free parameters are introduced beyond standard optimization hyperparameters.

axioms (1)
  • domain assumption Gradient-based saliency computed on forgetting data isolates the relevant model weights for unlearning
    Core premise stated in the introduction of the weight-saliency concept.
invented entities (1)
  • weight saliency no independent evidence
    purpose: Directs unlearning updates to specific weights rather than the entire model
    New conceptual construct introduced by the paper; no independent falsifiable prediction supplied beyond the empirical results.
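A toy model makes the ledger's axiom concrete: if the forgetting and retention objectives depend on disjoint weights, a saliency-masked step erases without side effects. Real networks share weights across tasks, which is exactly what this assumption glosses over; everything below is constructed for illustration:

```python
import numpy as np

def forget_loss(w):
    return w[0] ** 2              # depends only on w[0]

def retain_loss(w):
    return (w[1] - 3.0) ** 2      # depends only on w[1]

w = np.array([2.0, 3.0])
grad_forget = np.array([2 * w[0], 0.0])  # analytic gradient of forget_loss
mask = (np.abs(grad_forget) >= np.abs(grad_forget).max()).astype(float)
w_after = w - 0.5 * mask * grad_forget   # one saliency-masked descent step

print(forget_loss(w_after))  # 4.0 -> 0.0: the forget objective is erased
print(retain_loss(w_after))  # stays 0.0: retained knowledge untouched
```

When the two losses instead share coordinates, the same masked step perturbs the retain loss, which is the failure mode the falsification test above is meant to catch.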

pith-pipeline@v0.9.0 · 5590 in / 1278 out tokens · 71419 ms · 2026-05-16T17:52:24.110584+00:00 · methodology


Forward citations

Cited by 17 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Unlearning with Asymmetric Sources: Improved Unlearning-Utility Trade-off with Public Data

    cs.LG 2026-05 unverdicted novelty 7.0

    Asymmetric Langevin Unlearning uses public data to suppress unlearning noise costs by O(1/n_pub²), enabling practical mass unlearning with preserved utility under distribution mismatch.

  2. Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation

    cs.LG 2026-05 conditional novelty 7.0

    Class-level unlearning shortcuts via bias suppression in the classification head; new bias-aware training mechanisms and bias-specific metrics are introduced to diagnose and reduce this dependence.

  3. Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models

    cs.CV 2026-05 unverdicted novelty 7.0

    CoVUBench is the first benchmark framework for evaluating multimodal copyright unlearning in LVLMs via synthetic data, systematic variations, and a dual protocol for forgetting efficacy and utility preservation.

  4. Efficient Unlearning through Maximizing Relearning Convergence Delay

    cs.LG 2026-04 unverdicted novelty 7.0

    The Influence Eliminating Unlearning framework maximizes relearning convergence delay via weight decay and noise injection to remove the influence of a forgetting set while preserving accuracy on retained data.

  5. Is your algorithm unlearning or untraining?

    cs.LG 2026-04 conditional novelty 7.0

    Machine unlearning conflates reversing the influence of specific training examples (untraining) with removing the full underlying distribution or behavior (unlearning).

  6. CURE: Circuit-Aware Unlearning for LLM-based Recommendation

    cs.IR 2026-04 unverdicted novelty 7.0

    CURE disentangles LLM recommendation circuits into forget-specific, retain-specific, and task-shared modules with tailored update rules to achieve more effective unlearning than weighted baselines.

  7. Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning

    cs.AI 2026-05 unverdicted novelty 6.0

    A contrastive visual forgetting technique constrained to the null space of retained knowledge enables targeted unlearning of visual concepts in MLLMs while preserving non-target visual and all textual knowledge.

  8. Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM

    cs.LG 2026-04 unverdicted novelty 6.0

    Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.

  9. IPRU: Input-Perturbation-based Radio Frequency Fingerprinting Unlearning for LAWNs

    eess.SP 2026-04 unverdicted novelty 6.0

    IPRU erases target AAV radio fingerprints via an optimized input perturbation vector, delivering 1.41% unlearning accuracy, 99.41% remaining accuracy, full membership-inference resistance, and 5.79X speedup over retraining.

  10. Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

    cs.CV 2026-04 unverdicted novelty 6.0

    TICoE achieves more precise and faithful concept erasure in text-to-image models by collaborating text and image data through a convex manifold and hierarchical learning, outperforming prior methods.

  11. Class Unlearning via Depth-Aware Removal of Forget-Specific Directions

    cs.CV 2026-04 unverdicted novelty 6.0

    DAMP performs one-shot class unlearning by extracting and projecting out forget-specific residual directions at each network depth using class prototypes and a separability-derived scaling rule.

  12. BID-LoRA: A Parameter-Efficient Framework for Continual Learning and Unlearning

    cs.LG 2026-04 unverdicted novelty 6.0

    BID-LoRA uses bi-directional low-rank adapters with retain/new/unlearn pathways and escape unlearning to enable continual learning and unlearning while minimizing knowledge leakage and parameter updates.

  13. EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure

    cs.CV 2026-04 unverdicted novelty 6.0

    EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.

  14. Bias Redistribution in Visual Machine Unlearning: Does Forgetting One Group Harm Another?

    cs.LG 2026-04 unverdicted novelty 6.0

    Unlearning a demographic group in CLIP models redistributes bias primarily along gender boundaries rather than eliminating it.

  15. Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models

    cs.CV 2026-04 unverdicted novelty 6.0

    Unlearning methods that strongly erase concepts from text-to-image diffusion models consistently degrade performance on attribute binding, spatial reasoning, and counting tasks.

  16. Jellyfish: Zero-Shot Federated Unlearning Scheme with Knowledge Disentanglement

    cs.CR 2026-04 unverdicted novelty 6.0

    Jellyfish enables zero-shot federated unlearning through synthetic proxy data generation, channel-restricted knowledge disentanglement, and a composite loss with repair to forget target data while retaining model utility.

  17. Machine Unlearning for Class Removal through SISA-based Deep Neural Network Architectures

    cs.CV 2026-04 unverdicted novelty 5.0

    A modified SISA architecture with replay and gating achieves effective class removal from trained CNNs on image datasets while preserving accuracy and cutting retraining costs.

Reference graph

Works this paper leans on

206 extracted references · 206 canonical work pages · cited by 17 Pith papers · 17 internal anchors

  1. [1]

    Sanity checks for saliency maps

    Julius Adebayo, Justin Gilmer, Michael Muelly, Ian Goodfellow, Moritz Hardt, and Been Kim. Sanity checks for saliency maps. Advances in neural information processing systems, 31, 2018

  2. [2]

    Gradient surgery for one-shot unlearning on generative model, 2023

    Seohui Bae, Seoyoon Kim, Hyemin Jung, and Woohyung Lim. Gradient surgery for one-shot unlearning on generative model, 2023

  3. [4]

    Nudenet: Neural nets for nudity classification, detection and selective censoring, 2019

    P Bedapudi. Nudenet: Neural nets for nudity classification, detection and selective censoring, 2019

  4. [6]

    Membership inference attacks from first principles

    Nicholas Carlini, Steve Chien, Milad Nasr, Shuang Song, Andreas Terzis, and Florian Tramer. Membership inference attacks from first principles. In 2022 IEEE Symposium on Security and Privacy (SP), pp.\ 1897--1914. IEEE, 2022

  5. [7]

    Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks

    Aditya Chattopadhay, Anirban Sarkar, Prantik Howlader, and Vineeth N Balasubramanian. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp.\ 839--847. IEEE, 2018

  6. [8]

    Graph unlearning

    Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, and Yang Zhang. Graph unlearning. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pp.\ 499--513, 2022 a

  7. [9]

    Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary

    Min Chen, Weizhuo Gao, Gaoyang Liu, Kai Peng, and Chen Wang. Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.\ 7766--7775, 2023

  8. [10]

    Quarantine: Sparsity can uncover the trojan attack trigger for free

    Tianlong Chen, Zhenyu Zhang, Yihua Zhang, Shiyu Chang, Sijia Liu, and Zhangyang Wang. Quarantine: Sparsity can uncover the trojan attack trigger for free. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.\ 598--609, 2022 b

  9. [15]

    Our data, ourselves: Privacy via distributed noise generation

    Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Annual international conference on the theory and applications of cryptographic techniques, pp.\ 486--503. Springer, 2006

  10. [18]

    Making ai forget you: Data deletion in machine learning

    Antonio Ginart, Melody Guan, Gregory Valiant, and James Y Zou. Making ai forget you: Data deletion in machine learning. Advances in neural information processing systems, 32, 2019

  11. [19]

    Eternal sunshine of the spotless net: Selective forgetting in deep networks

    Aditya Golatkar, Alessandro Achille, and Stefano Soatto. Eternal sunshine of the spotless net: Selective forgetting in deep networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.\ 9304--9312, 2020

  12. [20]

    Amnesiac machine learning

    Laura Graves, Vineel Nagisetty, and Vijay Ganesh. Amnesiac machine learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pp.\ 11516--11524, 2021

  13. [24]

    Deep residual learning for image recognition

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 770--778, 2016

  14. [25]

    Selective amnesia: A continual learning approach to forgetting in deep generative models, 2023

    Alvin Heng and Harold Soh. Selective amnesia: A continual learning approach to forgetting in deep generative models, 2023

  15. [27]

    The european union general data protection regulation: what it is and what it means

    Chris Jay Hoofnagle, Bart van der Sloot, and Frederik Zuiderveen Borgesius. The european union general data protection regulation: what it is and what it means. Information & Communications Technology Law, 28(1): 65--98, 2019

  16. [28]

    Fastai: A layered api for deep learning

    Jeremy Howard and Sylvain Gugger. Fastai: A layered api for deep learning. Information, 11(2): 108, 2020

  17. [30]

    Approximate data deletion from machine learning models

    Zachary Izzo, Mary Anne Smart, Kamalika Chaudhuri, and James Zou. Approximate data deletion from machine learning models. In International Conference on Artificial Intelligence and Statistics, pp.\ 2008--2016. PMLR, 2021

  18. [31]

    A data-based perspective on transfer learning

    Saachi Jain, Hadi Salman, Alaa Khaddaj, Eric Wong, Sung Min Park, and Aleksander Madry. A data-based perspective on transfer learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.\ 3613--3622, 2023

  19. [32]

    How can i explain this to you? an empirical study of deep neural network explanation methods

    Jeya Vikranth Jeyakumar, Joseph Noor, Yu-Hsi Cheng, Luis Garcia, and Mani Srivastava. How can i explain this to you? an empirical study of deep neural network explanation methods. Advances in Neural Information Processing Systems, 33: 4211--4222, 2020

  20. [34]

    Understanding black-box predictions via influence functions

    Pang Wei Koh and Percy Liang. Understanding black-box predictions via influence functions. In International conference on machine learning, pp.\ 1885--1894. PMLR, 2017

  21. [35]

    Learning multiple layers of features from tiny images

    Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009

  22. [36]

    Tiny imagenet visual recognition challenge

    Ya Le and Xuan Yang. Tiny imagenet visual recognition challenge. CS 231N, 7(7): 3, 2015

  23. [39]

    Swin transformer: Hierarchical vision transformer using shifted windows

    Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pp.\ 10012--10022, 2021

  24. [40]

    Locating and editing factual associations in gpt

    Kevin Meng, David Bau, Alex Andonian, and Yonatan Belinkov. Locating and editing factual associations in gpt. Advances in Neural Information Processing Systems, 35: 17359--17372, 2022

  25. [42]

    Descent-to-delete: Gradient-based methods for machine unlearning

    Seth Neel, Aaron Roth, and Saeed Sharifi-Malvajerdi. Descent-to-delete: Gradient-based methods for machine unlearning. In Algorithmic Learning Theory, pp.\ 931--962. PMLR, 2021

  26. [43]

    Reading digits in natural images with unsupervised feature learning

    Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. 2011

  27. [45]

    Proximal algorithms

    Neal Parikh, Stephen Boyd, et al. Proximal algorithms. Foundations and Trends in Optimization, 1(3): 127--239, 2014

  28. [49]

    Scaling vision with sparse mixture of experts

    Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, and Neil Houlsby. Scaling vision with sparse mixture of experts. Advances in Neural Information Processing Systems, 34: 8583--8595, 2021

  29. [50]

    High-resolution image synthesis with latent diffusion models

    Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.\ 10684--10695, 2022

  30. [51]

    Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models, 2023

    Patrick Schramowski, Manuel Brack, Björn Deiseroth, and Kristian Kersting. Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models, 2023

  31. [53]

    Laion-5b: An open large-scale dataset for training next generation image-text models

    Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. Laion-5b: An open large-scale dataset for training next generation image-text models. Advances in Neural Information Processing Systems, 35: 25278--25294, 2022

  32. [54]

    Remember what you want to forget: Algorithms for machine unlearning

    Ayush Sekhari, Jayadev Acharya, Gautam Kamath, and Ananda Theertha Suresh. Remember what you want to forget: Algorithms for machine unlearning. Advances in Neural Information Processing Systems, 34: 18075--18086, 2021

  33. [55]

    Grad-CAM: Visual explanations from deep networks via gradient-based localization

    Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pp.\ 618--626, 2017

  34. [60]

    Diffusion art or digital forgery? investigating data replication in diffusion models

    Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, and Tom Goldstein. Diffusion art or digital forgery? investigating data replication in diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.\ 6048--6058, 2023

  35. [62]

    Axiomatic attribution for deep networks

    Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp.\ 3319--3328. JMLR. org, 2017

  36. [63]

    Unrolling sgd: Understanding factors influencing machine unlearning

    Anvith Thudi, Gabriel Deza, Varun Chandrasekaran, and Nicolas Papernot. Unrolling sgd: Understanding factors influencing machine unlearning. In 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), pp.\ 303--319. IEEE, 2022 a

  37. [64]

    On the necessity of auditable algorithmic definitions for machine unlearning

    Anvith Thudi, Hengrui Jia, Ilia Shumailov, and Nicolas Papernot. On the necessity of auditable algorithmic definitions for machine unlearning. In 31st USENIX Security Symposium (USENIX Security 22), pp.\ 4007--4022, 2022 b

  38. [65]

    Machine unlearning via algorithmic stability

    Enayat Ullah, Tung Mai, Anup Rao, Ryan A Rossi, and Raman Arora. Machine unlearning via algorithmic stability. In Conference on Learning Theory, pp.\ 4126--4142. PMLR, 2021

  39. [66]

    Federated unlearning via class-discriminative pruning

    Junxiao Wang, Song Guo, Xin Xie, and Heng Qi. Federated unlearning via class-discriminative pruning. In Proceedings of the ACM Web Conference 2022, pp.\ 622--632, 2022

  40. [68]

    Leveraging sparse linear layers for debuggable deep networks

    Eric Wong, Shibani Santurkar, and Aleksander Madry. Leveraging sparse linear layers for debuggable deep networks. In International Conference on Machine Learning, pp.\ 11205--11216. PMLR, 2021

  41. [69]

    Federated unlearning: Guarantee the right of clients to forget

    Leijie Wu, Song Guo, Junxiao Wang, Zicong Hong, Jie Zhang, and Yaohong Ding. Federated unlearning: Guarantee the right of clients to forget. IEEE Network, 36(5): 129--135, 2022

  42. [71]

    Visualizing and understanding convolutional networks

    Matthew D Zeiler and Rob Fergus. Visualizing and understanding convolutional networks. In European conference on computer vision, pp.\ 818--833. Springer, 2014

  43. [74]

    Learning deep features for discriminative localization

    B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.\ 2921--2929, 2016

  44. [76]

    Can sensitive information be deleted from LLMs? Objectives for defending against extraction attacks

    Vaidehi Patil, Peter Hase, and Mohit Bansal. Can sensitive information be deleted from LLMs? Objectives for defending against extraction attacks. arXiv preprint arXiv:2309.17410, 2023

  45. [77]

    Understanding instance-level impact of fairness constraints

    Understanding instance-level impact of fairness constraints. In International Conference on Machine Learning. PMLR, 2022

  46. [78]

    Canadian privacy law: The personal information protection and electronic documents act (PIPEDA)

    Canadian privacy law: The personal information protection and electronic documents act (PIPEDA). Int'l. In-House Counsel J., 2008

  47. [79]

    Towards modular machine learning solution development: Benefits and trade-offs

    Towards modular machine learning solution development: Benefits and trade-offs. arXiv preprint arXiv:2301.09753, 2023

  48. [82]

    SmoothGrad: removing noise by adding noise

    Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda Viégas, and Martin Wattenberg. SmoothGrad: removing noise by adding noise. arXiv preprint arXiv:1706.03825, 2017

  49. [84]

    Prompt certified machine unlearning with randomized gradient smoothing and quantization

    Prompt certified machine unlearning with randomized gradient smoothing and quantization. Advances in Neural Information Processing Systems, 2022

  50. [86]

    Striving for simplicity: The all convolutional net

    Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, and Martin Riedmiller. Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806, 2014

  51. [87]

    Deep inside convolutional networks: Visualising image classification models and saliency maps

    Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013

  52. [90]

    RISE: Randomized input sampling for explanation of black-box models

    Vitali Petsiuk, Abir Das, and Kate Saenko. RISE: Randomized input sampling for explanation of black-box models. arXiv preprint arXiv:1806.07421, 2018

  53. [91]

    Studying large language model generalization with influence functions

    Studying large language model generalization with influence functions. arXiv preprint arXiv:2308.03296, 2023

  54. [92]

    Data selection for language models via importance resampling

    Sang Michael Xie, Shibani Santurkar, Tengyu Ma, and Percy Liang. Data selection for language models via importance resampling. arXiv preprint arXiv:2302.03169, 2023

  55. [94]

    Datamodels: Predicting predictions from training data

    Andrew Ilyas, Sung Min Park, Logan Engstrom, Guillaume Leclerc, and Aleksander Madry. Datamodels: Predicting predictions from training data. arXiv preprint arXiv:2202.00622, 2022

  56. [96]

    arXiv preprint arXiv:2303.14186

    Trak: Attributing model behavior at scale , author=. arXiv preprint arXiv:2303.14186 , year=

  66. [97]

    Selvaraju, Ramprasaath R and Cogswell, Michael and Das, Abhishek and Vedantam, Ramakrishna and Parikh, Devi and Batra, Dhruv , booktitle=. Grad-

  67. [98]

    Proceedings of the 34th International Conference on Machine Learning-Volume 70 , pages=

    Axiomatic attribution for deep networks , author=. Proceedings of the 34th International Conference on Machine Learning-Volume 70 , pages=. 2017 , organization=

  68. [99]

    2022 IEEE Symposium on Security and Privacy (SP) , pages=

    Membership inference attacks from first principles , author=. 2022 IEEE Symposium on Security and Privacy (SP) , pages=. 2022 , organization=

  69. [100]

    2022 IEEE International Conference on Knowledge Graph (ICKG) , pages=

    Certified Data Removal in Sum-Product Networks , author=. 2022 IEEE International Conference on Knowledge Graph (ICKG) , pages=. 2022 , organization=

  70. [101]

    IEEE Network , volume=

    Federated unlearning: Guarantee the right of clients to forget , author=. IEEE Network , volume=. 2022 , publisher=

  71. [102]

    arXiv preprint arXiv:2307.14754 , year=

    Fair Machine Unlearning: Data Removal while Mitigating Disparities , author=. arXiv preprint arXiv:2307.14754 , year=

  72. [103]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

    A Data-Based Perspective on Transfer Learning , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

  73. [104]

    Advances in Neural Information Processing Systems , volume=

    Scaling vision with sparse mixture of experts , author=. Advances in Neural Information Processing Systems , volume=

  74. [105]

    International conference on machine learning , pages=

    Understanding black-box predictions via influence functions , author=. International conference on machine learning , pages=. 2017 , organization=

  75. [106]

    IEEE Transactions on Pattern Analysis and Machine Intelligence , year=

    Dataset security for machine learning: Data poisoning, backdoor attacks, and defenses , author=. IEEE Transactions on Pattern Analysis and Machine Intelligence , year=

  76. [107]

    International Conference on Artificial Intelligence and Statistics , pages=

    Approximate data deletion from machine learning models , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2021 , organization=

  77. [108]

    Neural Networks , volume=

    Continual lifelong learning with neural networks: A review , author=. Neural Networks , volume=. 2019 , publisher=

  78. [109]

    1982 , publisher=

    Residuals and influence in regression , author=. 1982 , publisher=

  79. [110]

    Foundations and Trends

    Optimization with sparsity-inducing penalties , author=. Foundations and Trends. 2012 , publisher=

  80. [111]

    Advances in Neural Information Processing Systems , year=

    Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting , author=. Advances in Neural Information Processing Systems , year=

Showing first 80 references.