SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

28 Pith papers cite this work. Polarity classification is still indexing.

abstract

With evolving data regulations, machine unlearning (MU) has become an important tool for fostering trust and safety in today's AI models. However, existing MU methods focusing on data and/or weight perspectives often suffer limitations in unlearning accuracy, stability, and cross-domain applicability. To address these challenges, we introduce the concept of 'weight saliency' for MU, drawing parallels with input saliency in model explanation. This innovation directs MU's attention toward specific model weights rather than the entire model, improving effectiveness and efficiency. The resultant method that we call saliency unlearning (SalUn) narrows the performance gap with 'exact' unlearning (model retraining from scratch after removing the forgetting data points). To the best of our knowledge, SalUn is the first principled MU approach that can effectively erase the influence of forgetting data, classes, or concepts in both image classification and generation tasks. For example, SalUn yields a stability advantage in high-variance random data forgetting, with a 0.2% gap compared to exact unlearning on the CIFAR-10 dataset. Moreover, in preventing conditional diffusion models from generating harmful images, SalUn achieves nearly 100% unlearning accuracy, outperforming current state-of-the-art baselines like Erased Stable Diffusion and Forget-Me-Not. Code is available at https://github.com/OPTML-Group/Unlearn-Saliency. (WARNING: This paper contains model outputs that may be offensive in nature.)
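
The abstract describes restricting unlearning to salient weights rather than updating the whole model. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that saliency is the magnitude of the forgetting-loss gradient and that a hard threshold selects the salient weights. The function names (`saliency_mask`, `masked_update`), the threshold value, and the toy numbers are all hypothetical.

```python
import numpy as np


def saliency_mask(grad_forget, threshold):
    """Return a 0/1 mask marking weights whose forgetting-loss
    gradient magnitude is at least `threshold` (assumed criterion)."""
    return (np.abs(grad_forget) >= threshold).astype(grad_forget.dtype)


def masked_update(theta_orig, delta, mask):
    """Apply the update `delta` only where mask == 1;
    all other weights keep their original values."""
    return theta_orig + mask * delta


# Toy values, chosen only for illustration.
theta = np.array([0.5, -1.0, 2.0, 0.1])       # original weights
grad_f = np.array([0.01, -0.8, 0.3, -0.02])   # gradient of the forgetting loss
delta = np.array([0.3, 0.3, -0.3, -0.3])      # placeholder unlearning update

mask = saliency_mask(grad_f, threshold=0.1)
theta_u = masked_update(theta, delta, mask)

print(mask)     # which weights are treated as salient
print(theta_u)  # weights after the masked update
```

In this sketch only the second and third weights change, because only their gradient magnitudes reach the threshold. How the update itself is computed is left open; any unlearning procedure could supply `delta`.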

hub tools

citation-role summary

background: 3 · baseline: 1

citation-polarity summary

years

2026: 26 · 2025: 2

representative citing papers

Machine Unlearning for Masked Diffusion Language Models

cs.CL · 2026-05-18 · unverdicted · novelty 7.0

MDU minimizes forward KL divergence from prompt-conditional to prompt-masked unconditional predictions at masked positions to unlearn knowledge in MDLMs while trading off privacy and utility via temperature scaling.

Is your algorithm unlearning or untraining?

cs.LG · 2026-04-09 · conditional · novelty 7.0

Machine unlearning conflates reversing the influence of specific training examples (untraining) with removing the full underlying distribution or behavior (unlearning).

CURE: Circuit-Aware Unlearning for LLM-based Recommendation

cs.IR · 2026-04-04 · unverdicted · novelty 7.0

CURE disentangles LLM recommendation circuits into forget-specific, retain-specific, and task-shared modules with tailored update rules to achieve more effective unlearning than weighted baselines.

Multi-Objective Reference-Aligned Machine Unlearning

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

RAUL is a multi-objective unlearning framework using bounded KL alignment to a reference distribution and Jacobian descent that reports closer performance to full retraining than single-objective baselines.

Class Unlearning via Depth-Aware Removal of Forget-Specific Directions

cs.CV · 2026-04-16 · unverdicted · novelty 6.0 · 2 refs

DAMP performs one-shot class unlearning by depth-aware projection removal of forget-specific directions, producing forgetting behavior closer to retraining from scratch than prior methods on image classification tasks.

Orthogonal Concept Erasure for Diffusion Models

cs.AI · 2026-05-27 · unverdicted · novelty 5.0

OCE reformulates editing-based concept erasure in diffusion models as multiplicative orthogonal transformations on neuron parameters to decouple concept direction from magnitude and angular geometry.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • Machine Unlearning for Masked Diffusion Language Models cs.CL · 2026-05-18 · unverdicted · none · ref 23 · internal anchor

    MDU minimizes forward KL divergence from prompt-conditional to prompt-masked unconditional predictions at masked positions to unlearn knowledge in MDLMs while trading off privacy and utility via temperature scaling.