SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Chongyu Fan, Jiancheng Liu, Yihua Zhang, Eric Wong, Dennis Wei, Sijia Liu · 2023 · cs.LG · arXiv 2310.12508

28 Pith papers cite this work. Polarity classification is still indexing.

abstract

With evolving data regulations, machine unlearning (MU) has become an important tool for fostering trust and safety in today's AI models. However, existing MU methods focusing on data and/or weight perspectives often suffer limitations in unlearning accuracy, stability, and cross-domain applicability. To address these challenges, we introduce the concept of 'weight saliency' for MU, drawing parallels with input saliency in model explanation. This innovation directs MU's attention toward specific model weights rather than the entire model, improving effectiveness and efficiency. The resultant method, which we call saliency unlearning (SalUn), narrows the performance gap with 'exact' unlearning (model retraining from scratch after removing the forgetting data points). To the best of our knowledge, SalUn is the first principled MU approach that can effectively erase the influence of forgetting data, classes, or concepts in both image classification and generation tasks. For example, SalUn yields a stability advantage in high-variance random data forgetting, e.g., with a 0.2% gap compared to exact unlearning on the CIFAR-10 dataset. Moreover, in preventing conditional diffusion models from generating harmful images, SalUn achieves nearly 100% unlearning accuracy, outperforming current state-of-the-art baselines like Erased Stable Diffusion and Forget-Me-Not. Code is available at https://github.com/OPTML-Group/Unlearn-Saliency. (WARNING: This paper contains model outputs that may be offensive in nature.)
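The idea of restricting unlearning to salient weights can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked in the abstract for that). It assumes a toy logistic model, a saliency score equal to the magnitude of the forgetting-loss gradient, and a top-fraction threshold. The names `forget_loss_grad`, `saliency_mask`, `masked_update`, and `keep_ratio` are hypothetical and chosen only for illustration.

```python
import math


def forget_loss_grad(weights, forget_set):
    """Gradient of the mean logistic loss on the forgetting set w.r.t. each weight."""
    grad = [0.0] * len(weights)
    for x, y in forget_set:
        z = sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
        for j, xi in enumerate(x):
            grad[j] += (p - y) * xi / len(forget_set)
    return grad


def saliency_mask(grad, keep_ratio=0.5):
    """Binary mask: 1 where |gradient| is within the top keep_ratio fraction (assumed rule)."""
    k = max(1, int(round(keep_ratio * len(grad))))
    threshold = sorted((abs(g) for g in grad), reverse=True)[k - 1]
    return [1 if abs(g) >= threshold else 0 for g in grad]


def masked_update(weights, update, mask):
    """Apply an unlearning update only to salient weights; the rest stay unchanged."""
    return [w + m * u for w, u, m in zip(weights, update, mask)]


# Toy data: features 1 and 3 are always zero, so their weights get zero gradient.
weights = [0.5, -0.2, 0.1, 0.0]
forget_set = [([1.0, 0.0, 2.0, 0.0], 1), ([0.5, 0.0, 1.0, 0.0], 0)]

grad = forget_loss_grad(weights, forget_set)
mask = saliency_mask(grad, keep_ratio=0.5)
# Placeholder update; a real method would derive it from an unlearning objective.
new_weights = masked_update(weights, [1.0] * len(weights), mask)
print(mask)  # [1, 0, 1, 0]
```

In this sketch, the mask leaves the two weights that have no influence on the forgetting data untouched, which is the sense in which attention is directed "toward specific model weights rather than the entire model".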

citation-role summary

background 3 · baseline 1

citation-polarity summary

background 3 · baseline 1

representative citing papers

PIU: Proximity-guided Identity Unlearning in ID-Conditioned Diffusion Models

cs.CV · 2026-05-21 · unverdicted · novelty 7.0

PIU suppresses target identity generation in Arc2Face by replacing it with a proximity-selected anchor identity through localized fine-tuning of cross-attention layers while preserving output quality for other identities.

Machine Unlearning for Masked Diffusion Language Models

cs.CL · 2026-05-18 · unverdicted · novelty 7.0

MDU minimizes forward KL divergence from prompt-conditional to prompt-masked unconditional predictions at masked positions to unlearn knowledge in MDLMs while trading off privacy and utility via temperature scaling.

Classification-Head Bias in Class-Level Machine Unlearning: Diagnosis, Mitigation, and Evaluation

cs.LG · 2026-05-09 · conditional · novelty 7.0

Class-level unlearning shortcuts via bias suppression in the classification head; new bias-aware training mechanisms and bias-specific metrics are introduced to diagnose and reduce this dependence.

Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models

cs.CV · 2026-05-05 · unverdicted · novelty 7.0

CoVUBench is the first benchmark framework for evaluating multimodal copyright unlearning in LVLMs via synthetic data, systematic variations, and a dual protocol for forgetting efficacy and utility preservation.

Efficient Unlearning through Maximizing Relearning Convergence Delay

cs.LG · 2026-04-10 · unverdicted · novelty 7.0

The Influence Eliminating Unlearning framework maximizes relearning convergence delay via weight decay and noise injection to remove the influence of a forgetting set while preserving accuracy on retained data.

Is your algorithm unlearning or untraining?

cs.LG · 2026-04-09 · conditional · novelty 7.0

Machine unlearning conflates reversing the influence of specific training examples (untraining) with removing the full underlying distribution or behavior (unlearning).

CURE:Circuit-Aware Unlearning for LLM-based Recommendation

cs.IR · 2026-04-04 · unverdicted · novelty 7.0

CURE disentangles LLM recommendation circuits into forget-specific, retain-specific, and task-shared modules with tailored update rules to achieve more effective unlearning than weighted baselines.

Multi-Objective Reference-Aligned Machine Unlearning

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

RAUL is a multi-objective unlearning framework using bounded KL alignment to a reference distribution and Jacobian descent that reports closer performance to full retraining than single-objective baselines.

Unlearning in Diffusion Models: A Unified Framework with KL Divergence and Likelihood Constraints

cs.LG · 2026-05-29 · unverdicted · novelty 6.0

A constrained optimization framework for diffusion model unlearning via KL and likelihood constraints, with duality results and reported better retention-unlearning tradeoffs than weight-based baselines.

Erased but Exploitable: Black-box Embedding-Aware Prompting Against Unlearned Text-to-Image Diffusion Models

cs.CV · 2026-05-25 · unverdicted · novelty 6.0

BEAP is a black-box embedding-aware prompting attack using LLM-guided search that raises attack success rate over 60% against unlearned diffusion models while keeping prompts undetectable.

CPC-VAR:Continual Personalized and Compositional Generation in Visual Autoregressive Models

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

CPC-VAR adds Gradient-based Concept Neuron Selection for continual single-concept learning and a context-aware multi-branch composition strategy to reduce forgetting and entanglement in VAR-based personalized image generation.

Null Space Constrained Contrastive Visual Forgetting for MLLM Unlearning

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

A contrastive visual forgetting technique constrained to the null space of retained knowledge enables targeted unlearning of visual concepts in MLLMs while preserving non-target visual and all textual knowledge.

Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM

cs.LG · 2026-04-28 · unverdicted · novelty 6.0

Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.

IPRU: Input-Perturbation-based Radio Frequency Fingerprinting Unlearning for LAWNs

eess.SP · 2026-04-27 · unverdicted · novelty 6.0

IPRU erases target AAV radio fingerprints via an optimized input perturbation vector, delivering 1.41% unlearning accuracy, 99.41% remaining accuracy, full membership-inference resistance, and 5.79X speedup over retraining.

Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

cs.CV · 2026-04-17 · unverdicted · novelty 6.0

TICoE achieves more precise and faithful concept erasure in text-to-image models by collaborating text and image data through a convex manifold and hierarchical learning, outperforming prior methods.

Class Unlearning via Depth-Aware Removal of Forget-Specific Directions

cs.CV · 2026-04-16 · unverdicted · novelty 6.0 · 2 refs

DAMP performs one-shot class unlearning by depth-aware projection removal of forget-specific directions, producing forgetting behavior closer to retraining from scratch than prior methods on image classification tasks.

BID-LoRA: A Parameter-Efficient Framework for Continual Learning and Unlearning

cs.LG · 2026-04-14 · unverdicted · novelty 6.0

BID-LoRA uses bi-directional low-rank adapters with retain/new/unlearn pathways and escape unlearning to enable continual learning and unlearning while minimizing knowledge leakage and parameter updates.

EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.

Bias Redistribution in Visual Machine Unlearning: Does Forgetting One Group Harm Another?

cs.LG · 2026-04-09 · unverdicted · novelty 6.0

Unlearning a demographic group in CLIP models redistributes bias primarily along gender boundaries rather than eliminating it.

Erasure or Erosion? Evaluating Compositional Degradation in Unlearned Text-To-Image Diffusion Models

cs.CV · 2026-04-06 · unverdicted · novelty 6.0

Unlearning methods that strongly erase concepts from text-to-image diffusion models consistently degrade performance on attribute binding, spatial reasoning, and counting tasks.

Jellyfish: Zero-Shot Federated Unlearning Scheme with Knowledge Disentanglement

cs.CR · 2026-04-05 · unverdicted · novelty 6.0

Jellyfish enables zero-shot federated unlearning through synthetic proxy data generation, channel-restricted knowledge disentanglement, and a composite loss with repair to forget target data while retaining model utility.

Forget-It-All: Multi-Concept Machine Unlearning via Concept-Aware Neuron Masking

cs.CV · 2026-01-07 · unverdicted · novelty 6.0

FIA uses contrastive concept saliency and temporal-spatial neuron identification to build unified masks that erase multiple target concepts while preserving general generation quality in diffusion models.

Exploring Nonlinear Pathway in Parameter Space for Machine Unlearning

cs.AI · 2025-05-16 · unverdicted · novelty 6.0

MCU applies mode connectivity to trace nonlinear unlearning pathways in parameter space, adds a parameter mask and adaptive penalty, and produces a range of unlearning models that plug into existing methods.

Orthogonal Concept Erasure for Diffusion Models

cs.AI · 2026-05-27 · unverdicted · novelty 5.0

OCE reformulates editing-based concept erasure in diffusion models as multiplicative orthogonal transformations on neuron parameters to decouple concept direction from magnitude and angular geometry.
