Machine unlearning: A survey

Xu, H · 2023 · DOI 10.1145/3603620

6 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Look But Don't Touch with Sparse Autoencoders for Unlearning in Diffusion Models

cs.CV · 2026-06-30 · unverdicted · novelty 7.0 · 2 refs

SAEs detect concepts well in diffusion models but fail as direct intervention points for unlearning; a detection-guided patch replacement method yields significantly cleaner erasure results.

MEMOREPAIR: Barrier-First Cascade Repair in Agentic Memory

cs.AI · 2026-05-08 · unverdicted · novelty 7.0

MemoRepair formalizes the cascade update problem in agentic memory and solves it via a min-cut reduction that cuts invalidated-memory exposure to 0% while recovering 91-94% of valid successors at 57-76% of baseline repair cost.

Improving LLM Unlearning Robustness via Random Perturbations

cs.CL · 2025-01-31 · unverdicted · novelty 7.0

LLM unlearning is reframed as inadvertently installing backdoor triggers on forget-tokens; Random Noise Augmentation is introduced as a defense that improves robustness with theoretical guarantees.

CSC: Turning the Adversary's Poison against Itself

cs.CR · 2026-04-23 · unverdicted · novelty 6.0

CSC identifies backdoored samples via early-epoch latent clustering and conceals them by relabeling them to a virtual class, driving attack success rates to near zero on benchmarks with little loss of clean accuracy.

CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization

cs.CL · 2026-04-17 · unverdicted · novelty 6.0

CiPO removes undesired knowledge from both intermediate reasoning steps and final answers in large reasoning models by iteratively optimizing preferences toward valid counterfactual traces while keeping overall reasoning performance intact.

Machine Unlearning on Pre-trained Models by Residual Feature Alignment Using LoRA

cs.LG · 2024-11-13 · unverdicted · novelty 5.0

A LoRA-based residual feature alignment method performs efficient machine unlearning on pre-trained models by targeting zero residuals on retained data and shifted residuals on unlearned data.
