Targeting minor components in LLM representations during unlearning yields substantially better resistance to relearning attacks than prior methods.
The reported Relearn accuracy corresponds to the maximum smoothed accuracy across the relearning trajectory, as some attack runs may exceed the optimal number of epochs.
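The metric described above can be sketched in a few lines. This is only an illustration: the source does not say how the accuracy curve is smoothed, so the trailing moving average, the `window` parameter, and the function name `max_smoothed_accuracy` are all assumptions made for this example.

```python
def max_smoothed_accuracy(accuracies, window=3):
    """Peak of a smoothed per-epoch accuracy curve.

    Assumption: smoothing is a trailing moving average of width `window`;
    the source does not specify the smoothing method.
    """
    if not accuracies:
        raise ValueError("accuracies must be non-empty")
    if window < 1:
        raise ValueError("window must be >= 1")

    smoothed = []
    for i in range(len(accuracies)):
        # Average the current epoch with up to window - 1 preceding epochs.
        start = max(0, i - window + 1)
        chunk = accuracies[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))

    # Report the best point on the trajectory, not the final epoch.
    return max(smoothed)


# Hypothetical per-epoch accuracies of one relearning attack run.
trajectory = [0.2, 0.4, 0.6, 0.5, 0.3]
peak = max_smoothed_accuracy(trajectory, window=2)
```

Taking the maximum over the whole trajectory, rather than the last epoch, means a run that continues past its best epoch is still scored at its peak.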
Cited by 1 Pith paper (field: cs.CL; year: 2026; verdict: UNVERDICTED).
Representative citing papers:
- Robust LLM Unlearning Against Relearning Attacks: The Minor Components in Representations Matter