Quantization-Robust LLM Unlearning via Low-Rank Adaptation

2026 · cs.LG · arXiv 2602.13151

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Large Language Model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. However, aggressive low-bit PTQ can mask unlearning updates, causing quantized models to revert to pre-unlearning behavior. We show that standard full-parameter fine-tuning often induces parameter changes that are too small to survive 4-bit quantization. We propose quantization-robust unlearning via low-rank adaptation (LoRA): we freeze the base model and concentrate unlearning into trainable adapters so that the effective update is preserved after quantization. On Llama-2-7B evaluated on the MUSE dataset (BOOKS and NEWS), LoRA improves 4-bit utility by up to 7.93 points (NPO+GDR on BOOKS: 50.17 to 58.10) and yields higher 4-bit utility on NEWS for GA+GDR (40.06 to 44.82, an increase of 4.76). LoRA also substantially reduces privacy leakage under 4-bit PTQ: for GA+KLR on BOOKS, for example, PrivLeak moves from -25.68 to -5.86 (closer to the ideal value of 0), while maintaining strong forgetting (VerMem and KnowMem near 0). Using LoRA for machine unlearning is therefore beneficial when deployment requires quantization.
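The masking effect described in the abstract can be illustrated with a toy NumPy sketch. It is not the paper's method or experimental setup. It assumes a simple symmetric per-tensor round-to-nearest quantizer (the abstract does not name the PTQ scheme), assumes the low-rank update is merged into the weights before quantization, and uses made-up update magnitudes. The names `quantize_rtn` and `fraction_codes_changed` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def quantize_rtn(w, bits=4):
    """Symmetric per-tensor round-to-nearest quantization to signed integer codes."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit
    scale = np.abs(w).max() / qmax          # one scale shared by the whole tensor
    codes = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return codes, scale

def fraction_codes_changed(w_before, w_after, bits=4):
    """Share of weights whose quantized code differs after the update."""
    codes_before, _ = quantize_rtn(w_before, bits)
    codes_after, _ = quantize_rtn(w_after, bits)
    return float(np.mean(codes_before != codes_after))

rng = np.random.default_rng(0)
d, rank = 256, 8
w = rng.normal(0.0, 0.02, size=(d, d))      # stand-in for a pretrained weight matrix

# Assumption: full fine-tuning with a small learning rate gives tiny dense changes.
delta_full = rng.normal(0.0, 1e-5, size=(d, d))

# Assumption: a merged low-rank update B @ A with entries comparable to the weights.
b = rng.normal(0.0, 0.05, size=(d, rank))
a = rng.normal(0.0, 0.14, size=(rank, d))
delta_lora = b @ a

changed_full = fraction_codes_changed(w, w + delta_full)
changed_lora = fraction_codes_changed(w, w + delta_lora)
print(f"codes changed, small dense update:     {changed_full:.4f}")
print(f"codes changed, merged low-rank update: {changed_lora:.4f}")
```

Under these assumed magnitudes, almost none of the 4-bit codes move for the small dense update, so the quantized matrix is nearly identical to the quantized original, while the larger low-rank update changes a substantial share of the codes. The sketch only shows the rounding behavior; it says nothing about forgetting quality or utility.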

representative citing papers

Null-Space Constrained Low-Rank Adaptation for Response-Specified Large Language Model Unlearning

cs.AI · 2026-06-09 · unverdicted · novelty 5.0

NSRU constrains LoRA updates via null-space projection of retain subspaces to jointly optimize safe-target learning, undesired-response suppression, and retention in LLM unlearning.

