LACUNA is a new testbed that injects PII into predefined model parameters to benchmark the localization precision of LLM unlearning methods, revealing that SOTA approaches are imprecise despite strong output performance.
Do Localization Methods Actually Localize Memorized Data in LLM s? A Tale of Two Benchmarks
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3representative citing papers
Output vector editing on MLP neurons suppresses memorization in LLMs up to 87.9% on 6831 sequences in OLMo-7B with a 2.7x gap over zero ablation, ensemble covering 96.5%.
Empirical benchmarks show distribution similarity between adaptation and pretraining data increases practical privacy leakage in DP-adapted LLMs at fixed theoretical guarantees, with LoRA providing strongest protection for OOD cases.
citing papers explorer
-
LACUNA: A Testbed for Evaluating Localization Precision for LLM Unlearning
LACUNA is a new testbed that injects PII into predefined model parameters to benchmark the localization precision of LLM unlearning methods, revealing that SOTA approaches are imprecise despite strong output performance.
-
Output Vector Editing for Memorization Mitigation in Large Language Models
Output vector editing on MLP neurons suppresses memorization in LLMs up to 87.9% on 6831 sequences in OLMo-7B with a 2.7x gap over zero ablation, ensemble covering 96.5%.