Intrinsic Test of Unlearning Using Parametric Knowledge Traces

Yihuai Hong, Lei Yu, Haiqin Yang, Shauli Ravfogel, Mor Geva · 2025 · DOI 10.18653/v1/2025.emnlp-main.985

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Measuring the Depth of LLM Unlearning via Activation Patching

cs.CL · 2026-05-23 · unverdicted · novelty 7.0

Introduces Unlearning Depth Score (UDS) via activation patching to quantify LLM unlearning depth and claims it outperforms 20 other metrics in faithfulness and robustness on 150 models.

LMs as Task-Specific Knowledge Bases: An Interpretability Analysis

cs.CL · 2026-06-25 · unverdicted · novelty 6.0

LMs store facts in task-specific parameter subsets, shown by inconsistent emergence across tasks during training and distinct localized parameters for the same fact.

Don't Forget Your Embeddings: Robust Knowledge Erasure via Precise Editing of Embeddings

cs.CL · 2026-06-02 · unverdicted · novelty 6.0

EMBER augments existing erasure methods by precisely removing concept features from embeddings via sparse matrix factorization, cutting relearning recovery to 35% on Llama-3.1-8B from 70-76%.

citing papers explorer

Measuring the Depth of LLM Unlearning via Activation Patching · cs.CL · 2026-05-23 · unverdicted · none · ref 14
LMs as Task-Specific Knowledge Bases: An Interpretability Analysis · cs.CL · 2026-06-25 · unverdicted · none · ref 7
Don't Forget Your Embeddings: Robust Knowledge Erasure via Precise Editing of Embeddings · cs.CL · 2026-06-02 · unverdicted · none · ref 2
