NPO enables stable unlearning of 50% or more of the training data in LLMs on TOFU by making collapse exponentially slower than under gradient ascent, preserving sensible outputs where prior methods fail.
hub
Certified data removal from machine learning models
26 Pith papers cite this work.
representative citing papers
TRACER uses token reassignment for concept-related items plus a coherence regularizer to unlearn specific concepts in generative recommendation while preserving utility better than baselines.
For any ρ>0 there exists a ρ-TV-stable RL algorithm for tabular MDPs supporting exact unlearning at expected cost ρ√(ln T) of retraining from scratch, with regret O(H²√(SAT)+H³S²A+H^{2.5}S²A/ρ) and matching lower bound Ω(H√(SAT)+SAH/ρ).
DivIn samples initial noise from a guidance potential posterior via Langevin dynamics to improve diversity in class-to-image and text-to-image generation.
Introduces interference-aware multi-task unlearning with task-aware gradient projection and instance-level gradient orthogonalization, reducing interference scores by 30.3% and 52.9% on vision benchmarks.
CoVUBench is the first benchmark framework for evaluating multimodal copyright unlearning in LVLMs via synthetic data, systematic variations, and a dual protocol for forgetting efficacy and utility preservation.
Second-order optimizers retain residual geometric memory in their state after unlearning that first-order metrics miss, and only controlled eigendecay perturbations fully erase it.
ZeroUnlearn reformulates machine unlearning as knowledge re-mapping via model editing, using multiplicative updates with closed-form solutions for efficient few-shot removal of sensitive representations while preserving utility.
REGLU guides LoRA-based unlearning via representation subspaces and orthogonal regularization to outperform prior methods on forget-retain trade-off in LLM benchmarks.
WIN-U delivers a retain-free unlearning update that approximates the gold-standard retrained model via a Woodbury-informed Newton step using only forget-set curvature information.
PrivEraserVerify unifies efficiency via adaptive checkpointing, privacy via layer-adaptive DP, and verifiability via fingerprints in federated unlearning, claiming 2-3x faster performance than retraining with formal guarantees.
Parameter-difference and model-inversion attacks can identify forgotten classes after machine unlearning on standard image datasets.
Jellyfish enables zero-shot federated unlearning through synthetic proxy data generation, channel-restricted knowledge disentanglement, and a composite loss with repair to forget target data while retaining model utility.
FIA uses contrastive concept saliency and temporal-spatial neuron identification to build unified masks that erase multiple target concepts while preserving general generation quality in diffusion models.
POUR derives a provably optimal forgetting operator by showing that orthogonal projections of simplex equiangular tight frames remain ETFs in lower dimensions, enabling representation-level unlearning with closed-form and distillation variants.
MCU applies mode connectivity to trace nonlinear unlearning pathways in parameter space, adds a parameter mask and adaptive penalty, and produces a range of unlearning models that plug into existing methods.
TOFU is a new benchmark with synthetic profiles and metrics demonstrating that existing unlearning algorithms for LLMs fail to achieve effective forgetting of targeted information.
SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
TrustErase uses passport-embedded representations for instant, data-free, and auditable machine unlearning through simple deactivation of adaptation layers.
Withdrawal rights paired with centralized cost-based assignment prevent subsidy waste by collecting data only when the improvement threshold is sustainably reachable, turning infeasible cases into null outcomes.
Presents a complete pipeline for federated unlearning that uses knowledge distillation for efficient removal and a GAN-integrated classifier to visually evaluate forgetting capacity.
Proposes a LoRA-based residual feature alignment method for efficient machine unlearning on pre-trained models, targeting zero residuals on retained data and shifted residuals on unlearned data.
AdaProb performs machine unlearning by substituting final-layer output probabilities with optimized uniform pseudo-probabilities and updating model weights.
Survey organizes LLM trustworthiness into seven categories and 29 sub-categories, measures eight sub-categories on popular models, and finds that more aligned models generally score higher but with varying effectiveness.
citing papers explorer
- Incentivizing User Data Contributions for LLM Improvement under Withdrawal Rights
Withdrawal rights paired with centralized cost-based assignment prevent subsidy waste by collecting data only when the improvement threshold is sustainably reachable, turning infeasible cases into null outcomes.
- Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Survey organizes LLM trustworthiness into seven categories and 29 sub-categories, measures eight sub-categories on popular models, and finds that more aligned models generally score higher but with varying effectiveness.