{"paper":{"title":"MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models","license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","headline":"MPU lets clients unlearn LLM knowledge without exposing the forget set or the original model parameters.","cross_cats":["cs.AI","cs.CR","cs.DC"],"primary_cat":"cs.LG","authors_text":"Pengjun Xie, Tiantong Wang, Tiantong Wu, Wei Yang Bryan Lim, Xinyu Yan, Yurong Hao","submitted_at":"2026-02-27T08:39:36Z","abstract_excerpt":"Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set. To address this dual non-disclosure constraint, we propose MPU, an algorithm-agnostic privacy-preserving Multiple Perturbed Copies Unlearning framework that primarily introduces two server-side modules: Pre-Process for randomized copy generation and Post-Process for update aggregation. In Pre-Process, the server distributes multiple perturbed and reparameterized model instances, allowing the client to execute unlearni"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"MPU achieves comparable unlearning performance to noise-free baselines, with most algorithms' average degradation well below 1% up to 10% noise, and can even outperform the noise-free baseline for some algorithms under 1% noise.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the harmonic denoising procedure in Post-Process sufficiently removes perturbation effects without introducing new biases or security vulnerabilities that would be visible only under stronger adversarial analysis than the reported experiments.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"MPU is a framework that achieves privacy-preserving unlearning for LLMs by distributing perturbed model copies for local client-side unlearning followed by server-side aggregation with harmonic denoising.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"MPU lets clients unlearn LLM knowledge without exposing the forget set or the original model parameters.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"5a6c7720a43522bc677bb344b34bf332484f95da9a63fbfd7adb2b09e37b652a"},"source":{"id":"2602.23798","kind":"arxiv","version":2},"verdict":{"id":"46c1a69b-e554-4052-8ea8-7f4d3e5692af","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T18:45:17.308762Z","strongest_claim":"MPU achieves comparable unlearning performance to noise-free baselines, with most algorithms' average degradation well below 1% up to 10% noise, and can even outperform the noise-free baseline for some algorithms under 1% noise.","one_line_summary":"MPU is a framework that achieves privacy-preserving unlearning for LLMs by distributing perturbed model copies for local client-side unlearning followed by server-side aggregation with harmonic denoising.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the harmonic denoising procedure in Post-Process sufficiently removes perturbation effects without introducing new biases or security vulnerabilities that would be visible only under stronger adversarial analysis than the reported experiments.","pith_extraction_headline":"MPU lets clients unlearn LLM knowledge without exposing the forget set or the original model parameters."},"references":{"count":18,"sample":[{"doi":"10.3115/v1/w14-4012","year":2025,"title":"On the properties of neural machine translation: Encoder-decoder approaches","work_id":"62cd99e9-1990-4d03-bfe7-00e9b69af44b","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"OpenUnlearning: Accelerating LLM unlearning via unified benchmarking of methods and metrics","work_id":"59905b5b-83cd-476d-8cf6-82ef102a5ff3","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Chongyang Gao, Lixu Wang, Kaize Ding, Chenkai Weng, Xiao Wang, and Qi Zhu","work_id":"372e1066-40c7-4da3-97bd-755328c1de7a","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"The Llama 3 Herd of Models","work_id":"1549a635-88af-4ac1-acfe-51ae7bb53345","ref_index":4,"cited_arxiv_id":"2407.21783","is_internal_anchor":true},{"doi":"","year":2025,"title":"Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate","work_id":"1fbe461d-8290-4581-ae70-82b5143e80b2","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":18,"snapshot_sha256":"fc84825712155a58f192834ee40d69733a711ee5c240ae563137e9aa889cbb4e","internal_anchors":3},"formal_canon":{"evidence_count":2,"snapshot_sha256":"9e48548268a0cc31391c331bffb2b4f3d801f1cb06432db9ef0269f17fa4f6aa"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}