pith. sign in

arxiv: 2606.17122 · v1 · pith:OMVBWHPUnew · submitted 2026-06-15 · 💻 cs.CR · cs.AI· cs.LG

TrustErase: Auditable Instant Machine Unlearning with Passport-Embedded Representations

Pith reviewed 2026-06-27 03:22 UTC · model grok-4.3

classification 💻 cs.CR cs.AIcs.LG
keywords machine unlearningverifiable forgettingpassport embeddingdata-free methodsparameter efficient adaptationsingular value decompositionprivacy compliant AI
0
0 comments X

The pith

TrustErase enables instant machine unlearning by deactivating embedded passports in model weights without retraining or data access.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing machine unlearning methods require expensive retraining or distillation and lack verifiability. TrustErase embeds passports as cryptographic keys in parameter-efficient adaptation layers, hidden via singular value decomposition. Deactivating these passports removes specific classes or datasets instantly in a data-free manner. This provides auditable compliance with privacy requirements. Tests on MNIST, CIFAR10, and CIFAR100 demonstrate performance matching or exceeding prior approaches.

Core claim

By treating passports as cryptographic keys within parameter-efficient adaptation layers and concealing them in model weights with singular value based decomposition, TrustErase allows removal of specific classes or datasets through simple deactivation without retraining, fine-tuning, or access to the original data, while keeping unlearning actions transparent and provably compliant.

What carries the argument

Passport-embedded representations treated as cryptographic keys in parameter-efficient adaptation layers, concealed via singular value based decomposition.

If this is right

  • Unlearning actions become modular and can target specific classes or entire datasets independently.
  • No access to original training data is needed, enabling unlearning in restricted data environments.
  • Model utility is preserved at levels comparable to existing methods on image classification tasks.
  • Transparency of the deactivation process supports auditability and regulatory compliance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This method may allow unlearning in large-scale models where retraining is prohibitive.
  • Similar passport techniques could be adapted for other forms of model control beyond forgetting.
  • Integration with existing adaptation methods like adapters could broaden applicability.

Load-bearing premise

A singular value based decomposition can conceal passports within model weights such that deactivation achieves verifiable unlearning while preserving model utility and transparency.

What would settle it

If deactivating the passports fails to reduce accuracy on the target classes to near-random levels while keeping overall performance intact, the unlearning claim would not hold.

Figures

Figures reproduced from arXiv: 2606.17122 by Concetto Spampinato, Giovanni Bellitto, Leonardo G. Russo, Matteo Pennisi, Rutger Hendrix.

Figure 1
Figure 1. Figure 1: Overview of TrustErase for Auditable Machine Unlearning. Left: Multi-task training with task-specific passports Ci inserted between shared LoRA matrices A and B, enabling simulta￾neous optimization across multiple unlearning objectives. Middle: Post-training passport hiding via SVD decomposition (Ci = C1,iC2,i), where factors are absorbed into public weights B′ i = BC1,i and A′ i = C2,iA. Right: Audit prot… view at source ↗
Figure 2
Figure 2. Figure 2: Hypernetwork-based Passport Composition. Atomic passports are sampled from the power set and fed into a hypernetwork to synthesize composite passports. The hypernetwork is optimized end-to-end with frozen LoRA parameters and base passports. 6 [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Sensitivity to perturbations. The x-axis represents the magnitude of a random perturba￾tion ϵP added to the passport, and the y-axis shows the mean ℓ2 distance between the logits produced using the original and perturbed passports. To evaluate robustness, we perturb the passport matrix with random noise of magnitude ϵP and measure the resulting change in model logits ( [PITH_FULL_IMAGE:figures/full_fig_p0… view at source ↗
Figure 4
Figure 4. Figure 4: Overview of the proposed verification protocol for auditable unlearning. The protocol follows a trust-but-verify paradigm involving three entities: a Verification Authority (VA) that con￾ducts initial training and distributes secrets, trusted providers that perform model serving using desig￾nated passports, and users who request specific un￾learning tasks. The VA validates both the structural correctness o… view at source ↗
read the original abstract

The demand for privacy-compliant AI has amplified the need for machine unlearning; yet, existing retraining or distillation-based methods remain unverifiable and computationally costly. We introduce TrustErase, a verifiable, data-free unlearning framework leveraging passport-embedded representations for instant, modular, and auditable forgetting. By treating passports as cryptographic keys within parameter-efficient adaptation layers, TrustErase enables the removal of specific classes or datasets through simple deactivation, without retraining, fine-tuning, or access to the original data. A singular value based decomposition conceals passports within model weights, ensuring that unlearning actions remain transparent and provably compliant. Evaluations on MNIST, CIFAR10 and CIFAR100 show that TrustErase matches or exceeds state-of-the-art benchmarks such as DELETE, L2UL, and Boundary Shrink, while operating in a strictly data-free regime. Ultimately, TrustErase establishes a new paradigm for trustworthy, accountable, and instantly forgettable AI systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces TrustErase, a data-free machine unlearning framework that embeds 'passports' as cryptographic keys inside parameter-efficient adaptation layers via singular-value decomposition; deactivation of the passports is claimed to instantly remove the influence of specific classes or datasets while preserving utility, enabling auditability, and matching the performance of DELETE, L2UL, and Boundary Shrink on MNIST, CIFAR-10, and CIFAR-100.

Significance. If the SVD-based embedding were shown to isolate class-specific effects orthogonally and if deactivation were proven to constitute verifiable unlearning, the approach would offer a genuinely new, low-cost paradigm for auditable forgetting that avoids retraining or data access. The modular passport mechanism itself is a conceptually attractive idea that could be extended beyond the current setting.

major comments (2)
  1. [Technical Approach (no equation or lemma supplied)] The central technical claim—that an SVD-based concealment of passports inside model weights allows deactivation to nullify exactly the forgotten information while leaving task-relevant directions intact—receives no derivation or orthogonality argument anywhere in the manuscript. Without such an argument the equivalence between deactivation and unlearning remains an unproven assumption.
  2. [Abstract / Evaluation section] The abstract asserts that TrustErase 'matches or exceeds' DELETE, L2UL, and Boundary Shrink on MNIST/CIFAR-10/CIFAR-100, yet supplies neither quantitative metrics, error bars, verification procedure, nor dataset-split details. This absence makes the performance claim impossible to evaluate and directly undermines the empirical support for the method.
minor comments (1)
  1. [Method description] Notation for the passport embedding and the precise definition of the adaptation layer are introduced without a clear mathematical formulation or pseudocode, making the method difficult to reproduce from the text alone.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and will revise the paper to strengthen both the technical justification and the presentation of empirical results.

read point-by-point responses
  1. Referee: [Technical Approach (no equation or lemma supplied)] The central technical claim—that an SVD-based concealment of passports inside model weights allows deactivation to nullify exactly the forgotten information while leaving task-relevant directions intact—receives no derivation or orthogonality argument anywhere in the manuscript. Without such an argument the equivalence between deactivation and unlearning remains an unproven assumption.

    Authors: We acknowledge that the manuscript presents the SVD-based passport embedding at a conceptual level without supplying a formal derivation or orthogonality argument. In the revised version we will add a dedicated technical subsection that derives the embedding procedure, shows that the singular vectors associated with the passports lie in directions orthogonal to the principal task-relevant components (via the properties of SVD), and provides lemmas establishing that deactivation removes exactly the class-specific influence while leaving utility intact. This will make the equivalence between deactivation and unlearning explicit rather than assumed. revision: yes

  2. Referee: [Abstract / Evaluation section] The abstract asserts that TrustErase 'matches or exceeds' DELETE, L2UL, and Boundary Shrink on MNIST/CIFAR-10/CIFAR-100, yet supplies neither quantitative metrics, error bars, verification procedure, nor dataset-split details. This absence makes the performance claim impossible to evaluate and directly undermines the empirical support for the method.

    Authors: The referee is correct that the abstract states the performance claim without accompanying quantitative details. Although the experiments section of the manuscript reports the full comparisons (including accuracy, forgetting metrics, and standard deviations across the three datasets), the abstract itself does not. We will revise the abstract to include representative numerical results with error bars and will ensure the evaluation section explicitly states the verification procedure and dataset splits used. revision: yes

Circularity Check

0 steps flagged

No circularity detected; derivation chain absent from text

full rationale

The abstract and full-text placeholder supply no equations, parameter-fitting steps, self-citations, or uniqueness theorems. The SVD concealment of passports is asserted as enabling verifiable deactivation, yet no derivation, ansatz, or reduction to prior inputs is exhibited that could be inspected for self-definition or fitted-input prediction. Without any load-bearing mathematical chain, the paper remains self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entities

Abstract-only review provides no explicit free parameters, axioms, or detailed invented entities beyond the high-level mention of passports; no independent evidence for the passports is described.

invented entities (1)
  • passport-embedded representations no independent evidence
    purpose: Enable instant deactivation-based unlearning while concealing information in weights
    Introduced as a core novel component for verifiable forgetting

pith-pipeline@v0.9.1-grok · 5705 in / 1066 out tokens · 42349 ms · 2026-06-27T03:22:25.667490+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

43 extracted references · 8 canonical work pages · 4 internal anchors

  1. [1]

    Regulation (eu) 2016/679 of the european parliament and of the council of 27 april 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing directive 95/46/ec (general data protection regulation) (text with eea relevance), May 2016

  2. [2]

    The economics of personal data and the economics of privacy

    Alessandro Acquisti et al. The economics of personal data and the economics of privacy. Background Paper for OECD Joint WPISP-WPIE Roundtable, 1:50, 2010

  3. [3]

    Artificial intelligence risk management framework (ai rmf 1.0).URL: https://nvlpubs

    NIST AI. Artificial intelligence risk management framework (ai rmf 1.0).URL: https://nvlpubs. nist. gov/nistpubs/ai/nist. ai, pages 100–1, 2023

  4. [4]

    Machine unlearning: Linear filtration for logit-based classifiers.Machine Learning, 111(9):3203–3226, 2022

    Thomas Baumhauer, Pascal Schöttle, and Matthias Zeppelzauer. Machine unlearning: Linear filtration for logit-based classifiers.Machine Learning, 111(9):3203–3226, 2022

  5. [5]

    Machine unlearning

    Lucas Bourtoule, Varun Chandrasekaran, Christopher A Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, and Nicolas Papernot. Machine unlearning. In2021 IEEE symposium on security and privacy (SP), pages 141–159. IEEE, 2021

  6. [6]

    Towards making systems forget with machine unlearning

    Yinzhi Cao and Junfeng Yang. Towards making systems forget with machine unlearning. In 2015 IEEE symposium on security and privacy, pages 463–480. IEEE, 2015. 10

  7. [7]

    Extracting training data from large language models

    Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-V oss, Kather- ine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, et al. Extracting training data from large language models. In30th USENIX security symposium (USENIX Security 21), pages 2633–2650, 2021

  8. [8]

    Learning to unlearn: Instance-wise unlearning for pre-trained classifiers

    Sungmin Cha, Sungjun Cho, Dasol Hwang, Honglak Lee, Taesup Moon, and Moontae Lee. Learning to unlearn: Instance-wise unlearning for pre-trained classifiers. InProceedings of the AAAI conference on artificial intelligence, volume 38, pages 11186–11194, 2024

  9. [9]

    Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary

    Min Chen, Weizhuo Gao, Gaoyang Liu, Kai Peng, and Chen Wang. Boundary unlearning: Rapid forgetting of deep networks via shifting the decision boundary. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7766–7775, 2023

  10. [10]

    Can bad teaching induce forgetting? unlearning in deep networks using an incompetent teacher

    Vikram S Chundawat, Ayush K Tarun, Murari Mandal, and Mohan Kankanhalli. Can bad teaching induce forgetting? unlearning in deep networks using an incompetent teacher. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 7210–7217, 2023

  11. [11]

    Excavating ai: The politics of images in machine learning training sets.Ai & Society, 36(4):1105–1116, 2021

    Kate Crawford and Trevor Paglen. Excavating ai: The politics of images in machine learning training sets.Ai & Society, 36(4):1105–1116, 2021

  12. [12]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Alexey Dosovitskiy. An image is worth 16x16 words: Transformers for image recognition at scale.arXiv preprint arXiv:2010.11929, 2020

  13. [13]

    SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

    Chongyu Fan, Jiancheng Liu, Yihua Zhang, Eric Wong, Dennis Wei, and Sijia Liu. Salun: Empowering machine unlearning via gradient-based weight saliency in both image classification and generation.arXiv preprint arXiv:2310.12508, 2023

  14. [14]

    Rethinking deep neural network ownership verification: Embedding passports to defeat ambiguity attacks.Advances in neural information processing systems, 32, 2019

    Lixin Fan, Kam Woh Ng, and Chee Seng Chan. Rethinking deep neural network ownership verification: Embedding passports to defeat ambiguity attacks.Advances in neural information processing systems, 32, 2019

  15. [15]

    Fast machine unlearning without retraining through selective synaptic dampening

    Jack Foster, Stefan Schoepf, and Alexandra Brintrup. Fast machine unlearning without retraining through selective synaptic dampening. InProceedings of the AAAI conference on artificial intelligence, volume 38, pages 12043–12051, 2024

  16. [16]

    Erasing concepts from diffusion models

    Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, and David Bau. Erasing concepts from diffusion models. InProceedings of the IEEE/CVF international conference on computer vision, pages 2426–2436, 2023

  17. [17]

    Stochastic first-and zeroth-order methods for nonconvex stochastic programming.SIAM journal on optimization, 23(4):2341–2368, 2013

    Saeed Ghadimi and Guanghui Lan. Stochastic first-and zeroth-order methods for nonconvex stochastic programming.SIAM journal on optimization, 23(4):2341–2368, 2013

  18. [18]

    Making ai forget you: Data deletion in machine learning.Advances in neural information processing systems, 32, 2019

    Antonio Ginart, Melody Guan, Gregory Valiant, and James Y Zou. Making ai forget you: Data deletion in machine learning.Advances in neural information processing systems, 32, 2019

  19. [19]

    Forgetting outside the box: Scrub- bing deep networks of information accessible from input-output observations

    Aditya Golatkar, Alessandro Achille, and Stefano Soatto. Forgetting outside the box: Scrub- bing deep networks of information accessible from input-output observations. InEuropean Conference on Computer Vision, pages 383–398. Springer, 2020

  20. [20]

    Amnesiac machine learning

    Laura Graves, Vineel Nagisetty, and Vijay Ganesh. Amnesiac machine learning. InProceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 11516–11524, 2021

  21. [21]

    Certified data removal from machine learning models.arXiv preprint arXiv:1911.03030, 2019

    Chuan Guo, Tom Goldstein, Awni Hannun, and Laurens Van Der Maaten. Certified data removal from machine learning models.arXiv preprint arXiv:1911.03030, 2019

  22. [22]

    Dai, and Quoc V

    David Ha, Andrew M. Dai, and Quoc V . Le. Hypernetworks. InInternational Conference on Learning Representations, 2017. URLhttps://openreview.net/forum?id=rkpACe1lx

  23. [23]

    Russo, Simone Carnemolla, Federica Proi- etto Salanitri, Giovanni Bellitto, Concetto Spampinato, and Matteo Pennisi

    Rutger Hendrix, Giovanni Patanè, Leonardo G. Russo, Simone Carnemolla, Federica Proi- etto Salanitri, Giovanni Bellitto, Concetto Spampinato, and Matteo Pennisi. Pre-forgettable models: Prompt learning as a native mechanism for unlearning. InACM International Confer- ence on Multimedia (ACM MM), Dublin, Ireland, 2025. doi: 10.1145/3746027.3758171. 11

  24. [24]

    Distilling the Knowledge in a Neural Network

    Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. In arXiv preprint arXiv:1503.02531, 2015

  25. [25]

    Lora: Low-rank adaptation of large language models.ICLR, 1 (2):3, 2022

    Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, et al. Lora: Low-rank adaptation of large language models.ICLR, 1 (2):3, 2022

  26. [26]

    Editing Models with Task Arithmetic

    Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, and Ali Farhadi. Editing models with task arithmetic.arXiv preprint arXiv:2212.04089, 2022

  27. [27]

    Conflict-averse gradient descent for multi-task learning.Advances in neural information processing systems, 34:18878–18890, 2021

    Bo Liu, Xingchao Liu, Xiaojie Jin, Peter Stone, and Qiang Liu. Conflict-averse gradient descent for multi-task learning.Advances in neural information processing systems, 34:18878–18890, 2021

  28. [28]

    Parameter-efficient multi-task fine-tuning for transformers via shared hypernetworks

    Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, and James Henderson. Parameter-efficient multi-task fine-tuning for transformers via shared hypernetworks. InPro- ceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers),...

  29. [29]

    Certifiable machine unlearning for linear models.arXiv preprint arXiv:2106.15093, 2021

    Ananth Mahadevan and Michael Mathioudakis. Certifiable machine unlearning for linear models.arXiv preprint arXiv:2106.15093, 2021

  30. [30]

    Descent-to-delete: Gradient-based methods for machine unlearning

    Seth Neel, Aaron Roth, and Saeed Sharifi-Malvajerdi. Descent-to-delete: Gradient-based methods for machine unlearning. InAlgorithmic Learning Theory, pages 931–962. PMLR, 2021

  31. [31]

    A survey of machine unlearning.ACM Transactions on Intelligent Systems and Technology, 16(5):1–46, 2025

    Thanh Tam Nguyen, Thanh Trung Huynh, Zhao Ren, Phi Le Nguyen, Alan Wee-Chung Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. A survey of machine unlearning.ACM Transactions on Intelligent Systems and Technology, 16(5):1–46, 2025

  32. [32]

    Seal: Entangled white-box watermarks on low-rank adaptation.arXiv preprint arXiv:2501.09284, 2025

    Giyeong Oh, Saejin Kim, Woohyun Cho, Sangkyu Lee, Jiwan Chung, Dokyung Song, and Youngjae Yu. Seal: Entangled white-box watermarks on low-rank adaptation.arXiv preprint arXiv:2501.09284, 2025

  33. [33]

    Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models

    Patrick Schramowski, Manuel Brack, Björn Deiseroth, and Kristian Kersting. Safe latent diffusion: Mitigating inappropriate degeneration in diffusion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22522–22531, 2023

  34. [34]

    Laion- 5b: An open large-scale dataset for training next generation image-text models.Advances in neural information processing systems, 35:25278–25294, 2022

    Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. Laion- 5b: An open large-scale dataset for training next generation image-text models.Advances in neural information processing systems, 35:25278–25294, 2022

  35. [35]

    Multi-task learning as multi-objective optimization.Advances in neural information processing systems, 31, 2018

    Ozan Sener and Vladlen Koltun. Multi-task learning as multi-objective optimization.Advances in neural information processing systems, 31, 2018

  36. [36]

    Glaze: Protecting artists from style mimicry by {Text-to-Image} models

    Shawn Shan, Jenna Cryan, Emily Wenger, Haitao Zheng, Rana Hanocka, and Ben Y Zhao. Glaze: Protecting artists from style mimicry by {Text-to-Image} models. In32nd USENIX Security Symposium (USENIX Security 23), pages 2187–2204, 2023

  37. [37]

    Membership inference attacks against machine learning models

    Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In2017 IEEE symposium on security and privacy (SP), pages 3–18. IEEE, 2017

  38. [38]

    Fast yet effective machine unlearning.IEEE transactions on neural networks and learning systems, 35 (9):13046–13055, 2023

    Ayush K Tarun, Vikram S Chundawat, Murari Mandal, and Mohan Kankanhalli. Fast yet effective machine unlearning.IEEE transactions on neural networks and learning systems, 35 (9):13046–13055, 2023

  39. [39]

    Unrolling sgd: Understanding factors influencing machine unlearning

    Anvith Thudi, Gabriel Deza, Varun Chandrasekaran, and Nicolas Papernot. Unrolling sgd: Understanding factors influencing machine unlearning. In2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), pages 303–319. IEEE, 2022. 12

  40. [40]

    Deltagrad: Rapid retraining of machine learning models

    Yinjun Wu, Edgar Dobriban, and Susan Davidson. Deltagrad: Rapid retraining of machine learning models. InInternational Conference on Machine Learning, pages 10355–10366. PMLR, 2020

  41. [41]

    Ties-merging: Resolving interference when merging models.Advances in neural information processing systems, 36:7093–7115, 2023

    Prateek Yadav, Derek Tam, Leshem Choshen, Colin A Raffel, and Mohit Bansal. Ties-merging: Resolving interference when merging models.Advances in neural information processing systems, 36:7093–7115, 2023

  42. [42]

    Gradient surgery for multi-task learning.Advances in neural information processing systems, 33:5824–5836, 2020

    Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, and Chelsea Finn. Gradient surgery for multi-task learning.Advances in neural information processing systems, 33:5824–5836, 2020

  43. [43]

    Decoupled distillation to erase: A general unlearning method for any class-centric tasks

    Yu Zhou, Dian Zheng, Qijie Mo, Renjie Lu, Kun-Yu Lin, and Wei-Shi Zheng. Decoupled distillation to erase: A general unlearning method for any class-centric tasks. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 20350–20359, 2025. 13 Appendix for TrustErase A Experimental Setup A.1 Training details and configurations Architec...