Towards Certified Unlearning for Deep Neural Networks

Binchi Zhang; Jundong Li; Tianhao Wang; Yushun Dong

arxiv: 2408.00920 · v4 · submitted 2024-08-01 · 💻 cs.LG · stat.ML

Towards Certified Unlearning for Deep Neural Networks

Binchi Zhang , Yushun Dong , Tianhao Wang , Jundong Li This is my paper

Pith reviewed 2026-05-23 21:57 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords machine unlearningcertified unlearningdeep neural networksnonconvex optimizationHessian approximationdata privacysequential unlearning

0 comments

The pith

Certified unlearning methods can be extended to deep neural networks using simple techniques and inverse Hessian approximations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to adapt certified unlearning, which offers efficiency and strong guarantees for convex models, to deep neural networks whose loss landscapes are highly nonconvex. It introduces straightforward adjustments for nonconvex objectives, replaces exact Hessian computations with an inverse approximation that keeps guarantees intact, and broadens the framework to cases where training has not converged and to sequences of unlearning requests arriving at different times. A sympathetic reader would care because this would let model owners remove specific training examples from DNNs with verifiable privacy assurances rather than relying on heuristic retraining.

Core claim

We propose several simple techniques to extend certified unlearning methods to nonconvex objectives. To reduce the time complexity, we develop an efficient computation method by inverse Hessian approximation without compromising certification guarantees. In addition, we extend our discussion of certification to nonconvergence training and sequential unlearning, considering that real-world users can send unlearning requests at different time points.

What carries the argument

Inverse Hessian approximation that replaces exact second-order computations while preserving the original certification bounds for nonconvex objectives.

If this is right

Certified unlearning becomes feasible for DNNs with the same theoretical removal guarantees previously limited to convex models.
Computation of the unlearning update becomes practical through the inverse Hessian approximation without weakening the guarantee.
The certification continues to hold when training stops before full convergence.
Sequential unlearning requests arriving over time can each be certified independently.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could support regulatory compliance for data-deletion requests in deployed neural-network services.
Approximation strategies of this kind may prove useful for certifying other post-training operations on nonconvex models.
Validation on larger-scale architectures would test whether the efficiency gains scale with model size.

Load-bearing premise

The simple extensions and inverse Hessian approximation preserve the certification guarantees when applied to the nonconvex loss landscapes of deep neural networks.

What would settle it

A DNN experiment in which the certified unlearning bound is violated, shown by the retained model still achieving lower loss or higher accuracy on the removed data points than the certified threshold allows.

Figures

Figures reproduced from arXiv: 2408.00920 by Binchi Zhang, Jundong Li, Tianhao Wang, Yushun Dong.

**Figure 1.** Figure 1: Illustration of certified unlearning, where the first step is to estimate the retrained model based on the original model, and the second step is to add noise to it. According to Theorem 2.3, we can guarantee the difference in distributions between the unlearned model and the retrained model is bounded by certification budgets. literature (Balle & Wang, 2018) in differential privacy, we notice that our cer… view at source ↗

**Figure 2.** Figure 2: Comparison of unlearning time between the certified unlearning method and unlearning baselines over three popular DNNs across three datasets. method with unlearning baselines. (1). The micro F1-score of the certified unlearning method on the unlearned set Du is closest to the retrained model for most cases. (2). The micro F1-score of the certified unlearning method on the retained set Dr and the test set D… view at source ↗

**Figure 3.** Figure 3: Gradient norm, approximation error bound, and model utility after each unlearning step. 0 2 4 6 Certification Level ( ) 1e3 20 40 60 80 Test F1-Score / % = 1e 3 = 1e 2 = 1e 1 (a) Test F1-ε curve 0.00 0.25 0.50 0.75 1.00 1.25 Certification Level ( ) 0 20 40 60 80 Test F1-Score / % = 1e 3 = 1e 2 = 1e 1 (b) Test F1-δ curve [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗

**Figure 4.** Figure 4: The effect of local convex coefficient λ and certification budget ε and δ over the MLP backbone on MNIST. dard deviations. We repeat the experiment with different values of the local convex coefficient λ and present the results in [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗

read the original abstract

In the field of machine unlearning, certified unlearning has been extensively studied in convex machine learning models due to its high efficiency and strong theoretical guarantees. However, its application to deep neural networks (DNNs), known for their highly nonconvex nature, still poses challenges. To bridge the gap between certified unlearning and DNNs, we propose several simple techniques to extend certified unlearning methods to nonconvex objectives. To reduce the time complexity, we develop an efficient computation method by inverse Hessian approximation without compromising certification guarantees. In addition, we extend our discussion of certification to nonconvergence training and sequential unlearning, considering that real-world users can send unlearning requests at different time points. Extensive experiments on three real-world datasets demonstrate the efficacy of our method and the advantages of certified unlearning in DNNs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper tries to certify unlearning on DNNs with inverse Hessian approximation and sequential handling, but nonconvexity likely breaks the guarantees.

read the letter

Hey, the main takeaway on this one is they want to do certified unlearning on DNNs by approximating the inverse Hessian and adding tricks for non-convex objectives plus sequential requests, but I doubt the certs survive the nonconvexity. They do a decent job laying out the practical problems like nonconvergent training and multiple requests over time. The experiments on real datasets give some sense that the method runs and has advantages. What they do well is addressing the sequential case, which most prior work skips. The weak part is the theory. The stress test note is right: without convexity, the usual Taylor or influence function bounds don't hold, and there's no sign they replaced them with something that works for indefinite Hessians. The claim of 'without compromising certification guarantees' needs a lot more justification than the abstract provides. This would be worth bringing to a reading group if you're into unlearning or privacy for neural nets. A reader could pull the experimental setup or the sequential handling idea. I'd say send it for review. The area needs work, and this tries to fill a gap even if the guarantees need fixing.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes several simple techniques to extend certified unlearning methods from convex models to nonconvex deep neural networks, develops an efficient inverse Hessian approximation for computation that claims to preserve certification guarantees, and extends the certification discussion to nonconvergent training and sequential unlearning requests. Experiments on three real-world datasets are used to demonstrate efficacy and advantages over prior approaches.

Significance. If the nonconvex extensions and inverse Hessian approximation are shown to rigorously preserve the certification bounds, the work would meaningfully advance certified unlearning toward practical DNN settings, where strong convexity assumptions fail. The sequential and nonconvergence extensions address realistic deployment scenarios. The paper does not ship machine-checked proofs or parameter-free derivations, so the significance hinges on the correctness of the new error analysis.

major comments (2)

[Method section on inverse Hessian approximation] The central claim that the inverse Hessian approximation extends the guarantees 'without compromising certification guarantees' (abstract) is load-bearing. Standard certified unlearning derivations (typically via first-order Taylor or influence functions) require the Hessian to be positive definite and the objective strongly convex to bound the parameter displacement after data removal. The manuscript must provide an explicit nonconvex error bound or local strong-convexity assumption that survives the approximation; without it, the guarantee does not transfer.
[Section deriving the nonconvex extension] The 'simple techniques' for nonconvex objectives must be shown to replace the convexity-dependent terms in the certification proof with quantities that remain valid for DNN loss landscapes (e.g., via Lipschitz-gradient or local-convexity assumptions). If the proof still invokes positive-definiteness or strong convexity after the extension, the claim is internally inconsistent with the nonconvex setting.

minor comments (2)

Notation for the approximated inverse Hessian and the resulting certification radius should be introduced with explicit dependence on the approximation error term.
The experimental section should report the empirical certification radius (or failure rate under the certified bound) alongside accuracy to allow direct assessment of the guarantee tightness.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on the theoretical foundations of our nonconvex extensions and inverse Hessian approximation. The points raised about preserving certification guarantees are important, and we address them directly below. We agree that explicit clarification of the error bounds is warranted and will revise the manuscript to strengthen these aspects.

read point-by-point responses

Referee: [Method section on inverse Hessian approximation] The central claim that the inverse Hessian approximation extends the guarantees 'without compromising certification guarantees' (abstract) is load-bearing. Standard certified unlearning derivations (typically via first-order Taylor or influence functions) require the Hessian to be positive definite and the objective strongly convex to bound the parameter displacement after data removal. The manuscript must provide an explicit nonconvex error bound or local strong-convexity assumption that survives the approximation; without it, the guarantee does not transfer.

Authors: We acknowledge the referee's concern that standard derivations rely on strong convexity. Our inverse Hessian approximation incorporates an additive error term derived from the approximation residual, which is bounded using the gradient Lipschitz constant (a local smoothness property that holds in nonconvex DNN landscapes without global strong convexity). This term is folded into the overall certification bound on parameter displacement. We will add an explicit lemma in the revised Method section stating the nonconvex error bound under local Lipschitz-gradient assumptions to make the transfer of guarantees fully rigorous. revision: yes
Referee: [Section deriving the nonconvex extension] The 'simple techniques' for nonconvex objectives must be shown to replace the convexity-dependent terms in the certification proof with quantities that remain valid for DNN loss landscapes (e.g., via Lipschitz-gradient or local-convexity assumptions). If the proof still invokes positive-definiteness or strong convexity after the extension, the claim is internally inconsistent with the nonconvex setting.

Authors: The simple techniques replace the global strong-convexity parameter with a local curvature estimate extracted from the Hessian at the trained parameters, combined with gradient-Lipschitz bounds on the displacement. The nonconvex extension section already avoids invoking global positive-definiteness; instead it uses these local quantities. To address the referee's point, we will revise the derivation to explicitly flag each replaced term and restate the proof under the Lipschitz-gradient assumption, ensuring internal consistency with the nonconvex setting. revision: yes

Circularity Check

0 steps flagged

No significant circularity; proposals are algorithmic extensions rather than self-referential derivations.

full rationale

The paper proposes new techniques for extending certified unlearning to nonconvex DNN objectives and an inverse-Hessian approximation for efficiency, with claims that guarantees are preserved. No load-bearing derivation step reduces a claimed result to a fitted parameter, self-citation chain, or input by construction. The central contributions are the proposed methods themselves, which are presented as independent algorithmic changes rather than re-derivations of prior results. The abstract and described structure contain no equations or proofs that equate outputs to inputs tautologically, and external benchmarks or assumptions are not invoked in a self-referential manner. This is the expected outcome for a methods-focused extension paper whose claims rest on the novelty of the techniques rather than closed-loop mathematics.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only view yields no explicit free parameters, axioms, or invented entities; the central claim implicitly rests on the unstated assumption that nonconvexity can be handled by simple extensions without breaking certification.

pith-pipeline@v0.9.0 · 5667 in / 974 out tokens · 22813 ms · 2026-05-23T21:57:47.339048+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Theorem 3.4... ∥w̃−w̃∗∥₂≤2C(MC+λ)/(λ+λ_min)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

WIN-U: Woodbury-Informed Newton-Unlearning as a retain-free Machine Unlearning Framework
cs.LG 2026-04 unverdicted novelty 6.0

WIN-U delivers a retain-free unlearning update that approximates the gold-standard retrained model via a Woodbury-informed Newton step using only forget-set curvature information.

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

Second-Order Stochastic Optimization for Machine Learning in Linear Time

Agarwal, N., Bullins, B., and Hazan, E. Second-order stochastic optimization for machine learning in linear time. arXiv preprint arXiv:1602.03943,

work page internal anchor Pith review Pith/arXiv arXiv
[2]

and Liebig, T

Becker, A. and Liebig, T. Certified data removal in sum- product networks. arXiv preprint arXiv:2210.01451 ,

work page arXiv
[3]

A., Jia, H., Travers, A., Zhang, B., Lie, D., and Papernot, N

Bourtoule, L., Chandrasekaran, V ., Choquette-Choo, C. A., Jia, H., Travers, A., Zhang, B., Lie, D., and Papernot, N. Machine unlearning. In 2021 IEEE Symposium on Security and Privacy (SP), pp. 141–159,

work page 2021
[4]

and Yang, J

Cao, Y . and Yang, J. Towards making systems forget with machine unlearning. In 2015 IEEE symposium on security and privacy, pp. 463–480,

work page 2015
[5]

Chen, M., Zhang, Z., Wang, T., Backes, M., Humbert, M., and Zhang, Y

URLhttps: //oag.ca.gov/privacy/ccpa. Chen, M., Zhang, Z., Wang, T., Backes, M., Humbert, M., and Zhang, Y . When machine unlearning jeopardizes privacy. In Proceedings of the 2021 ACM SIGSAC Con- ference on Computer and Communications Security, pp. 896–911,

work page 2021
[6]

Graph unlearning

Chen, M., Zhang, Z., Wang, T., Backes, M., Humbert, M., and Zhang, Y . Graph unlearning. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pp. 499–513,

work page 2022
[7]

and Grimmer, B

Davis, D. and Grimmer, B. Proximally guided stochastic subgradient method for nonsmooth, nonconvex problems. SIAM Journal on Optimization, 29(3):1908–1930,

work page 1908
[8]

Idea: A flexible framework of certified unlearning for graph neural networks

Dong, Y ., Zhang, B., Lei, Z., Zou, N., and Li, J. Idea: A flexible framework of certified unlearning for graph neural networks. arXiv preprint arXiv:2407.19398,

work page arXiv
[9]

A., Chaudhuri, K., and Zou, J

Izzo, Z., Smart, M. A., Chaudhuri, K., and Zou, J. Approx- imate data deletion from machine learning models. In International Conference on Artificial Intelligence and Statistics, pp. 2008–2016,

work page 2008
[10]

Certifiable machine unlearning for linear models

Mahadevan, A. and Mathioudakis, M. Certifiable ma- chine unlearning for linear models. arXiv preprint arXiv:2106.15093,

work page arXiv
[11]

An introduction to machine unlearning

Mercuri, S., Khraishi, R., Okhrati, R., Batra, D., Hamill, C., Ghasempour, T., and Nowlan, A. An introduction to machine unlearning. arXiv preprint arXiv:2209.00939,

work page arXiv
[12]

A survey of machine unlearning

Nguyen, T. T., Huynh, T. T., Nguyen, P. L., Liew, A. W.-C., Yin, H., and Nguyen, Q. V . H. A survey of machine unlearning. arXiv preprint arXiv:2209.02299,

work page arXiv
[13]

Unlearning graph classifiers with limited data resources

Pan, C., Chien, E., and Milenkovic, O. Unlearning graph classifiers with limited data resources. In Proceedings of the ACM Web Conference 2023, pp. 716–726,

work page 2023
[14]

Gif: A general graph unlearning strategy via influence function

Wu, J., Yang, Y ., Qian, Y ., Sui, Y ., Wang, X., and He, X. Gif: A general graph unlearning strategy via influence function. In Proceedings of the ACM Web Conference 2023, pp. 651–661, 2023a. Wu, K., Shen, J., Ning, Y ., Wang, T., and Wang, W. H. Certified edge unlearning for graph neural networks. In Proceedings of the 29th ACM SIGKDD Conference on Know...

work page 2023
[15]

Proofs A.1

12 Towards Certified Unlearning for Deep Neural Networks A. Proofs A.1. Proof of Proposition 2.2 Proof. Let the unlearning process be an identical map in terms of the model, i.e., U(D, Du, A(D)) = A(D). Since A is a differentially private algorithm, we have ∀ T ⊆ H , Pr(A(D) ∈ T ) ≤ eεPr(A(Du) ∈ T ) + δ. (12) Similarly, we also have ∀ T ⊆ H , Pr(A(Du) ∈ T...

work page 2021
[16]

(28) Consequently, we have E " ˜H −1 ∞,λ H # =E[(Hw∗ + λI)−1]

We then compute the limit for both sides of Equation (27) and have E[ ˜H −1 ∞,λ] =I + E[ ˜H −1 ∞,λ] − 1 H E[Hw∗ + λI]E[ ˜H −1 ∞,λ]. (28) Consequently, we have E " ˜H −1 ∞,λ H # =E[(Hw∗ + λI)−1]. (29) Hence, we have proved that ˜H −1 s,λ H is an asymptotic unbiased estimator of the inverse Hessian (Hw∗ + λI)−1. A.6. Proof of Theorem 3.6 Proof. Following th...

work page 2016
[17]

All experiments are conducted based on three real-world datasets: MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), and SVHN (Netzer et al., 2011)

We ran all experiments on an Nvidia RTX A6000 GPU. All experiments are conducted based on three real-world datasets: MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), and SVHN (Netzer et al., 2011). All datasets are publicly accessible (MNIST with GNU General Public License, CIFAR-10 with MIT License, and SVHN with CC BY-NC License). We repo...

work page 1998
[18]

• L-CODEC: size of unlearned set: 1,000; number of perturbations: 25; Hessian type: Sekhari; ε: 100; δ: 0.1; ℓ-2 regularization: 5e−4

• Fisher forgetting: size of unlearned set: 1,000; α: 1e−6 for MLP, 1e−8 for AllCNN and ResNet. • L-CODEC: size of unlearned set: 1,000; number of perturbations: 25; Hessian type: Sekhari; ε: 100; δ: 0.1; ℓ-2 regularization: 5e−4. • Certified unlearning: size of unlearned set: 1,000; number of recursion s: 1,000; standard deviation σ: 1e−2 for MLP, 1e−3 f...

work page 2017
[19]

In this case, the certification requires adding a larger noise to hide the (overestimated) remaining information of the unlearned data

When λ < 12.11, the computed error bound is still valid (from Table 3, as the value of λ decreases, the computed error bound increases correspondingly, indicating that a smaller choice of λ can still lead to a valid but larger error bound). In this case, the certification requires adding a larger noise to hide the (overestimated) remaining information of ...

work page 2020
[20]

We record the micro F1-score of the predictions on the unlearned set Du, retained set Dr, and test set Dt

Comparison of accuracy among three advanced unlearning baselines over three popular DNNs across three real-world datasets. We record the micro F1-score of the predictions on the unlearned set Du, retained set Dr, and test set Dt. Method MLP & MNIST AllCNN & CIFAR-10 ResNet18 & SVHN F1 on Du F1 on Dr F1 on Dt F1 on Du F1 on Dr F1 on Dt F1 on Du F1 on Dr F1...

work page 2023

[1] [1]

Second-Order Stochastic Optimization for Machine Learning in Linear Time

Agarwal, N., Bullins, B., and Hazan, E. Second-order stochastic optimization for machine learning in linear time. arXiv preprint arXiv:1602.03943,

work page internal anchor Pith review Pith/arXiv arXiv

[2] [2]

and Liebig, T

Becker, A. and Liebig, T. Certified data removal in sum- product networks. arXiv preprint arXiv:2210.01451 ,

work page arXiv

[3] [3]

A., Jia, H., Travers, A., Zhang, B., Lie, D., and Papernot, N

Bourtoule, L., Chandrasekaran, V ., Choquette-Choo, C. A., Jia, H., Travers, A., Zhang, B., Lie, D., and Papernot, N. Machine unlearning. In 2021 IEEE Symposium on Security and Privacy (SP), pp. 141–159,

work page 2021

[4] [4]

and Yang, J

Cao, Y . and Yang, J. Towards making systems forget with machine unlearning. In 2015 IEEE symposium on security and privacy, pp. 463–480,

work page 2015

[5] [5]

Chen, M., Zhang, Z., Wang, T., Backes, M., Humbert, M., and Zhang, Y

URLhttps: //oag.ca.gov/privacy/ccpa. Chen, M., Zhang, Z., Wang, T., Backes, M., Humbert, M., and Zhang, Y . When machine unlearning jeopardizes privacy. In Proceedings of the 2021 ACM SIGSAC Con- ference on Computer and Communications Security, pp. 896–911,

work page 2021

[6] [6]

Graph unlearning

Chen, M., Zhang, Z., Wang, T., Backes, M., Humbert, M., and Zhang, Y . Graph unlearning. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pp. 499–513,

work page 2022

[7] [7]

and Grimmer, B

Davis, D. and Grimmer, B. Proximally guided stochastic subgradient method for nonsmooth, nonconvex problems. SIAM Journal on Optimization, 29(3):1908–1930,

work page 1908

[8] [8]

Idea: A flexible framework of certified unlearning for graph neural networks

Dong, Y ., Zhang, B., Lei, Z., Zou, N., and Li, J. Idea: A flexible framework of certified unlearning for graph neural networks. arXiv preprint arXiv:2407.19398,

work page arXiv

[9] [9]

A., Chaudhuri, K., and Zou, J

Izzo, Z., Smart, M. A., Chaudhuri, K., and Zou, J. Approx- imate data deletion from machine learning models. In International Conference on Artificial Intelligence and Statistics, pp. 2008–2016,

work page 2008

[10] [10]

Certifiable machine unlearning for linear models

Mahadevan, A. and Mathioudakis, M. Certifiable ma- chine unlearning for linear models. arXiv preprint arXiv:2106.15093,

work page arXiv

[11] [11]

An introduction to machine unlearning

Mercuri, S., Khraishi, R., Okhrati, R., Batra, D., Hamill, C., Ghasempour, T., and Nowlan, A. An introduction to machine unlearning. arXiv preprint arXiv:2209.00939,

work page arXiv

[12] [12]

A survey of machine unlearning

Nguyen, T. T., Huynh, T. T., Nguyen, P. L., Liew, A. W.-C., Yin, H., and Nguyen, Q. V . H. A survey of machine unlearning. arXiv preprint arXiv:2209.02299,

work page arXiv

[13] [13]

Unlearning graph classifiers with limited data resources

Pan, C., Chien, E., and Milenkovic, O. Unlearning graph classifiers with limited data resources. In Proceedings of the ACM Web Conference 2023, pp. 716–726,

work page 2023

[14] [14]

Gif: A general graph unlearning strategy via influence function

Wu, J., Yang, Y ., Qian, Y ., Sui, Y ., Wang, X., and He, X. Gif: A general graph unlearning strategy via influence function. In Proceedings of the ACM Web Conference 2023, pp. 651–661, 2023a. Wu, K., Shen, J., Ning, Y ., Wang, T., and Wang, W. H. Certified edge unlearning for graph neural networks. In Proceedings of the 29th ACM SIGKDD Conference on Know...

work page 2023

[15] [15]

Proofs A.1

12 Towards Certified Unlearning for Deep Neural Networks A. Proofs A.1. Proof of Proposition 2.2 Proof. Let the unlearning process be an identical map in terms of the model, i.e., U(D, Du, A(D)) = A(D). Since A is a differentially private algorithm, we have ∀ T ⊆ H , Pr(A(D) ∈ T ) ≤ eεPr(A(Du) ∈ T ) + δ. (12) Similarly, we also have ∀ T ⊆ H , Pr(A(Du) ∈ T...

work page 2021

[16] [16]

(28) Consequently, we have E " ˜H −1 ∞,λ H # =E[(Hw∗ + λI)−1]

We then compute the limit for both sides of Equation (27) and have E[ ˜H −1 ∞,λ] =I + E[ ˜H −1 ∞,λ] − 1 H E[Hw∗ + λI]E[ ˜H −1 ∞,λ]. (28) Consequently, we have E " ˜H −1 ∞,λ H # =E[(Hw∗ + λI)−1]. (29) Hence, we have proved that ˜H −1 s,λ H is an asymptotic unbiased estimator of the inverse Hessian (Hw∗ + λI)−1. A.6. Proof of Theorem 3.6 Proof. Following th...

work page 2016

[17] [17]

All experiments are conducted based on three real-world datasets: MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), and SVHN (Netzer et al., 2011)

We ran all experiments on an Nvidia RTX A6000 GPU. All experiments are conducted based on three real-world datasets: MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky et al., 2009), and SVHN (Netzer et al., 2011). All datasets are publicly accessible (MNIST with GNU General Public License, CIFAR-10 with MIT License, and SVHN with CC BY-NC License). We repo...

work page 1998

[18] [18]

• L-CODEC: size of unlearned set: 1,000; number of perturbations: 25; Hessian type: Sekhari; ε: 100; δ: 0.1; ℓ-2 regularization: 5e−4

• Fisher forgetting: size of unlearned set: 1,000; α: 1e−6 for MLP, 1e−8 for AllCNN and ResNet. • L-CODEC: size of unlearned set: 1,000; number of perturbations: 25; Hessian type: Sekhari; ε: 100; δ: 0.1; ℓ-2 regularization: 5e−4. • Certified unlearning: size of unlearned set: 1,000; number of recursion s: 1,000; standard deviation σ: 1e−2 for MLP, 1e−3 f...

work page 2017

[19] [19]

In this case, the certification requires adding a larger noise to hide the (overestimated) remaining information of the unlearned data

When λ < 12.11, the computed error bound is still valid (from Table 3, as the value of λ decreases, the computed error bound increases correspondingly, indicating that a smaller choice of λ can still lead to a valid but larger error bound). In this case, the certification requires adding a larger noise to hide the (overestimated) remaining information of ...

work page 2020

[20] [20]

We record the micro F1-score of the predictions on the unlearned set Du, retained set Dr, and test set Dt

Comparison of accuracy among three advanced unlearning baselines over three popular DNNs across three real-world datasets. We record the micro F1-score of the predictions on the unlearned set Du, retained set Dr, and test set Dt. Method MLP & MNIST AllCNN & CIFAR-10 ResNet18 & SVHN F1 on Du F1 on Dr F1 on Dt F1 on Du F1 on Dr F1 on Dt F1 on Du F1 on Dr F1...

work page 2023