The Measure of Deception: An Analysis of Data Forging in Machine Unlearning

Rayan Saab; Rishabh Dixit; Yuan Hui

arxiv: 2509.05865 · v2 · submitted 2025-09-06 · 💻 cs.LG · stat.ML

The Measure of Deception: An Analysis of Data Forging in Machine Unlearning

Rishabh Dixit , Yuan Hui , Rayan Saab This is my paper

Pith reviewed 2026-05-18 17:36 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords machine unlearningdata forgingepsilon-forging setLebesgue measuresingular value analysisgradient variation matrixadversarial detection

0 comments

The pith

The epsilon-forging set of points that mimic a target gradient for unlearning has Lebesgue measure that shrinks with epsilon.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies the problem of data forging in machine unlearning, where an adversary tries to create fake data points that produce gradients close to those of a point being unlearned. It defines the epsilon-forging set as all such approximating points and analyzes its size using Lebesgue measure. For linear regression and single-layer networks the measure is order epsilon or even epsilon to the data dimension d. In the general case the measure decays as epsilon to the power (d minus r) over 2, with r coming from the number of small singular values in a gradient-based variation matrix. Probability bounds show that drawing a forging point at random becomes vanishingly unlikely.

Core claim

Under mild regularity assumptions the Lebesgue measure of the epsilon-forging set decays as epsilon raised to the power (d-r)/2, where d is the dimension of the data and r is the dimension of the space spanned by right singular vectors tied to small singular values of the variation matrix defined by model gradients. This scaling holds for linear regression, one-layer neural networks, batch SGD, and almost-everywhere smooth losses. The authors also prove that under non-degenerate distributions the probability of sampling a point from this set goes to zero.

What carries the argument

the variation matrix formed from gradients of the model at different data points, whose singular vectors and values control the exponent in the measure decay of the forging set

If this is right

If the central claim holds then adversarial forging becomes impossible for sufficiently small epsilon.
Verification of unlearning can use volume estimates to detect false claims.
The same scaling applies to stochastic gradient descent training.
Random data sampling cannot produce reliable forgers with high probability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This scaling could be used to design statistical tests for unlearning compliance in deployed systems.
High-dimensional data would make the exponent large and forgery exponentially harder.
The framework might extend to other adversarial settings where gradient matching is attempted.
Connections to robustness in optimization could be explored by viewing forging as a form of perturbation.

Load-bearing premise

Loss functions are almost everywhere smooth and data distributions are non-degenerate so that the variation matrix has a clear distinction between small and large singular values.

What would settle it

Compute the actual Lebesgue measure or volume estimate of the forging set for a simple linear model as epsilon varies and check if the observed scaling matches the predicted exponent.

Figures

Figures reproduced from arXiv: 2509.05865 by Rayan Saab, Rishabh Dixit, Yuan Hui.

**Figure 2.** Figure 2: A two dimensional representation of the ξ covers for the sets K1, D2 ∩ V2, ∂D2. Here, D2 is the closure of an ellipse in R 2 and the set D2 ∩V2 is represented by the three disconnected red curves. The sum of volumes in the yellow, red and blue regions is equal to ν1 and the set K1 ⊂ D2\(V2∪∂D2) is a function of ν1. 21 [PITH_FULL_IMAGE:figures/full_fig_p021_2.png] view at source ↗

read the original abstract

Motivated by privacy regulations and the need to mitigate the effects of harmful data, machine unlearning seeks to modify trained models so that they effectively ``forget'' designated data. A key challenge in verifying unlearning is \emph{forging} -- adversarially crafting data that mimics the gradient of a target point, thereby creating the appearance of unlearning without actually removing information. To capture this phenomenon, we consider the collection of data points whose gradients approximate a target gradient within tolerance $\epsilon$ -- which we call an $\epsilon$-forging set -- and develop a framework for its analysis. For linear regression and one-layer neural networks, we show that the Lebesgue measure of this set is small. It scales on the order of $\epsilon$, and when $\epsilon$ is small enough, $\epsilon^d$. More generally, under mild regularity assumptions, we prove that the forging set measure decays as $\epsilon^{(d-r)/2}$, where $d$ is the data dimension and $r<d$ is the dimension of vector space of right singular vectors corresponding to ``small'' singular values of a variation matrix defined by the model gradients. Extensions to batch SGD and almost-everywhere smooth loss functions yield the same asymptotic scaling. In addition, we establish probability bounds showing that, under non-degenerate data distributions, the likelihood of randomly sampling a forging point is vanishingly small. These results provide evidence that adversarial forging is fundamentally limited and that false unlearning claims can, in principle, be detected.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows that ε-forging sets have small Lebesgue measure scaling as ε^{(d-r)/2} under regularity assumptions, with explicit calculations for linear regression and one-layer nets but conditional generalization.

read the letter

The main thing to know is that this paper defines an ε-forging set of points whose gradients mimic a target within tolerance and proves its Lebesgue measure decays in a controlled way. For linear regression and single-layer networks the scaling is explicit and direct. In the general case it comes out as ε^{(d-r)/2} from the SVD structure of the gradient variation matrix, with r the dimension tied to small singular values. Probability bounds under non-degenerate distributions are included as well. That geometric framing is new in the unlearning literature and gives a concrete criterion for when forged unlearning might be detectable in principle. The calculations for the simple models are solid and avoid heavy assumptions. The general scaling and extensions to batch SGD and almost-everywhere smooth losses follow from standard measure theory and change-of-variable arguments once the regularity conditions are granted. The soft spot is exactly where the stress-test note flags it. The headline exponent relies on almost-everywhere smoothness and non-degenerate data to keep the variation matrix rank well-behaved. If smoothness fails on a positive-measure set or the data sits on a lower-dimensional manifold, the coarea or Jacobian control can break and the measure need not decay at that rate. The paper states the assumptions but does not quantify how much the bound degrades when they are mildly violated. This is a real limitation for the general claim, though it does not undermine the explicit cases. Readers working on verifiable unlearning or regulatory data-deletion requirements will find the framework useful. The geometric approach and the scaling results are worth referee attention even if the general case needs tighter robustness checks. I would send it for peer review.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes data forging in machine unlearning by defining the ε-forging set as the collection of points whose model gradients approximate a target gradient within tolerance ε. For linear regression and one-layer neural networks, explicit scaling arguments show the Lebesgue measure of this set is O(ε) and O(ε^d) for small ε. Under mild regularity assumptions, a general result establishes that the measure decays as ε^{(d-r)/2}, where d is data dimension and r < d is the dimension of the space of right singular vectors associated with small singular values of the variation matrix constructed from model gradients. The work also derives probability bounds for randomly sampling forging points under non-degenerate distributions and extends the analysis to batch SGD and almost-everywhere smooth loss functions.

Significance. If the central claims hold, this work provides important theoretical evidence that forging attacks on unlearning are geometrically constrained and have vanishingly small measure, supporting the detectability of false unlearning claims. Strengths include the parameter-free scaling derivations for the linear and one-layer cases using direct calculation, and the use of SVD techniques on the variation matrix for the general case under stated conditions. These contribute to the foundations of verifiable machine unlearning.

major comments (2)

[General scaling result (Theorem establishing ε^{(d-r)/2} decay)] The derivation of the exponent (d-r)/2 for the general case depends on the variation matrix's singular value structure and the coarea formula under almost-everywhere smoothness; the manuscript should explicitly state the threshold used to define 'small' singular values and discuss the robustness of r when data distributions deviate from non-degeneracy, as this directly impacts the claimed scaling and is load-bearing for the generalization beyond the linear and one-layer cases.
[Probability bounds section] The probability bounds showing vanishing likelihood of sampling a forging point assume non-degenerate data distributions; it would be useful to quantify how the bounds degrade under mild degeneracy to better support the practical implications for detecting forging.

minor comments (2)

[Notation and definitions] The definition of the variation matrix should be highlighted with an equation number for easier reference throughout the proofs.
[References] Consider adding references to related work on measure-theoretic analysis in optimization or unlearning verification methods.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for your positive evaluation and insightful comments on our manuscript. We have carefully considered each suggestion and outline our responses below, indicating planned revisions where appropriate.

read point-by-point responses

Referee: [General scaling result (Theorem establishing ε^{(d-r)/2} decay)] The derivation of the exponent (d-r)/2 for the general case depends on the variation matrix's singular value structure and the coarea formula under almost-everywhere smoothness; the manuscript should explicitly state the threshold used to define 'small' singular values and discuss the robustness of r when data distributions deviate from non-degeneracy, as this directly impacts the claimed scaling and is load-bearing for the generalization beyond the linear and one-layer cases.

Authors: We agree with the need for explicit clarification. In the revision, we will define 'small' singular values as those satisfying σ_i < τ, where τ is a fixed positive constant independent of ε (chosen smaller than the smallest non-small singular value under the non-degeneracy assumption). For robustness, we will add a paragraph explaining that if the data distribution deviates mildly from non-degeneracy (e.g., by having a small perturbation in the null space directions), r may increase, but the exponent (d-r)/2 remains positive provided r < d. This preserves the vanishing measure property, though the constant factors may worsen. We believe this addresses the load-bearing aspect for generalization. revision: yes
Referee: [Probability bounds section] The probability bounds showing vanishing likelihood of sampling a forging point assume non-degenerate data distributions; it would be useful to quantify how the bounds degrade under mild degeneracy to better support the practical implications for detecting forging.

Authors: We thank the referee for this suggestion. While a complete quantitative analysis under arbitrary degeneracy would require substantial additional technical development beyond the current scope, we will include a discussion in the revised version. Specifically, we will note that under mild degeneracy—characterized by the distribution having a density ρ bounded below by a positive constant on a set of measure at least δ > 0—the probability bound degrades by a multiplicative factor of 1/δ, but the ε^{(d-r)/2} scaling is retained. This supports that detection remains feasible in practice for near-non-degenerate cases. We will also mention that complete degeneracy (zero measure support) is outside the assumptions and would invalidate the model. revision: partial

Circularity Check

0 steps flagged

No circularity: scaling derived from gradient geometry and rank analysis

full rationale

The paper's central results on the Lebesgue measure of ε-forging sets are obtained by direct calculation for linear regression and one-layer networks, then extended via change-of-variable arguments under almost-everywhere smoothness and non-degenerate distribution assumptions that fix the variation matrix rank r. These steps rely on standard differential geometry (coarea formula, singular value decomposition of the gradient map) rather than any fitted parameter, self-referential definition, or load-bearing self-citation. The exponent (d-r)/2 emerges from the kernel dimension of small singular values and does not reduce to the target measure by construction. The derivation is therefore self-contained against external mathematical benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claims rest on the introduction of the ε-forging set and variation matrix plus standard analytic assumptions; no free parameters are fitted and no new physical entities are postulated.

axioms (2)

domain assumption Mild regularity assumptions (almost-everywhere smoothness of the loss)
Invoked to obtain the general decay rate ε^{(d-r)/2} and to extend the result to batch SGD.
domain assumption Non-degenerate data distributions
Required for the probability bounds showing vanishing likelihood of sampling a forging point.

invented entities (2)

ε-forging set no independent evidence
purpose: Collection of points whose gradients lie within ε of a target gradient
Central object whose Lebesgue measure is bounded in the paper.
variation matrix no independent evidence
purpose: Matrix of model gradients whose small singular values define the effective dimension r
Introduced to characterize the geometry that determines the measure exponent.

pith-pipeline@v0.9.0 · 5808 in / 1560 out tokens · 42989 ms · 2026-05-18T17:36:09.563025+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

the forging set measure decays as ε^{(d-r)/2}, where d is the data dimension and r<d is the dimension of vector space of right singular vectors corresponding to small singular values of a variation matrix defined by the model gradients
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

under mild regularity assumptions on the loss functions (almost-everywhere smoothness) and the non-degenerate data distributions

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 2 internal anchors

[1]

On the structure of singular sets of convex functions.Calculus of Variations and Partial Differential Equations, 2(1):17–27, 1994

Giovanni Alberti. On the structure of singular sets of convex functions.Calculus of Variations and Partial Differential Equations, 2(1):17–27, 1994

work page 1994
[2]

Unforge- ability in stochastic gradient descent

Teodora Baluta, Ivica Nikolic, Racchit Jain, Divesh Aggarwal, and Prateek Saxena. Unforge- ability in stochastic gradient descent. InProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, pages 1138–1152, 2023

work page 2023
[3]

Towardsaregularitytheory for relu networks–chain rule and global error estimates

JuliusBerner, DennisElbrächter, PhilippGrohs, andArnulfJentzen. Towardsaregularitytheory for relu networks–chain rule and global error estimates. In2019 13th International conference on Sampling Theory and Applications (SampTA), pages 1–5. IEEE, 2019

work page 2019
[4]

Numerical influence of relu’(0) on backpropagation.Advances in Neural Information Processing Systems, 34:468–479, 2021

David Bertoin, Jérôme Bolte, Sébastien Gerchinovitz, and Edouard Pauwels. Numerical influence of relu’(0) on backpropagation.Advances in Neural Information Processing Systems, 34:468–479, 2021

work page 2021
[5]

A derivation of n-dimensional spherical coordinates.The American Mathematical Monthly, 67(1):63–66, 1960

LE Blumenson. A derivation of n-dimensional spherical coordinates.The American Mathematical Monthly, 67(1):63–66, 1960

work page 1960
[6]

Machine unlearning

Lucas Bourtoule, Varun Chandrasekaran, Christopher A Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, and Nicolas Papernot. Machine unlearning. In2021 IEEE symposium on security and privacy (SP), pages 141–159. IEEE, 2021

work page 2021
[7]

Towards making systems forget with machine unlearning

Yinzhi Cao and Junfeng Yang. Towards making systems forget with machine unlearning. In 2015 IEEE symposium on security and privacy, pages 463–480. IEEE, 2015. 11Hered H (·,·)is the Hausdorff distance andx c is the center ofD2. 48

work page 2015
[8]

Langevin unlearning: A new perspective of noisy gradient descent for machine unlearning.Advances in neural information processing systems, 37:79666–79703, 2024

Eli Chien, Haoyu Wang, Ziang Chen, and Pan Li. Langevin unlearning: A new perspective of noisy gradient descent for machine unlearning.Advances in neural information processing systems, 37:79666–79703, 2024

work page 2024
[9]

Towards scalable exact machine unlearning using parameter-efficient fine-tuning

Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish, Avinava Dubey, and Snigdha Chaturvedi. Towards scalable exact machine unlearning using parameter-efficient fine- tuning.arXiv preprint arXiv:2406.16257, 2024

work page arXiv 2024
[10]

Stochastic subgradient method converges at the rate $O(k^{-1/4})$ on weakly convex functions

Damek Davis and Dmitriy Drusvyatskiy. Stochastic subgradient method converges at the rate O(k−1/4)on weakly convex functions.arXiv preprint arXiv:1802.02988, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[11]

Approximating the fréchet distance for realistic curves in near linear time

Anne Driemel, Sariel Har-Peled, and Carola Wenk. Approximating the fréchet distance for realistic curves in near linear time. InProceedings of the twenty-sixth annual symposium on Computational geometry, pages 365–374, 2010

work page 2010
[12]

The algorithmic foundations of differential privacy.Founda- tions and trends®in theoretical computer science, 9(3–4):211–407, 2014

Cynthia Dwork, Aaron Roth, et al. The algorithmic foundations of differential privacy.Founda- tions and trends®in theoretical computer science, 9(3–4):211–407, 2014

work page 2014
[13]

Springer, 2014

Herbert Federer.Geometric measure theory. Springer, 2014

work page 2014
[14]

Adaptive machine unlearning.Advances in Neural Information Processing Systems, 34: 16319–16330, 2021

Varun Gupta, Christopher Jung, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Chris Waites. Adaptive machine unlearning.Advances in Neural Information Processing Systems, 34: 16319–16330, 2021

work page 2021
[15]

Smooth manifolds

John M Lee. Smooth manifolds. InIntroduction to smooth manifolds, pages 1–29. Springer, 2003

work page 2003
[16]

The eu proposal for a general data protection regulation and the roots of the ‘right to be forgotten’.Computer Law & Security Review, 29(3):229–235, 2013

Alessandro Mantelero. The eu proposal for a general data protection regulation and the roots of the ‘right to be forgotten’.Computer Law & Security Review, 29(3):229–235, 2013

work page 2013
[17]

Springer Science & Business Media, 2013

Jiri Matousek.Lectures on discrete geometry, volume 212. Springer Science & Business Media, 2013

work page 2013
[18]

The probabilistic method.Lecture Notes, Department of Applied Mathematics, Charles University, Prague, 2001

Jiri Matousek and Jan Vondrak. The probabilistic method.Lecture Notes, Department of Applied Mathematics, Charles University, Prague, 2001

work page 2001
[19]

Descent-to-delete: Gradient-based meth- ods for machine unlearning

Seth Neel, Aaron Roth, and Saeed Sharifi-Malvajerdi. Descent-to-delete: Gradient-based meth- ods for machine unlearning. InAlgorithmic Learning Theory, pages 931–962. PMLR, 2021

work page 2021
[20]

Springer Science & Business Media, 2013

Yurii Nesterov.Introductory lectures on convex optimization: A basic course, volume 87. Springer Science & Business Media, 2013

work page 2013
[21]

A survey of machine unlearning.arXiv preprint arXiv:2209.02299, 2022

Thanh Tam Nguyen, Thanh Trung Huynh, Zhao Ren, Phi Le Nguyen, Alan Wee-Chung Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. A survey of machine unlearning.arXiv preprint arXiv:2209.02299, 2022

work page arXiv 2022
[22]

State of the union (of geometric objects)

János Pach, Pankaj K Agarwal, and Micha Sharir. State of the union (of geometric objects). Surveys on Discrete and Computational Geometry-Twenty Years Later, pages 9–48, 2008

work page 2008
[23]

Z., Lu, Y ., Sekhari, A., Kamath, G., and Neel, S

Martin Pawelczyk, Jimmy Z Di, Yiwei Lu, Ayush Sekhari, Gautam Kamath, and Seth Neel. Machine unlearning fails to remove data poisoning attacks.arXiv preprint arXiv:2406.17216, 2024

work page arXiv 2024
[24]

Springer, 1998

R Tyrrell Rockafellar and Roger JB Wets.Variational analysis. Springer, 1998. 49

work page 1998
[25]

Remember what you want to forget: Algorithms for machine unlearning.Advances in Neural Information Processing Systems, 34:18075–18086, 2021

Ayush Sekhari, Jayadev Acharya, Gautam Kamath, and Ananda Theertha Suresh. Remember what you want to forget: Algorithms for machine unlearning.Advances in Neural Information Processing Systems, 34:18075–18086, 2021

work page 2021
[26]

Data forging is harder than you think

Mohamed Suliman, Swanand Kadhe, Anisa Halimi, Douglas Leith, Nathalie Baracaldo, and Ambrish Rawat. Data forging is harder than you think. InPrivacy Regulation and Protection in Machine Learning, 2024

work page 2024
[27]

On the necessity of auditable algorithmic definitions for machine unlearning

Anvith Thudi, Hengrui Jia, Ilia Shumailov, and Nicolas Papernot. On the necessity of auditable algorithmic definitions for machine unlearning. In31st USENIX Security Symposium (USENIX Security 22), pages 4007–4022, 2022

work page 2022
[28]

Weak convexity and approximate subdifferentials.Journal of Optimization Theory and Applications, 203(2):1686–1709, 2024

Wim van Ackooij, Felipe Atenas, and Claudia Sagastizábal. Weak convexity and approximate subdifferentials.Journal of Optimization Theory and Applications, 203(2):1686–1709, 2024

work page 2024
[29]

Verification of Machine Unlearning is Fragile

Binchi Zhang, Zihan Chen, Cong Shen, and Jundong Li. Verification of machine unlearning is fragile.arXiv preprint arXiv:2408.00929, 2024. 50

work page internal anchor Pith review Pith/arXiv arXiv 2024

[1] [1]

On the structure of singular sets of convex functions.Calculus of Variations and Partial Differential Equations, 2(1):17–27, 1994

Giovanni Alberti. On the structure of singular sets of convex functions.Calculus of Variations and Partial Differential Equations, 2(1):17–27, 1994

work page 1994

[2] [2]

Unforge- ability in stochastic gradient descent

Teodora Baluta, Ivica Nikolic, Racchit Jain, Divesh Aggarwal, and Prateek Saxena. Unforge- ability in stochastic gradient descent. InProceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, pages 1138–1152, 2023

work page 2023

[3] [3]

Towardsaregularitytheory for relu networks–chain rule and global error estimates

JuliusBerner, DennisElbrächter, PhilippGrohs, andArnulfJentzen. Towardsaregularitytheory for relu networks–chain rule and global error estimates. In2019 13th International conference on Sampling Theory and Applications (SampTA), pages 1–5. IEEE, 2019

work page 2019

[4] [4]

Numerical influence of relu’(0) on backpropagation.Advances in Neural Information Processing Systems, 34:468–479, 2021

David Bertoin, Jérôme Bolte, Sébastien Gerchinovitz, and Edouard Pauwels. Numerical influence of relu’(0) on backpropagation.Advances in Neural Information Processing Systems, 34:468–479, 2021

work page 2021

[5] [5]

A derivation of n-dimensional spherical coordinates.The American Mathematical Monthly, 67(1):63–66, 1960

LE Blumenson. A derivation of n-dimensional spherical coordinates.The American Mathematical Monthly, 67(1):63–66, 1960

work page 1960

[6] [6]

Machine unlearning

Lucas Bourtoule, Varun Chandrasekaran, Christopher A Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, and Nicolas Papernot. Machine unlearning. In2021 IEEE symposium on security and privacy (SP), pages 141–159. IEEE, 2021

work page 2021

[7] [7]

Towards making systems forget with machine unlearning

Yinzhi Cao and Junfeng Yang. Towards making systems forget with machine unlearning. In 2015 IEEE symposium on security and privacy, pages 463–480. IEEE, 2015. 11Hered H (·,·)is the Hausdorff distance andx c is the center ofD2. 48

work page 2015

[8] [8]

Langevin unlearning: A new perspective of noisy gradient descent for machine unlearning.Advances in neural information processing systems, 37:79666–79703, 2024

Eli Chien, Haoyu Wang, Ziang Chen, and Pan Li. Langevin unlearning: A new perspective of noisy gradient descent for machine unlearning.Advances in neural information processing systems, 37:79666–79703, 2024

work page 2024

[9] [9]

Towards scalable exact machine unlearning using parameter-efficient fine-tuning

Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish, Avinava Dubey, and Snigdha Chaturvedi. Towards scalable exact machine unlearning using parameter-efficient fine- tuning.arXiv preprint arXiv:2406.16257, 2024

work page arXiv 2024

[10] [10]

Stochastic subgradient method converges at the rate $O(k^{-1/4})$ on weakly convex functions

Damek Davis and Dmitriy Drusvyatskiy. Stochastic subgradient method converges at the rate O(k−1/4)on weakly convex functions.arXiv preprint arXiv:1802.02988, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[11] [11]

Approximating the fréchet distance for realistic curves in near linear time

Anne Driemel, Sariel Har-Peled, and Carola Wenk. Approximating the fréchet distance for realistic curves in near linear time. InProceedings of the twenty-sixth annual symposium on Computational geometry, pages 365–374, 2010

work page 2010

[12] [12]

The algorithmic foundations of differential privacy.Founda- tions and trends®in theoretical computer science, 9(3–4):211–407, 2014

Cynthia Dwork, Aaron Roth, et al. The algorithmic foundations of differential privacy.Founda- tions and trends®in theoretical computer science, 9(3–4):211–407, 2014

work page 2014

[13] [13]

Springer, 2014

Herbert Federer.Geometric measure theory. Springer, 2014

work page 2014

[14] [14]

Adaptive machine unlearning.Advances in Neural Information Processing Systems, 34: 16319–16330, 2021

Varun Gupta, Christopher Jung, Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi, and Chris Waites. Adaptive machine unlearning.Advances in Neural Information Processing Systems, 34: 16319–16330, 2021

work page 2021

[15] [15]

Smooth manifolds

John M Lee. Smooth manifolds. InIntroduction to smooth manifolds, pages 1–29. Springer, 2003

work page 2003

[16] [16]

The eu proposal for a general data protection regulation and the roots of the ‘right to be forgotten’.Computer Law & Security Review, 29(3):229–235, 2013

Alessandro Mantelero. The eu proposal for a general data protection regulation and the roots of the ‘right to be forgotten’.Computer Law & Security Review, 29(3):229–235, 2013

work page 2013

[17] [17]

Springer Science & Business Media, 2013

Jiri Matousek.Lectures on discrete geometry, volume 212. Springer Science & Business Media, 2013

work page 2013

[18] [18]

The probabilistic method.Lecture Notes, Department of Applied Mathematics, Charles University, Prague, 2001

Jiri Matousek and Jan Vondrak. The probabilistic method.Lecture Notes, Department of Applied Mathematics, Charles University, Prague, 2001

work page 2001

[19] [19]

Descent-to-delete: Gradient-based meth- ods for machine unlearning

Seth Neel, Aaron Roth, and Saeed Sharifi-Malvajerdi. Descent-to-delete: Gradient-based meth- ods for machine unlearning. InAlgorithmic Learning Theory, pages 931–962. PMLR, 2021

work page 2021

[20] [20]

Springer Science & Business Media, 2013

Yurii Nesterov.Introductory lectures on convex optimization: A basic course, volume 87. Springer Science & Business Media, 2013

work page 2013

[21] [21]

A survey of machine unlearning.arXiv preprint arXiv:2209.02299, 2022

Thanh Tam Nguyen, Thanh Trung Huynh, Zhao Ren, Phi Le Nguyen, Alan Wee-Chung Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. A survey of machine unlearning.arXiv preprint arXiv:2209.02299, 2022

work page arXiv 2022

[22] [22]

State of the union (of geometric objects)

János Pach, Pankaj K Agarwal, and Micha Sharir. State of the union (of geometric objects). Surveys on Discrete and Computational Geometry-Twenty Years Later, pages 9–48, 2008

work page 2008

[23] [23]

Z., Lu, Y ., Sekhari, A., Kamath, G., and Neel, S

Martin Pawelczyk, Jimmy Z Di, Yiwei Lu, Ayush Sekhari, Gautam Kamath, and Seth Neel. Machine unlearning fails to remove data poisoning attacks.arXiv preprint arXiv:2406.17216, 2024

work page arXiv 2024

[24] [24]

Springer, 1998

R Tyrrell Rockafellar and Roger JB Wets.Variational analysis. Springer, 1998. 49

work page 1998

[25] [25]

Remember what you want to forget: Algorithms for machine unlearning.Advances in Neural Information Processing Systems, 34:18075–18086, 2021

Ayush Sekhari, Jayadev Acharya, Gautam Kamath, and Ananda Theertha Suresh. Remember what you want to forget: Algorithms for machine unlearning.Advances in Neural Information Processing Systems, 34:18075–18086, 2021

work page 2021

[26] [26]

Data forging is harder than you think

Mohamed Suliman, Swanand Kadhe, Anisa Halimi, Douglas Leith, Nathalie Baracaldo, and Ambrish Rawat. Data forging is harder than you think. InPrivacy Regulation and Protection in Machine Learning, 2024

work page 2024

[27] [27]

On the necessity of auditable algorithmic definitions for machine unlearning

Anvith Thudi, Hengrui Jia, Ilia Shumailov, and Nicolas Papernot. On the necessity of auditable algorithmic definitions for machine unlearning. In31st USENIX Security Symposium (USENIX Security 22), pages 4007–4022, 2022

work page 2022

[28] [28]

Weak convexity and approximate subdifferentials.Journal of Optimization Theory and Applications, 203(2):1686–1709, 2024

Wim van Ackooij, Felipe Atenas, and Claudia Sagastizábal. Weak convexity and approximate subdifferentials.Journal of Optimization Theory and Applications, 203(2):1686–1709, 2024

work page 2024

[29] [29]

Verification of Machine Unlearning is Fragile

Binchi Zhang, Zihan Chen, Cong Shen, and Jundong Li. Verification of machine unlearning is fragile.arXiv preprint arXiv:2408.00929, 2024. 50

work page internal anchor Pith review Pith/arXiv arXiv 2024