Fundamental Limitations of Favorable Privacy-Utility Guarantees for DP-SGD

Marten van Dijk; Murat Bilgehan Ertan

arxiv: 2601.10237 · v2 · submitted 2026-01-15 · 💻 cs.LG · cs.CR

Pith reviewed 2026-05-16 13:33 UTC · model grok-4.3

keywords: differential privacy, DP-SGD, f-differential privacy, shuffled sampling, privacy utility tradeoff, gaussian noise, stochastic optimization

The pith

Shuffled DP-SGD cannot achieve strong privacy and high utility at once under standard worst-case analysis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that in the f-differential privacy framework, DP-SGD using shuffled sampling over one epoch with M updates faces a hard limit on its privacy-utility tradeoff. Specifically, to keep the privacy separation κ small, the Gaussian noise multiplier σ must be at least 1/√(2 ln M). This threshold stays substantial for practical values of M, so the required noise degrades model performance. The result highlights why favorable guarantees are difficult to achieve without relaxing the adversarial model or changing the sampling approach.
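For readers unfamiliar with the mechanism under discussion, the following is a minimal sketch of one epoch of DP-SGD with shuffled sampling, not the authors' implementation. The toy least-squares model, the helper names `clip` and `shuffled_dp_sgd_epoch`, the learning rate, the clipping norm, and the assumption that M divides the dataset size are all illustrative choices that do not come from the paper.

```python
import math
import random

def clip(vec, max_norm):
    """Scale vec so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]

def shuffled_dp_sgd_epoch(data, w, M, sigma, clip_norm=1.0, lr=0.1, seed=0):
    """One epoch of shuffled DP-SGD on a toy least-squares model (hypothetical setup).

    data: list of (x, y) pairs, x a list of floats; w: list of floats.
    Returns the updated weights and the number of noisy updates performed.
    """
    rng = random.Random(seed)
    order = list(range(len(data)))
    rng.shuffle(order)                       # a single permutation for the whole epoch
    batch_size = len(data) // M              # assumption: M divides len(data)
    updates = 0
    for r in range(M):
        batch = order[r * batch_size:(r + 1) * batch_size]
        grad_sum = [0.0] * len(w)
        for i in batch:
            x, y = data[i]
            err = sum(wj * xj for wj, xj in zip(w, x)) - y
            g = clip([err * xj for xj in x], clip_norm)   # per-example clipping
            grad_sum = [a + b for a, b in zip(grad_sum, g)]
        # Gaussian noise with standard deviation sigma * clip_norm per coordinate
        noisy = [a + rng.gauss(0.0, sigma * clip_norm) for a in grad_sum]
        w = [wj - lr * nj / batch_size for wj, nj in zip(w, noisy)]
        updates += 1
    return w, updates

# Toy usage: 12 examples split into M = 4 disjoint batches of 3.
toy = [([1.0, float(i % 3)], 2.0 + (i % 3)) for i in range(12)]
weights, n_updates = shuffled_dp_sgd_epoch(toy, [0.0, 0.0], M=4, sigma=1.0)
```

The point of the sketch is only to show where M and σ enter: every example is used exactly once per epoch, and each of the M updates receives noise scaled by σ times the clipping norm.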

Core claim

We analyze DP-SGD in the f-DP framework for shuffled sampling with M gradient updates and derive an explicit suboptimal upper bound on the achievable trade-off curve. This induces a geometric lower bound on the separation κ between the mechanism's curve and the random-guessing line. Consequently, either the noise multiplier σ satisfies σ ≥ 1/√(2 ln M) or κ ≥ (1/√8)(1 - 1/√(4π ln M)), showing that strong privacy and high utility cannot be achieved simultaneously.
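To make the two thresholds in the core claim concrete, the sketch below evaluates them for a few values of M. The function names are hypothetical; the formulas are copied from the statement above, and the chosen values of M are illustrative only.

```python
import math

def sigma_threshold(M):
    """Noise-multiplier threshold 1/sqrt(2 ln M) from the core claim."""
    return 1.0 / math.sqrt(2.0 * math.log(M))

def kappa_threshold(M):
    """Separation threshold (1/sqrt(8)) * (1 - 1/sqrt(4 pi ln M)) from the core claim."""
    return (1.0 / math.sqrt(8.0)) * (1.0 - 1.0 / math.sqrt(4.0 * math.pi * math.log(M)))

# Illustrative values of M (number of updates in the single epoch).
for M in (10**2, 10**4, 10**6):
    print(f"M={M:>8}  sigma >= {sigma_threshold(M):.4f}  or  kappa >= {kappa_threshold(M):.4f}")
```

For M = 10^4, for example, the σ threshold is roughly 0.23 and the κ threshold roughly 0.32, so a run with a smaller noise multiplier is covered by the second branch of the statement.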

What carries the argument

The separation κ, which measures the maximum vertical distance from the f-DP trade-off curve to the diagonal random-guessing line, serving as a proxy for adversarial advantage.
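As an illustration of the quantity κ, the sketch below computes the largest vertical gap between the diagonal 1 - α and a trade-off curve. It uses the Gaussian trade-off function G_μ(α) = Φ(Φ⁻¹(1 - α) - μ) from the Gaussian differential privacy literature as a familiar stand-in; this is an assumption for illustration, not the shuffled DP-SGD curve analyzed in the paper, and the paper's exact normalization of κ may differ from the vertical distance used here.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution

def gaussian_tradeoff(alpha, mu):
    """G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu), the mu-GDP trade-off function."""
    return _STD.cdf(_STD.inv_cdf(1.0 - alpha) - mu)

def vertical_separation(tradeoff, grid=10_000):
    """Largest vertical gap between the diagonal 1 - alpha and the curve, on a grid."""
    best = 0.0
    for k in range(1, grid):
        alpha = k / grid
        best = max(best, (1.0 - alpha) - tradeoff(alpha))
    return best

mu = 1.0
numeric = vertical_separation(lambda a: gaussian_tradeoff(a, mu))
# For G_mu the gap is maximized where Phi^{-1}(1 - alpha) = mu / 2.
closed_form = 2.0 * _STD.cdf(mu / 2.0) - 1.0
```

A curve that coincides with the diagonal gives a separation of zero (perfect privacy), while larger μ pushes the curve away from the diagonal and increases the gap.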

If this is right

Shuffled DP-SGD requires σ at least 1/√(2 ln M) to achieve small κ.
The same limitation applies to Poisson subsampling up to constant factors.
For typical training with moderate M, the implied noise causes notable accuracy loss.
As M increases the bound decreases but does so too slowly for practical relief.
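The slow decay can be seen by inverting the threshold: a given σ satisfies σ ≥ 1/√(2 ln M) only when M ≥ exp(1/(2σ²)). The sketch below evaluates that inversion; the function name and the sample values of σ are illustrative assumptions.

```python
import math

def min_updates_for_sigma(sigma):
    """Smallest integer M with 1/sqrt(2 ln M) <= sigma, i.e. M >= exp(1 / (2 sigma^2))."""
    return math.ceil(math.exp(1.0 / (2.0 * sigma ** 2)))

# Illustrative noise multipliers.
for sigma in (0.5, 0.3, 0.2, 0.1):
    print(f"sigma={sigma}: M >= {min_updates_for_sigma(sigma):.3e}")
```

Because the required M grows like exp(1/(2σ²)), halving σ squares-and-more the number of updates needed: σ = 0.2 already calls for a few hundred thousand updates in the single epoch, and σ = 0.1 for more than 10^21.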

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Practitioners may need to adopt relaxed privacy notions or non-shuffled methods to bypass this bound.
The slow asymptotic improvement suggests rethinking single-epoch assumptions in private training.
This bound could inform minimum noise levels in DP-SGD to ensure theoretical privacy guarantees.

Load-bearing premise

The standard worst-case adversarial model in the f-DP framework applies directly without modification to the single-epoch shuffled sampling process.

What would settle it

Running shuffled DP-SGD with σ below 1/√(2 ln M) and finding that the separation κ of its trade-off curve falls below the predicted lower bound would falsify the claim.
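A rough sketch of how a separation could be estimated from samples is given below. It is an assumption-laden toy: it uses a scalar Gaussian pair N(0, 1) versus N(μ, 1) and simple threshold tests rather than shuffled DP-SGD, and the function name `empirical_separation` is hypothetical. Note that an estimate obtained from any single family of tests only lower-bounds the true separation, so a real check of the claim would need a tighter accounting of the mechanism itself.

```python
import random
from statistics import NormalDist

def empirical_separation(null_samples, alt_samples):
    """Max over thresholds t of 1 - alpha(t) - beta(t) for the test 'reject if x > t'."""
    n0, n1 = len(null_samples), len(alt_samples)
    events = sorted([(x, 0) for x in null_samples] + [(x, 1) for x in alt_samples])
    below0 = below1 = 0          # samples at or below the current threshold
    best = 0.0
    for _, label in events:
        if label == 0:
            below0 += 1
        else:
            below1 += 1
        alpha = 1.0 - below0 / n0    # type I error: null sample above the threshold
        beta = below1 / n1           # type II error: alternative sample at or below it
        best = max(best, 1.0 - alpha - beta)
    return best

rng = random.Random(0)
mu = 1.0
null = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
alt = [rng.gauss(mu, 1.0) for _ in range(20_000)]
estimate = empirical_separation(null, alt)
exact = 2.0 * NormalDist().cdf(mu / 2.0) - 1.0   # vertical separation of the Gaussian trade-off curve
```

Threshold tests are optimal for this Gaussian location shift, which is why the estimate lands close to the closed-form value; for an actual training run the choice of test statistic is itself an open modeling decision.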

Figures

Figures reproduced from arXiv: 2601.10237 by Marten van Dijk, Murat Bilgehan Ertan.

**Figure 1.** Trade-off view of privacy in the f-DP framework [21]. The black line shows the ideal random-guessing trade-off between type I and type II errors. The vertical red segment κ denotes the maximum distance between the achievable f-DP trade-off and the ideal limit. primarily a modeling convenience: in practice, modern deep learning systems do not sample examples independently but instead shuffle the entire …

**Figure 2.** Illustrative geometry of the suboptimal and true trade-off functions in our impossibility …

**Figure 3.** Explicit lower bound on the separation $\mathrm{sep}\, G_{\mu(M,1,\sigma^{-1})}$ as a function of the number of rounds per epoch $M$ under noise schedules of the form $\sigma = s/\sqrt{\ln M}$, with $E = 1$. where the last equality holds whenever $M^{1/s^{2}} \ge 4$. Finally, we simplify the expression further by observing that for sufficiently large $M$, $M^{1/s^{2}} - 4 \ge \tfrac{1}{2} M^{1/s^{2}}$. Applying this bound to (51) yields $\mu(M, 1, \sigma^{-1}) \ge \frac{1}{\sqrt{M}} \sqrt{\tfrac{1}{2} M^{1/s^{2}}} = \frac{1}{\sqrt{2}} M^{1 \ldots}$ …

**Figure 4.** GDP-predicted separation $\kappa_{\mu\text{-GDP}}(M, E, \sigma^{-1})$ as a function of the number of rounds per epoch $M$ under the noise schedule $\sigma = 1/\sqrt{2 \ln M}$, for several epoch counts $E$. We emphasize that, unlike the asymptotic $\mu$-GDP approximation discussed above, our main separation bound is non-asymptotic. In particular, it holds for every finite $M$ in the single-epoch setting, without requiring any limiting regime or asymptoti…

Original abstract

Differentially Private Stochastic Gradient Descent (DP-SGD) is the dominant paradigm for private training, but its fundamental limitations under worst-case adversarial privacy definitions remain poorly understood. We analyze DP-SGD in the $f$-differential privacy framework, which characterizes privacy via hypothesis-testing trade-off curves, and study shuffled sampling over a single epoch with $M$ gradient updates. We derive an explicit suboptimal upper bound on the achievable trade-off curve. This result induces a geometric lower bound on the separation $\kappa$, which is the maximum distance between the mechanism's trade-off curve and the ideal random-guessing line. Because a large separation implies significant adversarial advantage, meaningful privacy requires small $\kappa$. However, we prove that enforcing a small separation imposes a strict lower bound on the Gaussian noise multiplier $\sigma$, which directly limits the achievable utility. In particular, under the standard worst-case adversarial model, shuffled DP-SGD must satisfy $\sigma \ge \frac{1}{\sqrt{2\ln M}} \quad\text{or}\quad \kappa \ge \frac{1}{\sqrt{8}}\left(1-\frac{1}{\sqrt{4\pi\ln M}}\right)$, and thus cannot simultaneously achieve strong privacy and high utility. Although this bound vanishes asymptotically as $M \to \infty$, the convergence is extremely slow: even for practically relevant numbers of updates the required noise magnitude remains substantial. We further show that the same limitation extends to Poisson subsampling up to constant factors. Our experiments confirm that the noise levels implied by this bound lead to significant accuracy degradation at realistic training settings, thus showing a critical bottleneck in DP-SGD under standard worst-case adversarial assumptions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper derives an explicit lower bound on noise for shuffled DP-SGD in f-DP from a stated suboptimal upper bound on the trade-off curve, which flags a practical utility cost but does not establish a tight fundamental limit.

The main thing to know is that this work gives a concrete lower bound on the Gaussian noise multiplier for single-epoch shuffled DP-SGD: σ must be at least 1/√(2 ln M) to keep the separation κ small enough for meaningful privacy. The bound extends to Poisson subsampling up to constants and comes with experiments showing accuracy drops at realistic M values. That part is useful because it turns an abstract privacy-utility tension into numbers people can check against their training runs.

What is new is the explicit form of the induced κ lower bound inside the f-DP framework for this sampling regime, building on earlier accounting results without relying on fitted parameters. The paper does a clean job of laying out the hypothesis-testing view and confirming that the implied noise levels degrade utility in practice.

The soft spot is exactly the one the stress-test note flags: the upper bound on the trade-off curve is explicitly suboptimal, so the true curve could sit lower and permit smaller σ while still satisfying the privacy target. If that gap is large, the claimed impossibility of strong privacy plus high utility does not follow at the stated strength. The single-epoch restriction also narrows the result; multi-epoch training could change the picture.

This paper is for people who work on privacy accounting and want to see worst-case f-DP limits written out with experiments attached. A reader focused on DP-SGD design or f-DP would get value from the explicit expressions and the accuracy plots. It deserves a serious referee because the question of unavoidable costs is worth tightening, even if the current bound needs work to become sharp.

Referee Report

2 major / 1 minor

Summary. The paper claims that under the f-DP framework, single-epoch shuffled DP-SGD with M updates cannot simultaneously achieve strong privacy and high utility. It derives an explicit suboptimal upper bound on the achievable trade-off curve that induces a geometric lower bound on the separation κ, yielding the requirement that σ ≥ 1/√(2 ln M) or κ ≥ (1/√8)(1 - 1/√(4π ln M)). The same limitation extends to Poisson subsampling up to constant factors, and experiments show significant accuracy degradation at the implied noise levels.

Significance. If the derived bound is tight, the work identifies a concrete bottleneck in DP-SGD under standard worst-case f-DP assumptions, with the slow asymptotic vanishing of the bound as M → ∞ having direct practical implications. The geometric interpretation via κ and the experimental validation of utility loss are positive contributions, but the explicit suboptimality of the upper bound limits the strength of the impossibility claim.

major comments (2)

[Abstract and §3] Abstract and main theorem: the lower bound on κ is obtained by converting an explicitly suboptimal upper bound on the trade-off curve into a geometric separation. Because the paper states the upper bound is suboptimal without a matching lower bound or tightness verification for the shuffled mechanism, the induced κ threshold may be loose; the true curve could permit smaller κ (hence stronger privacy) at the same σ, so the claimed impossibility does not necessarily follow.
[§3] §3 (derivation): the analysis assumes the standard worst-case adversarial model in f-DP applies directly to single-epoch shuffled sampling. The paper should verify whether the dependencies introduced by shuffling weaken this worst-case bound, as this assumption is load-bearing for converting the trade-off upper bound into the stated σ-or-κ requirement.

minor comments (1)

[Abstract] The nested square-root expression for the κ bound in the abstract is difficult to parse at a glance; a parenthesized or multi-line rendering would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our paper. We address the major concerns below and have made revisions to clarify the scope of our results.

Referee: [Abstract and §3] Abstract and main theorem: the lower bound on κ is obtained by converting an explicitly suboptimal upper bound on the trade-off curve into a geometric separation. Because the paper states the upper bound is suboptimal without a matching lower bound or tightness verification for the shuffled mechanism, the induced κ threshold may be loose; the true curve could permit smaller κ (hence stronger privacy) at the same σ, so the claimed impossibility does not necessarily follow.

Authors: We acknowledge that the upper bound derived in the paper is explicitly suboptimal, and thus the resulting lower bound on κ is not necessarily tight. This provides a sufficient condition for the privacy-utility limitation but may overestimate the required separation. We have revised the abstract and Section 3 to more clearly state that the bound is conservative and that tighter analyses could potentially allow better trade-offs. Nevertheless, even this bound demonstrates substantial practical implications, as validated by the experiments showing accuracy degradation.

revision: partial

Referee: [§3] §3 (derivation): the analysis assumes the standard worst-case adversarial model in f-DP applies directly to single-epoch shuffled sampling. The paper should verify whether the dependencies introduced by shuffling weaken this worst-case bound, as this assumption is load-bearing for converting the trade-off upper bound into the stated σ-or-κ requirement.

Authors: The f-DP trade-off curve is inherently defined for the worst-case neighboring datasets, and our upper bound applies to the shuffled DP-SGD mechanism as a whole. Shuffling introduces dependencies, but these do not invalidate the worst-case analysis; the bound holds regardless of the sampling order because it is based on the overall distribution of the mechanism output. We have added a paragraph in §3 to elaborate on why the standard model applies and that dependencies from shuffling are accounted for in the mechanism definition. A complete characterization of the exact trade-off curve for shuffling is beyond the scope of this work but would be a valuable direction for future research.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained from f-DP definitions

The paper derives an explicit suboptimal upper bound on the achievable f-DP trade-off curve for single-epoch shuffled sampling directly from the hypothesis-testing characterization and worst-case adversarial model. This bound is then converted geometrically into a lower bound on separation κ, yielding the stated σ or κ threshold. No step reduces by construction to a fitted parameter renamed as prediction, a self-definition, or a load-bearing self-citation chain; the suboptimality is stated explicitly and the assumptions (standard f-DP model) are external. The result is therefore independent of its own outputs and receives score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper's claims rest on standard assumptions in the f-differential privacy literature and the definition of shuffled DP-SGD sampling; no new parameters are fitted or entities invented.

axioms (2)

domain assumption: The f-differential privacy framework accurately captures privacy via hypothesis-testing trade-offs. (Central to deriving the trade-off curve bound.)
domain assumption: Shuffled sampling over a single epoch with M updates follows the standard DP-SGD procedure. (Used for the analysis of the mechanism.)

pith-pipeline@v0.9.0 · 5615 in / 1384 out tokens · 23231 ms · 2026-05-16T13:33:21.893173+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Trade-off Functions for DP-SGD with Subsampling based on Random Shuffling: Tight Upper and Lower Bounds
cs.LG 2026-05 conditional novelty 7.0

Tight closed-form bounds via Berry-Esseen show DP-SGD with random shuffling achieves near-ideal privacy (trade-off close to 1-a) for σ ≥ √(3/ln M) and large M, with δ linear in epochs restricting E to O(√M) and an asy...

Reference graph

Works this paper leans on

72 extracted references · 72 canonical work pages · cited by 1 Pith paper · 3 internal anchors

[1]

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Mu...

work page 2015
[2]

Martín Abadi, Andy Chu, Ian J. Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Edgar R. Weippl, Stefan Katzenbeisser, Christopher Kruegel, Andrew C. Myers, and Shai Halevi, editors, Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, O...

work page doi:10.1145/2976749.2978318 2016
[3]

John M. Abowd. The U.S. Census Bureau adopts differential privacy. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’18, page 2867, New York, NY, USA, 2018. Association for Computing Machinery. ISBN 9781450355520. doi: 10.1145/3219819.3226070. URL https://doi.org/10.1145/3219819.3226070

work page doi:10.1145/3219819.3226070 2018
[4]

Large-scale differentially private bert, 2021

Rohan Anil, Badih Ghazi, Vineet Gupta, Ravi Kumar, and Pasin Manurangsi. Large-scale differentially private bert, 2021. URL https://arxiv.org/abs/2108.01624

work page arXiv 2021
[5]

Differential privacy has disparate impact on model accuracy

Eugene Bagdasaryan, Omid Poursaeed, and Vitaly Shmatikov. Differential privacy has disparate impact on model accuracy. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems...

work page 2019
[6]

Henri E. Bal, Dick H. J. Epema, Cees de Laat, Rob van Nieuwpoort, John W. Romein, Frank J. Seinstra, Cees Snoek, and Harry A. G. Wijshoff. A medium-scale distributed system for computer science research: Infrastructure for the long term. Computer, 49(5):54–63, 2016. doi: 10.1109/MC.2016.127. URL https://doi.org/10.1109/MC.2016.127

work page doi:10.1109/mc.2016.127 2016
[7]

Privacy amplification by subsampling: Tight analyses via couplings and divergences

Borja Balle, Gilles Barthe, and Marco Gaboardi. Privacy amplification by subsampling: Tight analyses via couplings and divergences. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett, editors, Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing S...

work page 2018
[8]

JAX-Privacy: Algorithms for privacy-preserving machine learning in jax, 2025

Borja Balle, Leonard Berrada, Zachary Charles, Christopher A Choquette-Choo, Soham De, Vadym Doroshenko, Dj Dvijotham, Andrew Galen, Arun Ganesh, Sahra Ghalebikesabi, Jamie Hayes, Peter Kairouz, Ryan McKenna, Brendan McMahan, Aneesh Pappu, Natalia Ponomareva, Mikhail Pravilov, Keith Rush, Samuel L Smith, and Robert Stanforth. JAX-Privacy: Algorithms for ...

work page 2025
[9]

Differentially private stochastic gradient descent with fixed-size minibatches: Tighter RDP guarantees with or without replacement

Jeremiah Birrell, Reza Ebrahimi, Rouzbeh Behnia, and Jason Pacheco. Differentially private stochastic gradient descent with fixed-size minibatches: Tighter RDP guarantees with or without replacement. In Amir Globersons, Lester Mackey, Danielle Belgrave, Angela Fan, Ulrich Paquet, Jakub M. Tomczak, and Cheng Zhang, editors, Advances in Neural Information ...

work page 2024
[11]

URL https://arxiv.org/abs/2105.07985

work page arXiv
[12]

JAX: composable transformations of Python+NumPy programs, 2018

James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/jax-ml/jax

work page 2018
[13]

Zhiqi Bu, Jinshuo Dong, Qi Long, and Weijie J. Su. Deep learning with gaussian differential privacy. CoRR, abs/1911.11607, 2019. URL http://arxiv.org/abs/1911.11607

work page arXiv 1911
[14]

Gs-wgan: A gradient-sanitized approach for learning differentially private generators, 2021

Dingfan Chen, Tribhuvanesh Orekondy, and Mario Fritz. Gs-wgan: A gradient-sanitized approach for learning differentially private generators, 2021. URL https://arxiv.org/abs/2006.08265

work page arXiv 2021
[15]

Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, and Chiyuan Zhang. How private are DP-SGD implementations? In Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, and Felix Berkenkamp, editors, Proceedings of the 41st International Conference on Machine Learning, volume 235 o...

work page
[16]

PMLR. URL https://proceedings.mlr.press/v235/chua24a.html

work page
[17]

Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, and Chiyuan Zhang. Scalable DP-SGD: shuffling vs. poisson subsampling. In Amir Globersons, Lester Mackey, Danielle Belgrave, Angela Fan, Ulrich Paquet, Jakub M. Tomczak, and Cheng Zhang, editors, Advances in Neural Information Processing Systems 38: Annual Conference on Neu...

work page 2024
[18]

Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, and Borja Balle. Unlocking high-accuracy differentially private image classification through scale, 2022. URL https://arxiv.org/abs/2204.13650

work page arXiv 2022
[19]

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20-25 June 2009, Miami, Florida, USA, pages 248–255, Miami, Florida, USA, 2009. IEEE Computer Society. doi: 10.1109/CVPR.2009.5206848. UR...

work page doi:10.1109/cvpr.2009.5206848 2009
[20]

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. In Jill Burstein, Christy Doran, and Thamar Solorio, editors, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAAC...

work page doi:10.18653/v1/n19-1423 2019
[21]

Collecting telemetry data privately,

Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin. Collecting telemetry data privately,

work page
[22]

URL https://arxiv.org/abs/1712.01524

work page internal anchor Pith review Pith/arXiv arXiv
[23]

Differentially private diffusion models, 2023

Tim Dockhorn, Tianshi Cao, Arash Vahdat, and Karsten Kreis. Differentially private diffusion models, 2023. URL https://arxiv.org/abs/2210.09929

work page arXiv 2023
[24]

Jinshuo Dong, Aaron Roth, and Weijie J. Su. Gaussian differential privacy. CoRR, abs/1905.02383, 2019. URL http://arxiv.org/abs/1905.02383

work page internal anchor Pith review Pith/arXiv arXiv 1905
[25]

An image is worth 16x16 words: Transformers for image recognition at scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In 9th International Conference on Learning Representations, ICLR 2021, V...

work page 2021
[26]

Cynthia Dwork. A firm foundation for private data analysis. Commun. ACM, 54(1):86–95, 2011. doi: 10.1145/1866739.1866758. URL https://doi.org/10.1145/1866739.1866758

work page doi:10.1145/1866739.1866758 2011
[27]

Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014. doi: 10.1561/0400000042. URL https://doi.org/10.1561/0400000042

work page doi:10.1561/0400000042 2014
[28]

Our data, ourselves: Privacy via distributed noise generation,

Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Serge Vaudenay, editor, Advances in Cryptology - EUROCRYPT 2006, 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, May 28 - June 1, 2006,...

work page doi:10.1007/11761679 2006
[29]

Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In Shai Halevi and Tal Rabin, editors, Theory of Cryptography, Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006, Proceedings, volume 3876 of Lecture Notes in Computer Science, pages 265–284, New York...

work page doi:10.1007/11681878 2006
[30]

Úlfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS ’14, page 1054–1067, New York, NY, USA, 2014. Association for Computing Machinery. ISBN 9781450329576. doi: 10.1145/2660267.2660348. U...

work page doi:10.1145/2660267.2660348 2014
[31]

Amplification by shuffling: From local to central differential privacy via anonymity

Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, and Abhradeep Thakurta. Amplification by shuffling: From local to central differential privacy via anonymity. In Timothy M. Chan, editor, Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, ...

work page doi:10.1137/1.9781611975482 2019
[32]

URL https://doi.org/10.1137/1.9781611975482.151

work page doi:10.1137/1.9781611975482.151
[33]

Vitaly Feldman, Audra McMillan, and Kunal Talwar. Hiding among the clones: A simple and nearly optimal analysis of privacy amplification by shuffling. In 62nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2021, Denver, CO, USA, February 7-10, 2022, pages 954–964, Denver, CO, USA, 2021. IEEE. doi: 10.1109/FOCS52979.2021.00096. URL https://d...

work page doi:10.1109/focs52979.2021.00096 2021
[34]

Stronger privacy amplification by shuffling for renyi and approximate differential privacy

Vitaly Feldman, Audra McMillan, and Kunal Talwar. Stronger privacy amplification by shuffling for renyi and approximate differential privacy. In Nikhil Bansal and Viswanath Nagarajan, editors,Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, Florence, Italy, January 22-25, 2023, pages 4966–4981, Florence, Italy, 2023. SIAM. doi...

work page doi:10.1137/1.9781611977554.ch181 2023
[35]

Mert Gürbüzbalaban, Asuman E. Ozdaglar, and Pablo A. Parrilo. Why random reshuffling beats stochastic gradient descent. Math. Program., 186(1):49–84, 2021. doi: 10.1007/S10107-019-01440-W. URL https://doi.org/10.1007/s10107-019-01440-w

work page doi:10.1007/s10107-019-01440-w 2021
[36]

Jeff Z. HaoChen and Suvrit Sra. Random shuffling beats SGD after finite epochs. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, volume 97 of Proceedings of Machine Learning Research, pages 2624–2633, Long Beach, California, USA,

work page 2019
[37]

PMLR. URL http://proceedings.mlr.press/v97/haochen19a.html

work page
[38]

Exploring the limits of differentially private deep learning with group-wise clipping, 2022

Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, and Jiang Bian. Exploring the limits of differentially private deep learning with group-wise clipping, 2022. URL https://arxiv.org/abs/2212.01539

work page arXiv 2022
[39]

Deep residual learning for image recognition,

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, pages 770–778, Las Vegas, NV, USA, 2016. IEEE Computer Society. doi: 10.1109/CVPR.2016.90. URL https://doi.org/10.1109/CVPR.2016.90

work page doi:10.1109/cvpr.2016.90 2016
[40]

Dp-nmt: Scalable differentially-private machine translation, 2024

Timour Igamberdiev, Doan Nam Long Vu, Felix Künnecke, Zhuo Yu, Jannik Holmer, and Ivan Habernal. Dp-nmt: Scalable differentially-private machine translation, 2024. URL https://arxiv.org/abs/2311.14465

work page arXiv 2024
[41]

Practical and private (deep) learning without sampling or shuffling

Peter Kairouz, Brendan Mcmahan, Shuang Song, Om Thakkar, Abhradeep Thakurta, and Zheng Xu. Practical and private (deep) learning without sampling or shuffling. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 5213–5225, Virtual, 18–24 ...

work page 2021
[42]

Computing tight differential privacy guarantees using FFT

Antti Koskela, Joonas Jälkö, and Antti Honkela. Computing tight differential privacy guarantees using FFT. In Silvia Chiappa and Roberto Calandra, editors, The 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020, 26-28 August 2020, Online [Palermo, Sicily, Italy], volume 108 of Proceedings of Machine Learning Research,...

work page 2020
[43]

Learning multiple layers of features from tiny images

Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, 2009. URL https://www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf

work page 2009
[44]

Toward training at imagenet scale with differential privacy

Alexey Kurakin, Steve Chien, Shuang Song, Roxana Geambasu, Andreas Terzis, and Abhradeep Thakurta. Toward training at imagenet scale with differential privacy. CoRR, abs/2201.12328:1–25, 2022. URL https://arxiv.org/abs/2201.12328

work page arXiv 2022
[45]

Datasets: A community library for natural language processing

Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, Abhishek Thakur, Patrick von Platen, Suraj Patil, Julien Chaumond, Mariama Drame, Julien Plu, Lewis Tunstall, Joe Davison, Mario Šaško, Gunjan Chhablani, Bhavitvya Malik, Simon Brandeis, Teven Le Scao, Victor Sanh, Canwen Xu, Nicolas Patry, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugge...

work page 2021
[46]

Hao Liang, Wanrong Zhang, Xinlei He, Kaishun Wu, and Hong Xing. An improved privacy and utility analysis of differentially private SGD with bounded domain and smooth losses. CoRR, abs/2502.17772:1–19, 2025. doi: 10.48550/ARXIV.2502.17772. URL https://doi.org/10.48550/arXiv.2502.17772

work page internal anchor Pith review doi:10.48550/arxiv 2025
[47]

Ryan McKenna, Yangsibo Huang, Amer Sinha, Borja Balle, Zachary Charles, Christopher A. Choquette-Choo, Badih Ghazi, George Kaissis, Ravi Kumar, Ruibo Liu, Da Yu, and Chiyuan Zhang. Scaling laws for differentially private language models, 2025. URL https://arxiv.org/abs/2501.18914

work page arXiv 2025
[48]

Sebastian Meiser and Esfandiar Mohammadi. Tight on budget?: Tight bounds for r-fold approximate differential privacy. In David Lie, Mohammad Mannan, Michael Backes, and XiaoFeng Wang, editors, Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, Toronto, ON, Canada, October 15-19, 2018, pages 247–264, Toronto, ON...

work page doi:10.1145/3243734.3243765 2018
[49]

Convergence analysis of distributed stochastic gradient descent with shuffling.Neurocomputing, 337:46–57, 2019

Qi Meng, Wei Chen, Yue Wang, Zhi-Ming Ma, and Tie-Yan Liu. Convergence analysis of distributed stochastic gradient descent with shuffling.Neurocomputing, 337:46–57, 2019. doi: 10.1016/J.NEUCOM.2019.01.037. URL https://doi.org/10.1016/j.neucom.2019.01. 037

work page doi:10.1016/j.neucom.2019.01.037 2019
[50]

URLhttp://dx.doi.org/10.1109/ CSF.2017.11

Ilya Mironov. R ´enyi differential privacy. In30th IEEE Computer Security Foundations Symposium, CSF 2017, Santa Barbara, CA, USA, August 21-25, 2017, pages 263–275, Barbara, CA, USA, 2017. IEEE Computer Society. doi: 10.1109/CSF.2017.11. URL https://doi. org/10.1109/CSF.2017.11

work page doi:10.1109/csf.2017.11 2017
[51]

arXiv preprint arXiv:1908.10530 (2019)

Ilya Mironov, Kunal Talwar, and Li Zhang. R´enyi differential privacy of the sampled gaussian mechanism.CoRR, abs/1908.10530, 2019. URLhttp://arxiv.org/abs/1908.10530

work page arXiv 1908
[52]

Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y . Ng. Reading digits in natural images with unsupervised feature learning. InNIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011, 2011. URL http://ufldl.stanford. edu/housenumbers/nips2011_housenumbers.pdf

work page 2011
[53]

Nguyen, Quoc Tran-Dinh, Dzung T

Lam M. Nguyen, Quoc Tran-Dinh, Dzung T. Phan, Phuong Ha Nguyen, and Marten van Dijk. A unified convergence analysis for shuffling-type gradient methods.J. Mach. Learn. Res., 22: 207:1–207:44, 2021. URLhttps://jmlr.org/papers/v22/20-1238.html

work page 2021
[54]

URLhttps://doi.org/10.1613/jair.1.14649

Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta. How to dp-fy ML: A practical guide to machine learning with differential privacy.J. Artif. Intell. Res., 77:1113– 1201, 2023. doi: 10.1613/JAIR.1.14649. URL https://doi.org/10.1613/jair.1.14649

work page doi:10.1613/jair.1.14649 2023
[55]

LAION-5B: an open large-scale dataset for training next generation image-text models

Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wight- man, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert 24 Kaczmarczyk, and Jenia Jitsev. LAION-5B: an open large-scale dataset for training next generation image-text mo...

work page 2022
[56]

Towards understanding the impact of model size on differential private classification.CoRR, abs/2111.13895:1–14, 2021

Yinchen Shen, Zhiguo Wang, Ruoyu Sun, and Xiaojing Shen. Towards understanding the impact of model size on differential private classification.CoRR, abs/2111.13895:1–14, 2021. URLhttps://arxiv.org/abs/2111.13895

work page arXiv 2021
[57]

Amer Sinha, Thomas Mesnard, Ryan McKenna, Daogao Liu, Christopher A. Choquette- Choo, Yangsibo Huang, Da Yu, George Kaissis, Zachary Charles, Ruibo Liu, Lynn Chua, Pritish Kamath, Pasin Manurangsi, Steve He, Chiyuan Zhang, Badih Ghazi, Borja De Balle Pigem, Prem Eruvbetine, Tris Warkentin, Armand Joulin, and Ravi Kumar. Vaultgemma: A differentially privat...

work page arXiv 2025
[58]

Sommer, Sebastian Meiser, and Esfandiar Mohammadi

David M. Sommer, Sebastian Meiser, and Esfandiar Mohammadi. Privacy loss classes: The cen- tral limit theorem in differential privacy.Proc. Priv. Enhancing Technol., 2019(2):245–269, 2019. doi: 10.2478/POPETS-2019-0029. URL https://doi.org/10.2478/popets-2019-0029

work page doi:10.2478/popets-2019-0029 2019
[59]

arXiv preprint arXiv:2401.04343 (2024)

Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, and Prateek Mittal. Private fine-tuning of large language models with zeroth-order optimization, 2025. URL https: //arxiv.org/abs/2401.04343

work page arXiv 2025
[60]

TensorFlow Datasets, a collection of ready-to-use datasets

TensorFlow. TensorFlow Datasets, a collection of ready-to-use datasets. https://www. tensorflow.org/datasets

work page
[61]

Oseledets

Nurislam Tursynbek, Aleksandr Petiushko, and Ivan V . Oseledets. Robustness threats of differential privacy.CoRR, abs/2012.07828:1–16, 2020. URL https://arxiv.org/abs/ 2012.07828

work page arXiv 2012
[62]

Gomez, Lukasz Kaiser, and Illia Polosukhin

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V . N. Vishwanathan, and Roman Garnett, editors,Advances in Neural Information Processing Systems 30: Annual Conference...

work page 2017
[63]

Chendi Wang, Buxin Su, Jiayuan Ye, Reza Shokri, and Weijie J. Su. Unified en- hancement of privacy bounds for mixture mechanisms via f-differential privacy. In Al- ice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, and Sergey Levine, editors,Advances in Neural Information Processing Systems 36: Annual Con- ference on Neural Information Pr...

work page 2023
[64]

PAC privacy: Automatic privacy measurement and control of data processing

Hanshen Xiao and Srinivas Devadas. PAC privacy: Automatic privacy measurement and control of data processing. In Helena Handschuh and Anna Lysyanskaya, editors,Advances in Cryptology - CRYPTO 2023 - 43rd Annual International Cryptology Conference, CRYPTO 2023, Santa Barbara, CA, USA, August 20-24, 2023, Proceedings, Part II, volume 14082 ofLecture Notes i...

work page doi:10.1007/978-3-031-38545-2_ 2023
[65]

Opacus: User-friendly differential privacy library in pytorch

Ashkan Yousefpour, Igor Shilov, Alexandre Sablayrolles, Davide Testuggine, Karthik Prasad, Mani Malek, John Nguyen, Sayan Ghosh, Akash Bharadwaj, Jessica Zhao, Graham Cormode, and Ilya Mironov. Opacus: User-friendly differential privacy library in pytorch.CoRR, abs/2109.12298, 2021. URLhttps://arxiv.org/abs/2109.12298. 25

work page arXiv 2021
[66]

Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, and Robert Sim

Xiang Yue, Huseyin A. Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, and Robert Sim. Synthetic text generation with differential privacy: A simple and practical recipe, 2023. URLhttps://arxiv.org/abs/2210.14348

work page arXiv 2023
[67]

Root mean square layer normalization

Biao Zhang and Rico Sennrich. Root mean square layer normalization. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alch ´e-Buc, Emily B. Fox, and Roman Garnett, editors,Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC...

work page 2019
[68]

Character-level convolutional net- works for text classification

Xiang Zhang, Junbo Jake Zhao, and Yann LeCun. Character-level convolutional net- works for text classification. In Corinna Cortes, Neil D. Lawrence, Daniel D. Lee, Masashi Sugiyama, and Roman Garnett, editors,Advances in Neural Information Pro- cessing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montre...

work page 2015
[69]

P ´erez, Marten van Dijk, and Lydia Y

Chaoyi Zhu, Jiayi Tang, Juan F. P ´erez, Marten van Dijk, and Lydia Y . Chen. DP-TLDM: differentially private tabular latent diffusion model. In Mila Dalla Preda, Sebastian Schrittwieser, Vincent Naessens, and Bjorn De Sutter, editors,Availability, Reliability and Security - 20th International Conference, ARES 2025, Ghent, Belgium, August 11-14, 2025, Pro...

work page 2025
[70]

doi: 10.1007/978-3-032-00624-0 \ 17

Springer. doi: 10.1007/978-3-032-00624-0 \ 17. URL https://doi.org/10.1007/ 978-3-032-00624-0_17

work page doi:10.1007/978-3-032-00624-0
[71]

Poission subsampled r´enyi differential privacy

Yuqing Zhu and Yu-Xiang Wang. Poission subsampled r´enyi differential privacy. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors,Proceedings of the 36th International Conference on Machine Learning, volume 97 ofProceedings of Machine Learning Research, pages 7634– 7642, ””, 09–15 Jun 2019. PMLR. URL https://proceedings.mlr.press/v97/zhu19c. html

work page 2019
[72]

Optimal accounting of differential privacy via characteristic function

Yuqing Zhu, Jinshuo Dong, and Yu-Xiang Wang. Optimal accounting of differential privacy via characteristic function. In Gustau Camps-Valls, Francisco J. R. Ruiz, and Isabel Valera, editors,International Conference on Artificial Intelligence and Statistics, AISTATS 2022, 28-30 March 2022, Virtual Event, volume 151 ofProceedings of Machine Learning Research...

work page 2022
[73]

Figures 3 and 4 visualize the implications of Lemma F.1 and its asymptotic instantiation for the separation metric as the number of rounds per epoch M increases. Figure 3 plots the fully explicit lower bound obtained by combining the Gaussian tail bound in Eq. (48) with the explicit lower bound on the µ-GDP parameter derived in Eqs. (51)–(52) under the no...

work page

[1] [1]

Mart´ın Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Man ´e, Rajat Monga, Sherry Moore, Derek Mu...

work page 2015

[2] [2]

Goodfellow, H

Mart´ın Abadi, Andy Chu, Ian J. Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Edgar R. Weippl, Stefan Katzenbeisser, Christopher Kruegel, Andrew C. Myers, and Shai Halevi, editors,Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, O...

work page doi:10.1145/2976749.2978318 2016

[3] [3]

John M. Abowd. The u.s. census bureau adopts differential privacy. InProceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Min- ing, KDD ’18, page 2867, New York, NY , USA, 2018. Association for Computing Machinery. ISBN 9781450355520. doi: 10.1145/3219819.3226070. URL https://doi.org/10.1145/ 3219819.3226070

work page doi:10.1145/3219819.3226070 2018

[4] [4]

Large-scale differentially private bert, 2021

Rohan Anil, Badih Ghazi, Vineet Gupta, Ravi Kumar, and Pasin Manurangsi. Large-scale differentially private bert, 2021. URLhttps://arxiv.org/abs/2108.01624

work page arXiv 2021

[5] [5]

Differential privacy has disparate impact on model accuracy

Eugene Bagdasaryan, Omid Poursaeed, and Vitaly Shmatikov. Differential privacy has disparate impact on model accuracy. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Flo- rence d’Alch´e-Buc, Emily B. Fox, and Roman Garnett, editors,Advances in Neural Information 20 Processing Systems 32: Annual Conference on Neural Information Processing Systems...

work page 2019

[6] [6]

Bal, Dick H

Henri E. Bal, Dick H. J. Epema, Cees de Laat, Rob van Nieuwpoort, John W. Romein, Frank J. Seinstra, Cees Snoek, and Harry A. G. Wijshoff. A medium-scale distributed system for computer science research: Infrastructure for the long term.Computer, 49(5):54–63, 2016. doi: 10.1109/MC.2016.127. URLhttps://doi.org/10.1109/MC.2016.127

work page doi:10.1109/mc.2016.127 2016

[7] [7]

Privacy amplification by subsampling: Tight analyses via couplings and divergences

Borja Balle, Gilles Barthe, and Marco Gaboardi. Privacy amplification by subsampling: Tight analyses via couplings and divergences. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicol`o Cesa-Bianchi, and Roman Garnett, editors,Advances in Neural Infor- mation Processing Systems 31: Annual Conference on Neural Information Processing S...

work page 2018

[8] [8]

JAX- Privacy: Algorithms for privacy-preserving machine learning in jax, 2025

Borja Balle, Leonard Berrada, Zachary Charles, Christopher A Choquette-Choo, Soham De, Vadym Doroshenko, Dj Dvijotham, Andrew Galen, Arun Ganesh, Sahra Ghalebikesabi, Jamie Hayes, Peter Kairouz, Ryan McKenna, Brendan McMahan, Aneesh Pappu, Natalia Ponomareva, Mikhail Pravilov, Keith Rush, Samuel L Smith, and Robert Stanforth. JAX- Privacy: Algorithms for ...

work page 2025

[9] [9]

Differentially pri- vate stochastic gradient descent with fixed-size minibatches: Tighter RDP guarantees with or without replacement

Jeremiah Birrell, Reza Ebrahimi, Rouzbeh Behnia, and Jason Pacheco. Differentially pri- vate stochastic gradient descent with fixed-size minibatches: Tighter RDP guarantees with or without replacement. In Amir Globersons, Lester Mackey, Danielle Belgrave, Angela Fan, Ulrich Paquet, Jakub M. Tomczak, and Cheng Zhang, editors,Advances in Neural Information ...

work page 2024

[10] [11]

URLhttps://arxiv.org/abs/2105.07985

work page arXiv

[11] [12]

JAX: composable transformations of Python+NumPy programs, 2018

James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/jax-ml/jax

work page 2018

[12] [13]

Zhiqi Bu, Jinshuo Dong, Qi Long, and Weijie J. Su. Deep learning with gaussian differential privacy.CoRR, abs/1911.11607, 2019. URLhttp://arxiv.org/abs/1911.11607

work page arXiv 1911

[13] [14]

Gs-wgan: A gradient-sanitized approach for learning differentially private generators, 2021

Dingfan Chen, Tribhuvanesh Orekondy, and Mario Fritz. Gs-wgan: A gradient-sanitized approach for learning differentially private generators, 2021. URL https://arxiv.org/abs/ 2006.08265

work page arXiv 2021

[14] [15]

Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, and Chiyuan Zhang. How private are DP-SGD implementations? In Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, and Felix Berkenkamp, editors,Proceedings of the 41st International Conference on Machine Learning, volume 235 o...

work page

[15] [16]

URLhttps://proceedings.mlr.press/v235/chua24a.html

PMLR. URLhttps://proceedings.mlr.press/v235/chua24a.html

work page

[16] [17]

Scalable DP-SGD: shuffling vs

Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, and Chiyuan Zhang. Scalable DP-SGD: shuffling vs. poisson subsampling. In Amir Globersons, Lester Mackey, Danielle Belgrave, Angela Fan, Ulrich Paquet, Jakub M. Tom- czak, and Cheng Zhang, editors,Advances in Neural Information Processing Systems 38: Annual Conference on Neu...

work page 2024

[17] [18]

Smith, and Borja Balle

Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, and Borja Balle. Unlocking high-accuracy differentially private image classification through scale, 2022. URL https: //arxiv.org/abs/2204.13650

work page arXiv 2022

[18] [19]

, author Dong, W

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 20-25 June 2009, Miami, Florida, USA, pages 248–255, Miami, FLorida, USA, 2009. IEEE Computer Society. doi: 10.1109/CVPR.2009.5206848. UR...

work page doi:10.1109/cvpr.2009.5206848 2009

[19] [20]

BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. In Jill Burstein, Christy Doran, and Thamar Solorio, editors,Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAAC...

work page doi:10.18653/v1/n19-1423 2019

[20] [21]

Collecting telemetry data privately,

Bolin Ding, Janardhan Kulkarni, and Sergey Yekhanin. Collecting telemetry data privately,

work page

[21] [22]

URLhttps://arxiv.org/abs/1712.01524

work page internal anchor Pith review Pith/arXiv arXiv

[22] [23]

Differentially private diffusion models, 2023

Tim Dockhorn, Tianshi Cao, Arash Vahdat, and Karsten Kreis. Differentially private diffusion models, 2023. URLhttps://arxiv.org/abs/2210.09929

work page arXiv 2023

[23] [24]

Jinshuo Dong, Aaron Roth, and Weijie J. Su. Gaussian differential privacy.CoRR, abs/1905.02383, 2019. URLhttp://arxiv.org/abs/1905.02383

work page internal anchor Pith review Pith/arXiv arXiv 1905

[24] [25]

An image is worth 16x16 words: Transformers for image recognition at scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In9th International Conference on Learning Representations, ICLR 2021, V...

work page 2021

[25] [26]

A firm foundation for private data analysis.Commun

Cynthia Dwork. A firm foundation for private data analysis.Commun. ACM, 54(1):86–95, 2011. doi: 10.1145/1866739.1866758. URLhttps://doi.org/10.1145/1866739.1866758

work page doi:10.1145/1866739.1866758 2011

[26] [27]

2014.The Algorithmic Foundations of Differential Privacy

Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy.Found. Trends Theor. Comput. Sci., 9(3-4):211–407, 2014. doi: 10.1561/0400000042. URL https: //doi.org/10.1561/0400000042

work page doi:10.1561/0400000042 2014

[27] [28]

Our data, ourselves: Privacy via distributed noise generation,

Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Serge Vaudenay, editor, Advances in Cryptology - EUROCRYPT 2006, 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, May 28 - June 1, 2006,...

work page doi:10.1007/11761679 2006

[28] [29]

Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith. Calibrating noise to sensitivity in private data analysis. In Shai Halevi and Tal Rabin, editors,Theory of Cryptography, Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006, Proceedings, volume 3876 ofLecture Notes in Computer Science, pages 265–284, New York...

work page doi:10.1007/11681878 2006

[29] [30]

The web never forgets: Persistent tracking mechanisms in the wild,

´Ulfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. Rappor: Randomized aggregatable privacy-preserving ordinal response. InProceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS ’14, page 1054–1067, New York, NY , USA, 2014. 22 Association for Computing Machinery. ISBN 9781450329576. doi: 10.1145/2660267.2660348. U...

work page doi:10.1145/2660267.2660348 2014

[30] [31]

Amplification by shuffling: From local to central differential privacy via anonymity

´Ulfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, and Abhradeep Thakurta. Amplification by shuffling: From local to central differential privacy via anonymity. In Timothy M. Chan, editor,Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, ...

work page doi:10.1137/1.9781611975482 2019

[31] [32]

URLhttps://doi.org/10.1137/1.9781611975482.151

work page doi:10.1137/1.9781611975482.151

[32] [33]

Korhonen, A single-exponential time 2-approximation algorithm for treewidth, in: IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS 2021), 2022, pp

Vitaly Feldman, Audra McMillan, and Kunal Talwar. Hiding among the clones: A simple and nearly optimal analysis of privacy amplification by shuffling. In62nd IEEE Annual Symposium on Foundations of Computer Science, FOCS 2021, Denver, CO, USA, February 7-10, 2022, pages 954–964, Denver, CO, USA, 2021. IEEE. doi: 10.1109/FOCS52979.2021.00096. URL https://d...

work page doi:10.1109/focs52979.2021.00096 2021

[33] [34]

Stronger privacy amplification by shuffling for renyi and approximate differential privacy

Vitaly Feldman, Audra McMillan, and Kunal Talwar. Stronger privacy amplification by shuffling for renyi and approximate differential privacy. In Nikhil Bansal and Viswanath Nagarajan, editors,Proceedings of the 2023 ACM-SIAM Symposium on Discrete Algorithms, SODA 2023, Florence, Italy, January 22-25, 2023, pages 4966–4981, Florence, Italy, 2023. SIAM. doi...

work page doi:10.1137/1.9781611977554.ch181 2023

[34] [35]

Ozdaglar, and Pablo A

Mert G ¨urb¨uzbalaban, Asuman E. Ozdaglar, and Pablo A. Parrilo. Why random reshuffling beats stochastic gradient descent.Math. Program., 186(1):49–84, 2021. doi: 10.1007/ S10107-019-01440-W. URLhttps://doi.org/10.1007/s10107-019-01440-w

work page doi:10.1007/s10107-019-01440-w 2021

[35] [36]

HaoChen and Suvrit Sra

Jeff Z. HaoChen and Suvrit Sra. Random shuffling beats SGD after finite epochs. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors,Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, volume 97 of Proceedings of Machine Learning Research, pages 2624–2633, Long Beach, California, USA,

work page 2019

[36] [37]

URLhttp://proceedings.mlr.press/v97/haochen19a.html

PMLR. URLhttp://proceedings.mlr.press/v97/haochen19a.html

work page

[37] [38]

Exploring the limits of differentially private deep learning with group-wise clipping, 2022

Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, and Jiang Bian. Exploring the limits of differentially private deep learning with group-wise clipping, 2022. URLhttps://arxiv.org/abs/2212.01539

work page arXiv 2022

[38] [39]

Deep residual learning for image recognition,

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV , USA, June 27-30, 2016, pages 770–778, Las Vegas, NV , USA, 2016. IEEE Computer Society. doi: 10.1109/CVPR.2016.90. URL https://doi.org/10.1109/CVPR. 2016.90

work page doi:10.1109/cvpr.2016.90 2016

[39] [40]

Dp-nmt: Scalable differentially-private machine translation, 2024

Timour Igamberdiev, Doan Nam Long Vu, Felix K ¨unnecke, Zhuo Yu, Jannik Holmer, and Ivan Habernal. Dp-nmt: Scalable differentially-private machine translation, 2024. URL https://arxiv.org/abs/2311.14465

work page arXiv 2024

[40] [41]

Practical and private (deep) learning without sampling or shuffling

Peter Kairouz, Brendan Mcmahan, Shuang Song, Om Thakkar, Abhradeep Thakurta, and Zheng Xu. Practical and private (deep) learning without sampling or shuffling. In Marina Meila and Tong Zhang, editors,Proceedings of the 38th International Conference on Machine Learning, volume 139 ofProceedings of Machine Learning Research, pages 5213–5225, Virtual, 18–24 ...

work page 2021

[41] [42]

Computing tight differential privacy guar- antees using FFT

Antti Koskela, Joonas J ¨alk¨o, and Antti Honkela. Computing tight differential privacy guar- antees using FFT. In Silvia Chiappa and Roberto Calandra, editors,The 23rd International Conference on Artificial Intelligence and Statistics, AISTATS 2020, 26-28 August 2020, Online [Palermo, Sicily, Italy], volume 108 ofProceedings of Machine Learning Research,...

work page 2020

[42] [43]

Learning multiple layers of features from tiny images

Alex Krizhevsky. Learning multiple layers of features from tiny images. Technical report, 2009. URLhttps://www.cs.utoronto.ca/ ~kriz/learning-features-2009-TR.pdf. 23

work page 2009

[43] [44]

Toward training at imagenet scale with differential privacy

Alexey Kurakin, Steve Chien, Shuang Song, Roxana Geambasu, Andreas Terzis, and Abhradeep Thakurta. Toward training at imagenet scale with differential privacy.CoRR, abs/2201.12328: 1–25, 2022. URLhttps://arxiv.org/abs/2201.12328

work page arXiv 2022

[44] [45]

Datasets: A community library for natural language processing

Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, Abhishek Thakur, Patrick von Platen, Suraj Patil, Julien Chaumond, Mariama Drame, Julien Plu, Lewis Tunstall, Joe Davison, Mario ˇSaˇsko, Gunjan Chhablani, Bhavitvya Malik, Simon Brandeis, Teven Le Scao, Victor Sanh, Canwen Xu, Nicolas Patry, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugge...

work page 2021

[45] [46]

Dickerson

Hao Liang, Wanrong Zhang, Xinlei He, Kaishun Wu, and Hong Xing. An improved privacy and utility analysis of differentially private SGD with bounded domain and smooth losses.CoRR, abs/2502.17772:1–19, 2025. doi: 10.48550/ARXIV .2502.17772. URL https://doi.org/10. 48550/arXiv.2502.17772

work page internal anchor Pith review doi:10.48550/arxiv 2025

[46] [47]

Choquette-Choo, Badih Ghazi, George Kaissis, Ravi Kumar, Ruibo Liu, Da Yu, and Chiyuan Zhang

Ryan McKenna, Yangsibo Huang, Amer Sinha, Borja Balle, Zachary Charles, Christopher A. Choquette-Choo, Badih Ghazi, George Kaissis, Ravi Kumar, Ruibo Liu, Da Yu, and Chiyuan Zhang. Scaling laws for differentially private language models, 2025. URL https://arxiv. org/abs/2501.18914

work page arXiv 2025

[47] [48]

On the accuracy of password strength meters,

Sebastian Meiser and Esfandiar Mohammadi. Tight on budget?: Tight bounds for r-fold approximate differential privacy. In David Lie, Mohammad Mannan, Michael Backes, and XiaoFeng Wang, editors,Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018, Toronto, ON, Canada, October 15-19, 2018, pages 247–264, Toronto, ON...

work page doi:10.1145/3243734.3243765 2018

[48] [49]

Convergence analysis of distributed stochastic gradient descent with shuffling.Neurocomputing, 337:46–57, 2019

Qi Meng, Wei Chen, Yue Wang, Zhi-Ming Ma, and Tie-Yan Liu. Convergence analysis of distributed stochastic gradient descent with shuffling.Neurocomputing, 337:46–57, 2019. doi: 10.1016/J.NEUCOM.2019.01.037. URL https://doi.org/10.1016/j.neucom.2019.01. 037

work page doi:10.1016/j.neucom.2019.01.037 2019

[49] [50]

URLhttp://dx.doi.org/10.1109/ CSF.2017.11

Ilya Mironov. R ´enyi differential privacy. In30th IEEE Computer Security Foundations Symposium, CSF 2017, Santa Barbara, CA, USA, August 21-25, 2017, pages 263–275, Barbara, CA, USA, 2017. IEEE Computer Society. doi: 10.1109/CSF.2017.11. URL https://doi. org/10.1109/CSF.2017.11

work page doi:10.1109/csf.2017.11 2017

[50] [51]

arXiv preprint arXiv:1908.10530 (2019)

Ilya Mironov, Kunal Talwar, and Li Zhang. R´enyi differential privacy of the sampled gaussian mechanism.CoRR, abs/1908.10530, 2019. URLhttp://arxiv.org/abs/1908.10530

work page arXiv 1908

[51] [52]

Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y . Ng. Reading digits in natural images with unsupervised feature learning. InNIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011, 2011. URL http://ufldl.stanford. edu/housenumbers/nips2011_housenumbers.pdf

work page 2011

[52] [53]

Nguyen, Quoc Tran-Dinh, Dzung T

Lam M. Nguyen, Quoc Tran-Dinh, Dzung T. Phan, Phuong Ha Nguyen, and Marten van Dijk. A unified convergence analysis for shuffling-type gradient methods.J. Mach. Learn. Res., 22: 207:1–207:44, 2021. URLhttps://jmlr.org/papers/v22/20-1238.html

work page 2021

[53] [54]

URLhttps://doi.org/10.1613/jair.1.14649

Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, and Abhradeep Guha Thakurta. How to dp-fy ML: A practical guide to machine learning with differential privacy.J. Artif. Intell. Res., 77:1113– 1201, 2023. doi: 10.1613/JAIR.1.14649. URL https://doi.org/10.1613/jair.1.14649

work page doi:10.1613/jair.1.14649 2023

[54] [55]

LAION-5B: an open large-scale dataset for training next generation image-text models

Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wight- man, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert 24 Kaczmarczyk, and Jenia Jitsev. LAION-5B: an open large-scale dataset for training next generation image-text mo...

work page 2022

[55] [56]

Towards understanding the impact of model size on differential private classification.CoRR, abs/2111.13895:1–14, 2021

Yinchen Shen, Zhiguo Wang, Ruoyu Sun, and Xiaojing Shen. Towards understanding the impact of model size on differential private classification.CoRR, abs/2111.13895:1–14, 2021. URLhttps://arxiv.org/abs/2111.13895

work page arXiv 2021

[56] [57]

Amer Sinha, Thomas Mesnard, Ryan McKenna, Daogao Liu, Christopher A. Choquette- Choo, Yangsibo Huang, Da Yu, George Kaissis, Zachary Charles, Ruibo Liu, Lynn Chua, Pritish Kamath, Pasin Manurangsi, Steve He, Chiyuan Zhang, Badih Ghazi, Borja De Balle Pigem, Prem Eruvbetine, Tris Warkentin, Armand Joulin, and Ravi Kumar. Vaultgemma: A differentially privat...

work page arXiv 2025

[57] [58]

Sommer, Sebastian Meiser, and Esfandiar Mohammadi

David M. Sommer, Sebastian Meiser, and Esfandiar Mohammadi. Privacy loss classes: The cen- tral limit theorem in differential privacy.Proc. Priv. Enhancing Technol., 2019(2):245–269, 2019. doi: 10.2478/POPETS-2019-0029. URL https://doi.org/10.2478/popets-2019-0029

work page doi:10.2478/popets-2019-0029 2019

[58] [59]

arXiv preprint arXiv:2401.04343 (2024)

Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, and Prateek Mittal. Private fine-tuning of large language models with zeroth-order optimization, 2025. URL https: //arxiv.org/abs/2401.04343

work page arXiv 2025

[59] [60]

TensorFlow Datasets, a collection of ready-to-use datasets

TensorFlow. TensorFlow Datasets, a collection of ready-to-use datasets. https://www. tensorflow.org/datasets

work page

[60] [61]

Oseledets

Nurislam Tursynbek, Aleksandr Petiushko, and Ivan V . Oseledets. Robustness threats of differential privacy.CoRR, abs/2012.07828:1–16, 2020. URL https://arxiv.org/abs/ 2012.07828

work page arXiv 2012

[61] [62]

Gomez, Lukasz Kaiser, and Illia Polosukhin

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V . N. Vishwanathan, and Roman Garnett, editors,Advances in Neural Information Processing Systems 30: Annual Conference...

work page 2017

[62] [63]

Chendi Wang, Buxin Su, Jiayuan Ye, Reza Shokri, and Weijie J. Su. Unified en- hancement of privacy bounds for mixture mechanisms via f-differential privacy. In Al- ice Oh, Tristan Naumann, Amir Globerson, Kate Saenko, Moritz Hardt, and Sergey Levine, editors,Advances in Neural Information Processing Systems 36: Annual Con- ference on Neural Information Pr...

work page 2023

[63] [64]

PAC privacy: Automatic privacy measurement and control of data processing

Hanshen Xiao and Srinivas Devadas. PAC privacy: Automatic privacy measurement and control of data processing. In Helena Handschuh and Anna Lysyanskaya, editors,Advances in Cryptology - CRYPTO 2023 - 43rd Annual International Cryptology Conference, CRYPTO 2023, Santa Barbara, CA, USA, August 20-24, 2023, Proceedings, Part II, volume 14082 ofLecture Notes i...

work page doi:10.1007/978-3-031-38545-2_ 2023

[64] [65]

Opacus: User-friendly differential privacy library in pytorch

Ashkan Yousefpour, Igor Shilov, Alexandre Sablayrolles, Davide Testuggine, Karthik Prasad, Mani Malek, John Nguyen, Sayan Ghosh, Akash Bharadwaj, Jessica Zhao, Graham Cormode, and Ilya Mironov. Opacus: User-friendly differential privacy library in pytorch.CoRR, abs/2109.12298, 2021. URLhttps://arxiv.org/abs/2109.12298. 25


[65] [66]


Xiang Yue, Huseyin A. Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan, and Robert Sim. Synthetic text generation with differential privacy: A simple and practical recipe, 2023. URL https://arxiv.org/abs/2210.14348


[66] [67]


Biao Zhang and Rico Sennrich. Root mean square layer normalization. In Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d’Alché-Buc, Emily B. Fox, and Roman Garnett, editors, Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC...


[67] [68]


Xiang Zhang, Junbo Jake Zhao, and Yann LeCun. Character-level convolutional networks for text classification. In Corinna Cortes, Neil D. Lawrence, Daniel D. Lee, Masashi Sugiyama, and Roman Garnett, editors, Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montre...


[68] [69]


Chaoyi Zhu, Jiayi Tang, Juan F. Pérez, Marten van Dijk, and Lydia Y. Chen. DP-TLDM: differentially private tabular latent diffusion model. In Mila Dalla Preda, Sebastian Schrittwieser, Vincent Naessens, and Bjorn De Sutter, editors, Availability, Reliability and Security - 20th International Conference, ARES 2025, Ghent, Belgium, August 11-14, 2025, Pro...


[69] [70]


Springer. doi: 10.1007/978-3-032-00624-0_17. URL https://doi.org/10.1007/978-3-032-00624-0_17


[70] [71]


Yuqing Zhu and Yu-Xiang Wang. Poission subsampled Rényi differential privacy. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 7634–7642, 09–15 Jun 2019. PMLR. URL https://proceedings.mlr.press/v97/zhu19c.html


[71] [72]


Yuqing Zhu, Jinshuo Dong, and Yu-Xiang Wang. Optimal accounting of differential privacy via characteristic function. In Gustau Camps-Valls, Francisco J. R. Ruiz, and Isabel Valera, editors, International Conference on Artificial Intelligence and Statistics, AISTATS 2022, 28-30 March 2022, Virtual Event, volume 151 of Proceedings of Machine Learning Research...


[72] [73]


Figures 3 and 4 visualize the implications of Lemma F.1 and its asymptotic instantiation for the separation metric as the number of rounds per epoch M increases. Figure 3 plots the fully explicit lower bound obtained by combining the Gaussian tail bound in Eq. (48) with the explicit lower bound on the µ-GDP parameter derived in Eqs. (51)–(52) under the no...
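To get a feel for how the separation bound behaves as M grows, the following minimal sketch evaluates the two sides of the dichotomy stated in the paper's core claim: the noise-multiplier threshold 1/√(2 ln M) and the separation bound (1/√8)(1 − 1/√(4π ln M)). It uses only those two closed-form expressions and does not reproduce the fully explicit bound from Eqs. (48) and (51)–(52) plotted in Figure 3. The function names `sigma_threshold` and `kappa_lower_bound` are hypothetical and chosen for illustration.

```python
import math


def sigma_threshold(M: int) -> float:
    """Noise-multiplier threshold 1 / sqrt(2 ln M) from the stated dichotomy.

    Hypothetical helper; M is the number of gradient updates per epoch.
    """
    if M < 2:
        raise ValueError("M must be at least 2 so that ln M > 0")
    return 1.0 / math.sqrt(2.0 * math.log(M))


def kappa_lower_bound(M: int) -> float:
    """Separation bound (1/sqrt(8)) * (1 - 1/sqrt(4 pi ln M)).

    Hypothetical helper; per the stated dichotomy this bound on kappa
    applies when sigma is below sigma_threshold(M).
    """
    if M < 2:
        raise ValueError("M must be at least 2 so that ln M > 0")
    correction = 1.0 / math.sqrt(4.0 * math.pi * math.log(M))
    return (1.0 / math.sqrt(8.0)) * (1.0 - correction)


if __name__ == "__main__":
    # Tabulate both quantities over a few illustrative values of M.
    for M in (10, 100, 1_000, 10_000, 100_000):
        print(
            f"M={M:>7d}  "
            f"sigma threshold={sigma_threshold(M):.4f}  "
            f"kappa bound={kappa_lower_bound(M):.4f}"
        )
```

Both quantities depend on M only through ln M, so the threshold on σ shrinks slowly and the bound on κ approaches its limit 1/√8 ≈ 0.354 slowly as M increases.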
