Revisiting Privacy Amplification by Subsampling in Selective Release DPSGD
Pith reviewed 2026-06-28 06:54 UTC · model grok-4.3
The pith
Correcting the privacy analysis for selective release in DPSGD produces rigorous guarantees and high model utility.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The selective release mechanism alters the per-step sampling probability in a way that prior accounting ignored. DPSR-CG grounds selective release in clipped gradients and applies a fresh privacy analysis that incorporates this variation. On that basis it satisfies strict differential privacy while preserving substantially more model utility than either vanilla DPSGD or the earlier DPSUR algorithm.
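A toy simulation can make the sampling-probability shift concrete. The sketch below is not the paper's algorithm: the function name `simulate_release_bias`, the scalar "gradient norms", the release threshold, and the noise level are all illustrative assumptions. It Poisson-samples records at a fixed rate `q`, releases a step only when a noisy sum of clipped norms crosses a threshold, and reports how often one large-gradient record appears among the released steps.

```python
import random

def simulate_release_bias(q=0.2, n=20, clip=1.0, threshold=1.0,
                          noise_std=0.5, trials=20000, seed=0):
    """Estimate P(record 0 in batch | step released) for a toy release rule.

    All parameters are hypothetical; record 0 has a large norm, the rest small.
    """
    rng = random.Random(seed)
    norms = [5.0] + [0.1] * (n - 1)
    released = 0
    released_with_target = 0
    for _ in range(trials):
        # Poisson subsampling: each record joins the batch independently with prob. q.
        batch = [i for i in range(n) if rng.random() < q]
        # Sum of clipped norms, a scalar stand-in for the clipped gradient sum.
        total = sum(min(norms[i], clip) for i in batch)
        noisy_total = total + rng.gauss(0.0, noise_std)
        # Data-dependent release test (assumed form, for illustration only).
        if noisy_total >= threshold:
            released += 1
            if 0 in batch:
                released_with_target += 1
    if released == 0:
        return float("nan")
    return released_with_target / released

if __name__ == "__main__":
    print("nominal rate q = 0.2, conditional rate =", simulate_release_bias())
```

Under these assumed settings the conditional frequency comes out well above the nominal rate q = 0.2. That is the kind of gap an accountant assuming a fixed q would miss.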
What carries the argument
DPSR-CG mechanism, which triggers release only on sufficiently large clipped gradients and uses a privacy bound that explicitly tracks the resulting change in sampling probability.
If this is right
- DPSR-CG satisfies strict differential privacy where DPSUR's accounting was incomplete.
- Model accuracy and convergence speed improve on image and text classification tasks without relaxing the privacy target.
- Gradient clipping becomes the explicit trigger for both noise addition and selective release.
- The corrected analysis restores the validity of privacy amplification by subsampling in this adaptive setting.
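For reference, the classical amplification-by-subsampling bound behind the last point fits in a few lines. This is the textbook bound, not the corrected bound derived in the paper. It assumes Poisson subsampling at a fixed, data-independent rate `q` and add/remove adjacency, and the function name is hypothetical.

```python
import math

def amplified_guarantee(eps, delta, q):
    """Classical amplification by Poisson subsampling (fixed rate q).

    If a mechanism is (eps, delta)-DP, running it on a Poisson subsample
    with rate q is (log(1 + q * (exp(eps) - 1)), q * delta)-DP.
    This sketch does NOT cover data-dependent sampling probabilities.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    eps_amplified = math.log1p(q * math.expm1(eps))
    return eps_amplified, q * delta

if __name__ == "__main__":
    print(amplified_guarantee(eps=1.0, delta=1e-5, q=0.01))
```

The bound treats `q` as a constant known in advance, which is the assumption a data-dependent release step calls into question.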
Where Pith is reading between the lines
- Similar sampling-probability shifts may appear in other adaptive or selective private-training schemes and would require parallel re-analysis.
- If the new bound is tight, it could guide the design of release thresholds that trade privacy loss against utility more precisely.
- The approach suggests that utility gains in private SGD often come from reducing the number of noisy steps rather than from reducing noise magnitude alone.
Load-bearing premise
The new privacy analysis correctly captures every effect the selective release step has on the overall sampling distribution.
What would settle it
An empirical privacy audit or membership-inference attack on models trained with DPSR-CG that either violates the claimed epsilon bound or shows no utility gain over DPSUR under identical privacy parameters.
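One common way to turn attack results into an empirical privacy statement uses the hypothesis-testing view of differential privacy. Any attack against an (ε,δ)-DP mechanism must satisfy TPR ≤ e^ε · FPR + δ, and the symmetric inequality for the complementary error rates. The sketch below converts observed rates into a point-estimate lower bound on ε. The function name is hypothetical, and a real audit would replace the raw rates with confidence bounds (for example Clopper-Pearson intervals), which this sketch omits.

```python
import math

def empirical_epsilon_lower_bound(tpr, fpr, delta=0.0):
    """Point-estimate lower bound on epsilon implied by an attack's TPR and FPR.

    Uses TPR <= exp(eps) * FPR + delta and
         (1 - FPR) <= exp(eps) * (1 - TPR) + delta.
    No statistical confidence is attached to the result.
    """
    candidates = [0.0]
    if tpr - delta > 0:
        candidates.append(math.inf if fpr == 0 else math.log((tpr - delta) / fpr))
    if 1 - fpr - delta > 0:
        fnr = 1 - tpr
        candidates.append(math.inf if fnr == 0 else math.log((1 - fpr - delta) / fnr))
    return max(candidates)

if __name__ == "__main__":
    print(empirical_epsilon_lower_bound(tpr=0.5, fpr=0.1, delta=1e-5))
```

If the lower bound computed this way exceeded the ε claimed for DPSR-CG, with statistical confidence, that would contradict the guarantee.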
Original abstract
Machine learning's reliance on sensitive data necessitates privacy-preserving techniques like Differentially Private Stochastic Gradient Descent (DPSGD). However, DPSGD suffers from substantial utility degradation and slow convergence due to gradient clipping and noise injection. Prior works have attempted to improve DPSGD from various perspectives; notably, the Differentially Private Selective Update and Release (DPSUR) algorithm has achieved remarkable model utility. However, the privacy accounting in DPSUR overlooks the variation in sampling probability introduced by the selective release mechanism, which compromises the rigor of its privacy guarantees. To address these limitations, we re-evaluate the privacy analysis of the selective release mechanism and propose a novel algorithm: Differentially Private Selective Release based on Clipped Gradients (DPSR-CG). Through a rigorous, newly derived privacy analysis and extensive experiments on multiple datasets (MNIST, CIFAR-10, IMDB, and FMNIST), we demonstrate that our DPSR-CG mechanism maintains strict privacy guarantees while achieving exceptional model performance.
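The clipping and noise injection that the abstract blames for utility loss are the two steps of a standard DPSGD update. The sketch below shows them for one batch. It is the vanilla formulation, not DPSR-CG, and the function names, the plain-list gradient representation, and the parameter names are illustrative assumptions.

```python
import math
import random

def clip_gradient(grad, clip_norm):
    """Scale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def dpsgd_noisy_gradient(per_example_grads, dim, clip_norm,
                         noise_multiplier, lot_size, rng):
    """One vanilla DPSGD gradient estimate.

    Clips each per-example gradient, sums them, adds Gaussian noise with
    standard deviation noise_multiplier * clip_norm per coordinate, and
    divides by the (expected) lot size.
    """
    total = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_gradient(grad, clip_norm)
        for j in range(dim):
            total[j] += clipped[j]
    std = noise_multiplier * clip_norm
    return [(total[j] + rng.gauss(0.0, std)) / lot_size for j in range(dim)]

if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]
    print(dpsgd_noisy_gradient(grads, dim=2, clip_norm=1.0,
                               noise_multiplier=1.1, lot_size=2, rng=rng))
```

Clipping bounds each example's influence on the sum by `clip_norm`, which is what lets the Gaussian noise scale be calibrated independently of the data.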
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript identifies an oversight in the privacy accounting of DPSUR, which fails to account for data-dependent variation in sampling probability caused by the selective release mechanism after gradient clipping. It proposes the DPSR-CG algorithm with a newly derived privacy analysis intended to restore strict (ε,δ)-DP guarantees, and reports strong empirical results across MNIST, CIFAR-10, IMDB, and FMNIST.
Significance. If the new privacy analysis is correct and handles the dependence between clipped gradient norms and release decisions, the work would strengthen the theoretical foundation for selective-release variants of DPSGD and could enable higher-utility private training. The multi-dataset experimental evaluation is a positive feature when accompanied by proper baselines.
major comments (2)
- [Abstract and privacy analysis] Abstract and privacy analysis (likely §3–4): The central claim rests on a 'rigorous, newly derived privacy analysis' that corrects DPSUR by addressing variation in sampling probability. However, no derivation steps, key equations, theorem statements, or proof sketches are visible, preventing verification that the bound properly extends subsampling amplification to the data-dependent case via coupling or privacy-loss random variables.
- [§4 (privacy theorem)] §4 (or equivalent privacy theorem): Standard privacy amplification by subsampling assumes fixed, data-independent probabilities. The selective release decision depends on the clipped gradient norm, inducing dependence; without an explicit argument showing how the new bound accounts for this (or why it does not require an extra term), the claimed strict (ε,δ)-DP guarantee remains unverified and load-bearing for the paper's contribution.
minor comments (1)
- [Abstract] The abstract asserts 'exceptional model performance' and 'extensive experiments' but supplies no quantitative results, tables, baselines (e.g., vs. DPSUR or standard DPSGD), privacy parameters (ε,δ), or error bars, making it impossible to assess the empirical claims.
Simulated Author's Rebuttal
We thank the referee for their careful review and for identifying the need for greater clarity around our privacy analysis. We address the two major comments point by point below. We will revise the manuscript to make the derivation fully explicit.
- Referee: [Abstract and privacy analysis] Abstract and privacy analysis (likely §3–4): The central claim rests on a 'rigorous, newly derived privacy analysis' that corrects DPSUR by addressing variation in sampling probability. However, no derivation steps, key equations, theorem statements, or proof sketches are visible, preventing verification that the bound properly extends subsampling amplification to the data-dependent case via coupling or privacy-loss random variables.
  Authors: We agree that the current presentation does not make the derivation steps sufficiently visible. The analysis appears in §4, but we will add an explicit theorem statement, the key equations for the privacy-loss random variables, and a proof sketch that shows the coupling argument used to extend subsampling amplification to the data-dependent sampling probabilities induced by selective release after clipping.
  revision: yes
- Referee: [§4 (privacy theorem)] §4 (or equivalent privacy theorem): Standard privacy amplification by subsampling assumes fixed, data-independent probabilities. The selective release decision depends on the clipped gradient norm, inducing dependence; without an explicit argument showing how the new bound accounts for this (or why it does not require an extra term), the claimed strict (ε,δ)-DP guarantee remains unverified and load-bearing for the paper's contribution.
  Authors: The referee correctly notes the challenge posed by data-dependent probabilities. Our §4 analysis derives an adjusted bound that incorporates the dependence between clipped gradient norms and release decisions. In the revision we will expand this section with the explicit argument (via privacy-loss random variables) demonstrating that the bound accounts for the dependence without requiring an additional term, thereby verifying the claimed (ε,δ)-DP guarantee.
  revision: yes
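For readers unfamiliar with the objects named in this exchange, the standard definitions are given below. These are the textbook forms, not the paper's theorem: the symbols M, D, D', and L are notation assumed here, and the paper's own notation may differ.

```latex
% (\varepsilon,\delta)-DP: for all neighboring datasets D, D' and all measurable sets S
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta

% Privacy-loss random variable for the ordered pair (D, D'),
% where p_{M(D)} denotes the output density of M on input D
L_{D,D'}(o) \;=\; \ln \frac{p_{M(D)}(o)}{p_{M(D')}(o)}, \qquad o \sim M(D)

% Smallest delta achievable at a given epsilon for this pair
\delta_{D,D'}(\varepsilon) \;=\; \mathbb{E}_{o \sim M(D)}\!\left[\left(1 - e^{\,\varepsilon - L_{D,D'}(o)}\right)_{+}\right]
```

A mechanism is (ε,δ)-DP exactly when δ_{D,D'}(ε) ≤ δ for every ordered pair of neighboring datasets. An argument of the kind the referee requests would have to bound this quantity when the release decision, and hence the output distribution, depends on the clipped gradient norms.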
Circularity Check
No circularity: new privacy analysis presented as independent derivation without reduction to fitted inputs or self-citations
The paper's central contribution is a newly derived privacy analysis for the DPSR-CG mechanism that explicitly accounts for data-dependent sampling probability variation induced by selective release after clipping. The abstract and visible text contain no equations, parameter fits, or self-citations that reduce this analysis to its own inputs by construction. No self-definitional loops, fitted-input predictions, or load-bearing self-citations are exhibited. The derivation is treated as self-contained against the prior DPSUR flaw, yielding a score of 0.