The Normal Distributions Indistinguishability Spectrum and its Application to Privacy-Preserving Machine Learning
Pith reviewed 2026-05-24 06:44 UTC · model grok-4.3
The pith
The differential privacy of any algorithm with multivariate Gaussian outputs is exactly given by a closed-form expression for the hockey-stick divergence between two Gaussians.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Normal Distributions Indistinguishability Spectrum (NDIS) is a closed-form analytic computation of the hockey-stick divergence delta between an arbitrary pair of multivariate Gaussians, parameterized by privacy parameter epsilon. This determines the differential privacy of any algorithm whose outputs follow a multivariate Gaussian distribution.
What carries the argument
The NDIS lemma, which supplies the closed-form value of the hockey-stick divergence between any two multivariate Gaussians.
If this is right
- Privacy analyses for any algorithm whose outputs are multivariate Gaussians become simpler by applying the NDIS toolbox of properties.
- Random projection receives a tighter privacy parameterization that produces a more noise-frugal differentially private mechanism.
- Any Gaussian-output algorithm equipped with a sensitivity value can be converted into a differentially private mechanism whose guarantee reduces to NDIS on one pair of Gaussians.
- White-box auditing of Gaussian-output algorithms is supported by the connection between NDIS and the CDF of the generalized chi-squared distribution.
Where Pith is reading between the lines
- NDIS could be applied to additional Gaussian-based algorithms in machine learning to obtain improved privacy-utility tradeoffs.
- The auditing connection to the generalized chi-squared distribution might be turned into practical software for verifying deployed mechanisms.
- Similar closed-form spectra could be sought for other output distributions that appear frequently in privacy-preserving learning.
Load-bearing premise
The algorithm outputs must follow exactly a multivariate Gaussian distribution for the computed privacy value to be precise.
What would settle it
Numerically estimate the hockey-stick divergence for two concrete multivariate Gaussians with known means and covariances, then compare the estimate to the value returned by the NDIS closed-form expression.
Figures
read the original abstract
We investigate the privacy of {\em any} algorithm whose outputs have Gaussian distribution. This work is motivated by the prevalence of such algorithms in several useful (ML) applications, and the comparatively little research that focuses on privacy-preserving learning outside of adding Gaussian noise to the data (such as DP-SGD). {\em What is the DP of any algorithm with multivariate Gaussian output?} We answer the above research question with a general lemma which we call {\em Normal Distributions Indistinguishability Spectrum} (NDIS), a closed-form analytic computation of the hockey-stick divergence $\delta$ between an arbitrary pair of multivariate Gaussians, parameterized by privacy parameter $\epsilon$. To show its practical implications, we prove several properties of our NDIS lemma. These properties form a {\em toolbox} of results which lead to potentially {\em easier} privacy proofs for any Gaussian-output algorithm. As an example application of our toolbox, we prove a tighter parametrisation of the privacy of {\em random projection (RP)}, and obtaining from it a more noise-frugal DP mechanism. Beyond random projection, NDIS can be used to lift {\em any} Gaussian-output algorithm with a `sensitivity' (which we define) to a Gaussian-output DP mechanism. The mechanism boosts the existing randomness in the algorithm, so that one can describe the mechanism's privacy as the IS between a single pair of Gaussians, which can then be analyzed via NDIS. Lastly, we leverage the connections between NDIS and the CDF of the generalized $\chi^2$ distribution (which have efficient empirical estimators) to present a tool for white-box auditing of Gaussian-output algorithms.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Normal Distributions Indistinguishability Spectrum (NDIS) lemma providing a closed-form analytic computation of the hockey-stick divergence δ between arbitrary pairs of multivariate Gaussians N(μ₀,Σ₀) and N(μ₁,Σ₁) parameterized by ε. It proves several properties of NDIS forming a toolbox for privacy analysis of any algorithm with Gaussian outputs. Applications include a tighter privacy parameterization for random projection (RP) yielding a more noise-frugal mechanism, a general boosting construction that adds randomness to lift Gaussian-output algorithms with defined sensitivity to DP mechanisms whose privacy reduces to a single NDIS instance, and an auditing tool based on the connection between NDIS and the CDF of the generalized χ² distribution with efficient empirical estimators.
Significance. If the NDIS closed-form derivation and properties hold, the work supplies a general analytic tool for DP analysis of Gaussian-output algorithms beyond standard noise-addition mechanisms such as DP-SGD. The toolbox of properties, the concrete tighter RP bound, and the auditing method leveraging efficient χ² estimators constitute clear strengths. The parameter-free character of the core lemma for exact Gaussian pairs is a notable asset when the modeling assumption is satisfied.
major comments (2)
- [Boosting construction] Boosting construction: the claim that privacy of the boosted mechanism reduces exactly to the indistinguishability spectrum between one pair of Gaussians (and is therefore analyzable via NDIS) requires that the added randomness preserves exact multivariate Gaussianity. The manuscript supplies no quantitative robustness bound on allowable deviation from the Gaussian model (e.g., from discretization, clipping, or non-Gaussian components introduced by the sensitivity-boosting step) before the claimed (ε,δ) guarantee ceases to hold; this assumption is load-bearing for the general applicability statement.
- [NDIS lemma] NDIS lemma (derivation section): the abstract asserts a closed-form expression derived from Gaussian properties and the hockey-stick divergence definition, yet the manuscript must explicitly display the final analytic formula for δ(ε) together with the key algebraic steps and any accompanying verification (e.g., reduction to the univariate case or numerical checks) so that the central claim can be independently confirmed.
minor comments (1)
- [Notation] Notation for the sensitivity parameter and the pair of Gaussians should be introduced once and used uniformly when stating the toolbox properties and the RP application.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. We address the two major comments point by point below.
read point-by-point responses
-
Referee: [Boosting construction] Boosting construction: the claim that privacy of the boosted mechanism reduces exactly to the indistinguishability spectrum between one pair of Gaussians (and is therefore analyzable via NDIS) requires that the added randomness preserves exact multivariate Gaussianity. The manuscript supplies no quantitative robustness bound on allowable deviation from the Gaussian model (e.g., from discretization, clipping, or non-Gaussian components introduced by the sensitivity-boosting step) before the claimed (ε,δ) guarantee ceases to hold; this assumption is load-bearing for the general applicability statement.
Authors: We agree that the boosting construction relies on the outputs remaining exactly multivariate Gaussian. The manuscript presents the construction and the reduction to a single NDIS instance under this modeling assumption, which is standard when the base algorithm is defined to produce Gaussians. We will revise the text to state the assumption explicitly and note that deviations (e.g., from discretization or clipping) fall outside the claimed guarantee and would require separate analysis. A general quantitative robustness bound is not provided, as deriving one lies beyond the scope of the present work. revision: partial
-
Referee: [NDIS lemma] NDIS lemma (derivation section): the abstract asserts a closed-form expression derived from Gaussian properties and the hockey-stick divergence definition, yet the manuscript must explicitly display the final analytic formula for δ(ε) together with the key algebraic steps and any accompanying verification (e.g., reduction to the univariate case or numerical checks) so that the central claim can be independently confirmed.
Authors: The derivation appears in Section 3, where the multivariate hockey-stick divergence is reduced to a univariate problem via the distribution of a quadratic form. In the revision we will extract the final closed-form expression for δ(ε) into a prominent theorem statement in the main text, include the key algebraic steps inline rather than solely in the appendix, and add explicit verification consisting of the univariate reduction and numerical checks against Monte-Carlo estimates obtained from the generalized χ² connection. revision: yes
Circularity Check
NDIS lemma is a direct analytic derivation; no circular reductions identified.
full rationale
The paper presents NDIS as a closed-form computation of hockey-stick divergence for arbitrary multivariate Gaussians, derived from standard Gaussian properties and the divergence definition. No steps reduce by construction to fitted parameters, self-citations, or renamed inputs. The derivation chain is self-contained against external mathematical benchmarks (Gaussian integrals, divergence definitions). The requirement of exact Gaussian outputs is an applicability condition, not a circularity in the lemma itself. No load-bearing self-citation chains or ansatz smuggling appear in the abstract or described claims.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Hockey-stick divergence is the correct quantity for (ε,δ)-DP
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Theorem 1 (NDIS): δN1,N2(ε) = E[gε(Z)] where gε(Z) = max{0, 1 - exp(½ Zᵀ A Z + bᵀ Z + c)} … U A Uᵀ = EVD of Id - Σ1^{1/2} Σ2^{-1} Σ1^{1/2}
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Corollary 2 … δDT G,D′T G(ε) … block-diagonal Σ with DT D, D′T D′ … reduces to χ²(r) tails via leverage p(i)
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
The algorithmic foundations of differential privacy,
C. Dwork and A. Roth, “The algorithmic foundations of differential privacy,” Found. Trends Theor. Comput. Sci. , vol. 9, no. 3–4, p. 211–407, Aug. 2014. [Online]. Available: https://doi.org/10.1561/ 0400000042
work page 2014
-
[2]
Privacy profiles and amplifi- cation by subsampling,
B. Balle, G. Barthe, and M. Gaboardi, “Privacy profiles and amplifi- cation by subsampling,” J. Priv. Confidentiality, vol. 10, no. 1, 2020
work page 2020
-
[3]
Eureka: A general framework for black-box differential privacy estimators,
Y . Lu, M. Magdon-Ismail, Y . Wei, and V . Zikas, “Eureka: A general framework for black-box differential privacy estimators,” in 2024 IEEE Symposium on Security and Privacy (SP) , 2024, pp. 169–169
work page 2024
-
[4]
Calibrating noise to sensitivity in private data analysis,
C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of cryptography conference. Springer, 2006, pp. 265–284
work page 2006
-
[5]
B. Balle and Y . Wang, “Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising,” in ICML, ser. Proceedings of Machine Learning Research, vol. 80. PMLR, 2018, pp. 403–412
work page 2018
-
[6]
The johnson- lindenstrauss transform itself preserves differential privacy,
J. Blocki, A. Blum, A. Datta, and O. Sheffet, “The johnson- lindenstrauss transform itself preserves differential privacy,” inFOCS. IEEE Computer Society, 2012, pp. 410–419
work page 2012
-
[7]
On differentially private graph sparsifi- cation and applications,
R. Arora and J. Upadhyay, “On differentially private graph sparsifi- cation and applications,” in NeurIPS, 2019, pp. 13 378–13 389
work page 2019
-
[8]
A framework for private matrix analysis in sliding window model,
J. Upadhyay and S. Upadhyay, “A framework for private matrix analysis in sliding window model,” in Proceedings of the 38th Inter- national Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event, ser. Proceedings of Machine Learning Research, vol. 139. PMLR, 2021, pp. 10 465–10 475
work page 2021
-
[9]
Private outsourced bayesian optimization,
D. Kharkovskii, Z. Dai, and B. K. H. Low, “Private outsourced bayesian optimization,” in ICML, ser. Proceedings of Machine Learn- ing Research, vol. 119. PMLR, 2020, pp. 5231–5242
work page 2020
-
[10]
Differentially Private Linear Algebra in the Streaming Model
J. Upadhyay, “Differentially private linear algebra in the streaming model,” CoRR, vol. abs/1409.5414, 2014
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[11]
S. Chatterjee and A. S. Hadi, Sensitivity analysis in linear regression. John Wiley & Sons, 2009
work page 2009
-
[12]
W. H. Greene, Econometric analysis . Pearson Education Limited, 2012
work page 2012
-
[13]
Improved approximation algorithms for large matrices via random projections,
T. Sarl ´os, “Improved approximation algorithms for large matrices via random projections,” in FOCS. IEEE Computer Society, 2006, pp. 143–152
work page 2006
-
[14]
The algorithmic foundations of differential privacy,
C. Dwork and A. Roth, “The algorithmic foundations of differential privacy,” Foundations and Trends in Theoretical Computer Science , vol. 9, no. 3-4, pp. 211–407, 2014
work page 2014
-
[15]
Old techniques in differentially private linear regres- sion,
O. Sheffet, “Old techniques in differentially private linear regres- sion,” in Algorithmic Learning Theory, ALT 2019, 22-24 March 2019, Chicago, Illinois, USA , ser. Proceedings of Machine Learning Research, vol. 98. PMLR, 2019, pp. 788–826
work page 2019
-
[16]
Differentially private ordinary least squares,
——, “Differentially private ordinary least squares,” in ICML, ser. Proceedings of Machine Learning Research, vol. 70. PMLR, 2017, pp. 3105–3114
work page 2017
-
[17]
Privacy-utility trade-off of linear regression under random projections and additive noise,
M. Showkatbakhsh, C. Karakus, and S. N. Diggavi, “Privacy-utility trade-off of linear regression under random projections and additive noise,” in 2018 IEEE International Symposium on Information The- ory, ISIT 2018, Vail, CO, USA, June 17-22, 2018 . IEEE, 2018, pp. 186–190
work page 2018
-
[18]
Smooth sensitivity and sampling in private data analysis,
K. Nissim, S. Raskhodnikova, and A. D. Smith, “Smooth sensitivity and sampling in private data analysis,” in Proceedings of the 39th Annual ACM Symposium on Theory of Computing, San Diego, Cali- fornia, USA, June 11-13, 2007 . ACM, 2007, pp. 75–84
work page 2007
-
[19]
Sensitivity analysis for non-interactive differential privacy: Bounds and efficient algorithms,
A. Inan, M. E. Gursoy, and Y . Saygin, “Sensitivity analysis for non-interactive differential privacy: Bounds and efficient algorithms,” IEEE Trans. Dependable Secur. Comput., vol. 17, no. 1, pp. 194–207, 2020
work page 2020
-
[20]
Structure and sensitivity in differential privacy: Comparing k-norm mechanisms,
J. Awan and A. Slavkovi ´c, “Structure and sensitivity in differential privacy: Comparing k-norm mechanisms,” Journal of the American Statistical Association, vol. 116, no. 534, pp. 935–954, 2021
work page 2021
-
[21]
Algorithm as 155: The distribution of a linear com- bination of χ 2 random variables,
R. B. Davies, “Algorithm as 155: The distribution of a linear com- bination of χ 2 random variables,” Applied Statistics , pp. 323–333, 1980
work page 1980
-
[22]
A method to integrate and classify normal distributions,
A. Das and W. S. Geisler, “A method to integrate and classify normal distributions,” CoRR, vol. abs/2012.14331, 2020
-
[23]
Private empirical risk minimization: Efficient algorithms and tight error bounds,
R. Bassily, A. D. Smith, and A. Thakurta, “Private empirical risk minimization: Efficient algorithms and tight error bounds,” in 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2014, Philadelphia, PA, USA, October 18-21, 2014. IEEE Computer Society, 2014, pp. 464–473
work page 2014
-
[24]
T. T. Cai, Y . Wang, and L. Zhang, “The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy,” CoRR, vol. abs/1902.04495, 2019
-
[25]
(nearly) optimal private linear regression for sub-gaussian data via adaptive clipping,
P. Varshney, A. Thakurta, and P. Jain, “(nearly) optimal private linear regression for sub-gaussian data via adaptive clipping,” in Conference on Learning Theory, 2-5 July 2022, London, UK , ser. Proceedings of Machine Learning Research, P. Loh and M. Raginsky, Eds., vol. 178. PMLR, 2022, pp. 1126–1166
work page 2022
-
[26]
Near optimal private and robust linear regression,
X. Liu, P. Jain, W. Kong, S. Oh, and A. S. Suggala, “Near optimal private and robust linear regression,” CoRR, vol. abs/2301.13273, 2023
-
[27]
Improved differentially private regression via gradient boosting,
S. Tang, S. Ayd ¨ore, M. Kearns, S. Rho, A. Roth, Y . Wang, Y . Wang, and Z. S. Wu, “Improved differentially private regression via gradient boosting,” in IEEE Conference on Secure and Trustworthy Machine Learning, SaTML 2024, Toronto, ON, Canada, April 9-11, 2024 . IEEE, 2024, pp. 33–56
work page 2024
-
[28]
Easy differen- tially private linear regression,
K. Amin, M. Joseph, M. Ribero, and S. Vassilvitskii, “Easy differen- tially private linear regression,” in The Eleventh International Confer- ence on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023
work page 2023
-
[29]
Differential privacy and robust statistics in high dimensions,
X. Liu, W. Kong, and S. Oh, “Differential privacy and robust statistics in high dimensions,” in Conference on Learning Theory, 2-5 July 2022, London, UK , ser. Proceedings of Machine Learning Research, vol. 178. PMLR, 2022, pp. 1167–1246
work page 2022
-
[30]
No subclass left behind: Fine-grained robustness in coarse-grained classification problems,
N. S. Sohoni, J. Dunnmon, G. Angus, A. Gu, and C. R ´e, “No subclass left behind: Fine-grained robustness in coarse-grained classification problems,” in Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual , 2020
work page 2020
-
[31]
DSTAGNN: dynamic spatial-temporal aware graph neural network for traffic flow forecasting,
S. Lan, Y . Ma, W. Huang, W. Wang, H. Yang, and P. Li, “DSTAGNN: dynamic spatial-temporal aware graph neural network for traffic flow forecasting,” in International Conference on Machine Learn- ing, ICML 2022, 17-23 July 2022, Baltimore, Maryland, USA , ser. Proceedings of Machine Learning Research, vol. 162. PMLR, 2022, pp. 11 906–11 917
work page 2022
-
[32]
No free lunch theorem for security and utility in federated learning,
X. Zhang, H. Gu, L. Fan, K. Chen, and Q. Yang, “No free lunch theorem for security and utility in federated learning,” ACM Trans. Intell. Syst. Technol., vol. 14, no. 1, pp. 1:1–1:35, 2023
work page 2023
-
[33]
Privately estimating a gaussian: Efficient, robust, and optimal,
D. Alabi, P. K. Kothari, P. Tankala, P. Venkat, and F. Zhang, “Privately estimating a gaussian: Efficient, robust, and optimal,” in Proceedings of the 55th Annual ACM Symposium on Theory of Computing, STOC 2023, Orlando, FL, USA, June 20-23, 2023 . ACM, 2023, pp. 483– 496
work page 2023
-
[34]
The total variation distance between high-dimensional gaussians with the same mean,
L. Devroye, A. Mehrabian, and T. Reddad, “The total variation distance between high-dimensional gaussians with the same mean,” arXiv preprint arXiv:1810.08693 , 2018. 14
-
[35]
Polynomial time and private learning of unbounded gaussian mixture models,
J. Arbas, H. Ashtiani, and C. Liaw, “Polynomial time and private learning of unbounded gaussian mixture models,” in International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Hon- olulu, Hawaii, USA, ser. Proceedings of Machine Learning Research, vol. 202. PMLR, 2023, pp. 1018–1040
work page 2023
-
[36]
MVG mecha- nism: Differential privacy under matrix-valued query,
T. Chanyaswad, A. Dytso, H. V . Poor, and P. Mittal, “MVG mecha- nism: Differential privacy under matrix-valued query,” inProceedings of the 2018 ACM SIGSAC Conference on Computer and Communi- cations Security, CCS 2018, Toronto, ON, Canada, October 15-19,
work page 2018
-
[37]
ACM, 2018, pp. 230–246
work page 2018
-
[38]
Fast private data release algorithms for sparse queries,
A. Blum and A. Roth, “Fast private data release algorithms for sparse queries,” in Approximation, Randomization, and Combinatorial Opti- mization. Algorithms and Techniques - 16th International Workshop, APPROX 2013, and 17th International Workshop, RANDOM 2013, Berkeley, CA, USA, August 21-23, 2013. Proceedings , ser. Lecture Notes in Computer Science, v...
work page 2013
-
[40]
——, “Circulant matrices and differential privacy,” CoRR, vol. abs/1410.2470, 2014
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[41]
Distributed sketching methods for privacy preserving regression,
B. Bartan and M. Pilanci, “Distributed sketching methods for privacy preserving regression,” CoRR, vol. abs/2002.06538, 2020
-
[42]
Differential privacy for clinical trial data: Preliminary evaluations,
D. Vu and A. B. Slavkovic, “Differential privacy for clinical trial data: Preliminary evaluations,” in ICDM Workshops 2009, IEEE Interna- tional Conference on Data Mining Workshops, Miami, Florida, USA, 6 December 2009 . IEEE Computer Society, 2009, pp. 138–143
work page 2009
-
[43]
Y . Wang, “Revisiting differentially private linear regression: opti- mal and adaptive prediction & estimation in unbounded domain,” in Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, UAI 2018, Monterey, California, USA, August 6-10, 2018. AUAI Press, 2018, pp. 93–103
work page 2018
-
[44]
Differentially private simple linear regression,
D. Alabi, A. McMillan, J. Sarathy, A. D. Smith, and S. P. Vadhan, “Differentially private simple linear regression,” Proc. Priv. Enhanc- ing Technol., vol. 2022, no. 2, pp. 184–204, 2022
work page 2022
-
[45]
Analyzing the differentially pri- vate theil-sen estimator for simple linear regression,
J. Sarathy and S. P. Vadhan, “Analyzing the differentially pri- vate theil-sen estimator for simple linear regression,” CoRR, vol. abs/2207.13289, 2022
-
[46]
Differentially private distributed bayesian linear regression with MCMC,
B. Alparslan, S. Yildirim, and S. I. Birbil, “Differentially private distributed bayesian linear regression with MCMC,” in International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Hon- olulu, Hawaii, USA, ser. Proceedings of Machine Learning Research, vol. 202. PMLR, 2023, pp. 627–641
work page 2023
-
[47]
On the theory and practice of privacy-preserving bayesian data analysis,
J. R. Foulds, J. Geumlek, M. Welling, and K. Chaudhuri, “On the theory and practice of privacy-preserving bayesian data analysis,” in Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, UAI 2016, June 25-29, 2016, New York City, NY, USA. AUAI Press, 2016
work page 2016
-
[48]
Differentially private bayesian linear regression,
G. Bernstein and D. Sheldon, “Differentially private bayesian linear regression,” in Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada , 2019, pp. 523–533
work page 2019
-
[49]
Robust and private bayesian inference,
C. Dimitrakakis, B. Nelson, A. Mitrokotsa, and B. I. P. Rubinstein, “Robust and private bayesian inference,” in Algorithmic Learning Theory - 25th International Conference, ALT 2014, Bled, Slovenia, October 8-10, 2014. Proceedings , ser. Lecture Notes in Computer Science, vol. 8776. Springer, 2014, pp. 291–305
work page 2014
-
[50]
Privacy for free: Posterior sampling and stochastic gradient monte carlo,
Y . Wang, S. E. Fienberg, and A. J. Smola, “Privacy for free: Posterior sampling and stochastic gradient monte carlo,” in Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015 , ser. JMLR Workshop and Conference Proceedings, vol. 37. JMLR.org, 2015, pp. 2493–2502
work page 2015
-
[51]
Differential privacy without sensitivity,
K. Minami, H. Arai, I. Sato, and H. Nakagawa, “Differential privacy without sensitivity,” in Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain , 2016, pp. 956–964
work page 2016
-
[52]
Probability inequalities for sums of bounded random variables,
W. Hoeffding, “Probability inequalities for sums of bounded random variables,” Journal of the American Statistical Association , vol. 58, no. 301, pp. 13–30, 1963
work page 1963
-
[53]
Privsyn: Differentially private data synthesis,
Z. Zhang, T. Wang, N. Li, J. Honorio, M. Backes, S. He, J. Chen, and Y . Zhang, “Privsyn: Differentially private data synthesis,” inUSENIX Security Symposium. USENIX Association, 2021, pp. 929–946
work page 2021
-
[54]
Differentially private mixture of generative neural networks,
G. ´Acs, L. Melis, C. Castelluccia, and E. D. Cristofaro, “Differentially private mixture of generative neural networks,” IEEE Trans. Knowl. Data Eng., vol. 31, no. 6, pp. 1109–1121, 2019
work page 2019
-
[55]
Practical consid- erations on using private sampling for synthetic data,
C. Pierquin, B. Zimmermann, and M. Boussard, “Practical consid- erations on using private sampling for synthetic data,” CoRR, vol. abs/2312.07139, 2023
-
[56]
Inverses of 2 × 2 block matrices,
T.-T. Lu and S.-H. Shiou, “Inverses of 2 × 2 block matrices,” Comput- ers & Mathematics with Applications , vol. 43, no. 1-2, pp. 119–129, 2002. Appendix A. Survey of Related Work We already discussed in our Technical Overview section the works most relevant to our results, and compared with them. Form completeness, here we include a more compre- hensive ...
work page 2002
-
[57]
studied the DP of random projection using random matrices other than standard Gaussian matrices. Beyond the study of RP itself, it has been explored as a pre-processing 15 step to achieve DP in many applications [40], [8], [7], [9], [10]. Our objective aligns with [6], [16], [15], focusing on Gaussian RP, and our results improved upon the (as we demonstra...
-
[58]
study linear regression under the linear Gaussian model, assuming that the records follows a Gaussian distribution. For algorithmic approaches, [6], [15], [41], [46], [47], [45] release perturbed sufficient statistics and then compute the linear regression solution through post-processing. Other approaches include [23], [24], [25], [26], [27], which use n...
-
[59]
max{0, 1 − exp − rX i=1 (aT Zi)2 + c ! } # = E
Our experiments only compare with [15], [17], which improve upon the work of [6], [16]. = Z S exp − 1 2(x − µ1)T Σ−1 1 (x − µ1) (2π)d/2 det (Σ1)1/2 dx − exp (ε) Z S exp − 1 2(x − µ2)T Σ−1 2 (x − µ2) (2π)d/2 det (Σ2)1/2 dx = Z S exp − 1 2(x − µ1)T Σ−1 1 (x − µ1) (2π)d/2 det (Σ1)1/2 − exp ε − 1 2(x − µ2)T Σ−1 2 (x − µ2) (2π)d/2 det (Σ2)1/2 dx = Z Σ1/2 1 S+µ...
-
[60]
Now we want to show that for all ε ∈ R≥0 and p(i) ∈ [0, 1], δDT G,D′T G(ε) ≥ δD′T g,DT g(ε)
Plugging the above expression of δD′T G,DT G(ε) into Proposition 2, we have δD′T G,DT G(ε) = Pr Y ≤ −2ε − r ln (1 − p(i)) p(i) − exp ε + r 2 ln (1 − p(i)) (1 − p(i)) r 2 · Pr Y ≤ (−2ε − r ln (1 − p(i)))(1 − p(i)) p(i) = Pr Y ≤ −2ε − r ln (1 − p(i)) p(i) − exp (ε) Pr Y ≤ (−2ε − r ln (1 − p(i)))(1 − p(i)) p(i) , where Y ∼ χ2(r). Now we want to show that for...
-
[61]
The following considers the case 0 < p (i) < 1 and ε < − r 2 ln (1 − p(i))
And from the above we know that δD′T g,DT g(ε) = 0 when ε ≥ − r 2 ln (1 − p(i)), so f(ε, p(i)) ≥ 0 at this range. The following considers the case 0 < p (i) < 1 and ε < − r 2 ln (1 − p(i)). Fixing p(i) and treat f as a function of ε, then we have f(ε) = 0 at ε = 0. We first compute the first derivative of f with respect to ε, and have d dε f = exp (ε) − 1...
-
[62]
Use binary search to compute ¯s such that δ = δε RLC(¯s)
-
[64]
Return ˜x = DT g + N, where g ∼ N (− →0 , In), N ∼ N (− →0 , σ2Id). Figure 8: Random Linear Combination Mechanism MRLC Figure 8 presents a RLC-based DP mechanism MRLC, where the construction shares the same flavor of MRP . Proposition 10 states that the larger the leverage, the easier to distinguish DT g and D′T g. Proposition 10 is a special case of Prop...
-
[65]
Now we want to show that for all ε ∈ R≥0 and p(i) ∈ [0, 1], δDT g,D′T g(ε) ≥ δD′T g,DT g(ε)
Plugging the above expres- sion of δD′T g,DT g(ε) in Proposition 9, we have δD′T g,DT g(ε) = erf s −ε − 1 2 ln 1 − p(i) p(i) − exp ε + 1 2 ln 1 − p(i) p1 − p(i) · erf s (ε + 1 2 ln 1 − p(i))(p(i) − 1) p(i) = erf s −ε − 1 2 ln 1 − p(i) p(i) − exp (ε) erf s (ε + 1 2 ln 1 − p(i))(p(i) − 1) p(i) . Now we want to show that for a...
-
[66]
The following considers the case 0 < p (i) < 1 and ε < − 1 2 ln (1 − p(i))
And from the above we know that δD′T g,DT g(ε) = 0 when ε ≥ − 1 2 ln (1 − p(i)), so f(ε, p(i)) ≥ 0 at this range. The following considers the case 0 < p (i) < 1 and ε < − 1 2 ln (1 − p(i)). Fixing p(i) and treat f as a function of ε, then we have f(ε) = 0 at ε = 0. We first compute the first derivative of f with respect to ε, and have d dε f = − exp (ε) +...
-
[67]
W1 ≤ dX i=1 b2 i ai − 2c # − dY i=1 1√1 − ai ! · exp c + 1 2 dX i=1 b2 i 1 − ai ! Pr
(5) Then, we compute the off-diagonal entries of Q, for any fixed a, b ∈ [d], a ̸= b, continue the Equation (4), we have (Q)a,b = X i eeT i,i E gagbg2 i + X i̸=j eeT i,j E [gagbgigj] = X i̸=j eeT i,j E [gagbgigj] = X (i,j)=(a,b) or(j,i)=(b,a) eeT i,j E [gagbgigj] + X (i,j)̸=(a,b) or(j,i)̸=(b,a) eeT i,j E [gagbgigj] = eeT a,b E g2 ag2 b + eeT b,a E g2 ag2 ...
-
[68]
Use binary search to compute σ such that δε ALS( 2l2 σ2 , l2 σ2 ) ≤ δ, where l is the largest ℓ2 norm of all possible row in the application’s universe
-
[70]
Return ˜xopt = (ΠB)−1Πb. Figure 11: Approximate Least Square (ALS) Mechanism MALS Please see Figure 11 for the least square mechanism constructed from ALS. Appendix L. Relative DP Mechanisms We refer the reader to Figures 12 and 13. Appendix M. Algorithm of RP LS Please see Figure 14 for the random projection LS (RP LS) algorithm. Input: A matrix D ∈ Rn×d...
-
[71]
[If RP’s inherent privacy is enough, i.e., ¯s ≥ s: ]
Use binary search to compute ¯s such that δ = δε RP (¯s). [If RP’s inherent privacy is enough, i.e., ¯s ≥ s: ]
-
[72]
[Otherwise, add extra Gaussian noise to bridge the privacy gap:]
Return ˜M = DT G, where each entry of G ∈ Rn×r is an independent Gaussian with 0 mean and 1 variance. [Otherwise, add extra Gaussian noise to bridge the privacy gap:]
-
[73]
Set σ = l q 1 ¯s , where l is the largest ℓ2 norm of all possible row in the application’s universe
-
[74]
Return ˜M = DT G+N, where each entry of G ∈ Rn×r is an independent Gaussian with 0 mean and 1 variance and each entry of N ∈ Rn×r is an independent Gaussian with 0 mean and σ2 variance. Figure 12: Relative DP Random Projection (RP) Mechanism Mrel RP Input: A matrix D = [B, b] ∈ Rn×(d+1), where B ∈ Rn×d and b ∈ Rn, privacy parameters ε ∈ R≥0, δ ∈ [0, 1), a...
-
[75]
[Otherwise, add extra noise to bridge the privacy gap:]
Return ˜xopt sampled from N (xopt, ∥e∥2 2(BT B)−1 r ). [Otherwise, add extra noise to bridge the privacy gap:]
-
[76]
Use binary search to compute σ such that δε ALS(2 l2 σ2 , l2 σ2 ) ≤ δ, where l is the largest ℓ2 norm of all possible row in the application’s universe
-
[77]
Set D = [B, b] as the concatenation of D with σId+1
-
[78]
Return ˜xopt sampled from N (xopt, ∥e∥2 2(BT B)−1 r ), where xopt and e are OLS solution and residual vector of D, accordingly. Figure 13: Relative DP Least Square Mechanism MLS Input: A matrix D = [ B, b] ∈ Rn×(d+1), where B ∈ Rn×d and b ∈ Rn. Privacy parameter ϵ ∈ R>0, δ ∈ (0, 1). Output: The approximate solution ˜xopt of Bx = b
-
[79]
Set the constant c1 = max{ D(i) 2}i∈[n]
-
[80]
Set the constant c2 = r 4c2 1( q 2r ln 4 δ + ln 4 δ )/ϵ
-
[81]
Set ¯D = [ ¯B, ¯b] as the concatenation of D with c2Id+1
-
[82]
Let matrix Π ∈ Rr×(n+d) be a random Gaussian matrix such that any of Π’s entry (Π)i,j ∼ N (0, 1) is an indepen- dent standard normal random variable
-
[83]
Return ˜xopt = (ΠB)−1Πb. Figure 14: Random Projection Least Squares (RP LS) algo- rithm from [15] 32 Figure 15: Numerical evidence of δε ALS is an increasing function with respect to residual and leverage. The first five rows plot δε ALS as a function of residual, each figure corresponding to a different leverage. The second five rows plot the δε ALS as a...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.