Differentially Private Hyperparameter Tuning using Local Bayesian Optimization
Pith reviewed 2026-05-23 03:16 UTC · model grok-4.3
The pith
DP-GIBO converges to locally optimal hyperparameters under differential privacy with polynomial dimensional scaling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under suitable conditions, DP-GIBO converges to a locally optimal hyperparameter configuration up to a privacy-dependent error, with dimensional dependence that is polynomial rather than exponential. Empirically, DP-GIBO provides scalable private hyperparameter tuning across multiple tasks, substantially outperforming non-private random search and global Bayesian optimization baselines in moderate-to-high-dimensional hyperparameter spaces.
What carries the argument
DP-GIBO, a differentially private local Bayesian optimization framework that privately approximates gradients using a Gaussian Process surrogate
If this is right
- Convergence holds to a local optimum whose distance from the true optimum is controlled by the privacy parameter.
- The method operates on noisy black-box validation objectives without requiring direct gradient access.
- Empirical results show clear gains over random search and global Bayesian optimization once dimension reaches moderate values.
Where Pith is reading between the lines
- Local rather than global search may be the natural regime for privacy-preserving black-box optimization because privacy noise interacts less destructively with local structure.
- The same surrogate-based private gradient mechanism could be tested on other sequential decision problems that require both privacy and high-dimensional search.
- Downstream model accuracy after private tuning would be a direct test of whether the local convergence error remains tolerable in practice.
Load-bearing premise
Suitable conditions on noise levels, objective smoothness, and privacy parameters exist so the Gaussian process surrogate yields a private gradient approximation whose error still permits local convergence.
What would settle it
An experiment in which DP-GIBO's error grows exponentially with dimension or its performance falls below random search for any fixed privacy budget would falsify the polynomial scaling and convergence claims.
Figures
read the original abstract
Hyperparameter tuning is a key component of machine learning procedures, but when validation data contain sensitive user information, search mechanisms can leak private information through the selected configuration. Existing differentially private hyperparameter tuning methods often rely on near-random search, while prior differentially private Bayesian optimization approaches are typically global and, therefore, scale poorly with the hyperparameter dimensionality. We study differentially private hyperparameter tuning using local Bayesian optimization, focusing on settings where the validation objective is available only through noisy black box evaluations and gradients are unavailable or impractical to compute. We introduce DP-GIBO, a differentially private local Bayesian optimization framework that privately approximates gradients using a Gaussian Process surrogate. Under suitable conditions, we prove that DP-GIBO converges to a locally optimal hyperparameter configuration up to a privacy-dependent error, with dimensional dependence that is polynomial rather than exponential.Empirically, we show that DP-GIBO provides scalable private hyperparameter tuning across multiple tasks, substantially outperforming non-private random search and global Bayesian optimization baselines in moderate-to-high-dimensional hyperparameter spaces.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces DP-GIBO, a differentially private local Bayesian optimization method for hyperparameter tuning when validation objectives are available only via noisy black-box evaluations. It privately approximates gradients via a Gaussian process surrogate and claims that, under suitable conditions, DP-GIBO converges to a locally optimal configuration up to a privacy-dependent error with polynomial (rather than exponential) dimensional dependence. Empirical results are reported showing substantial outperformance over non-private random search and global BO baselines in moderate-to-high-dimensional spaces.
Significance. If the convergence result holds under conditions that are realistic for typical privacy budgets (ε ≪ 1) and validation objectives, the work would supply a scalable private HPO primitive that fills a gap between near-random-search DP methods and dimensionally fragile global DP-BO approaches. The polynomial dimension scaling and local-search focus are potentially high-impact for modern ML pipelines whose hyperparameter spaces exceed a few dozen dimensions.
major comments (2)
- [Abstract] Abstract: The central convergence claim is stated only 'under suitable conditions' with no explicit bounds relating the GP length-scale, smoothness constant, noise variance, and privacy parameters (ε, δ) to the threshold required for the local descent step to make progress. Without these relations it is impossible to determine whether the claimed polynomial dimensional dependence remains meaningful for practical ε values.
- [Theoretical analysis] Theoretical analysis section (assumed to contain the proof): The manuscript provides no derivation details, explicit error bounds, or validation of the privacy mechanism implementation. The reader's report notes the absence of these elements, which are load-bearing for assessing whether the total error (approximation + privacy noise) stays below the local-convergence threshold.
minor comments (1)
- [Abstract] The abstract is information-dense; expanding the 'suitable conditions' clause into one additional sentence would improve readability without lengthening the paper.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We agree that explicit parameter relations and expanded derivation details would strengthen the presentation of the convergence result and will incorporate these changes. We address each major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract: The central convergence claim is stated only 'under suitable conditions' with no explicit bounds relating the GP length-scale, smoothness constant, noise variance, and privacy parameters (ε, δ) to the threshold required for the local descent step to make progress. Without these relations it is impossible to determine whether the claimed polynomial dimensional dependence remains meaningful for practical ε values.
Authors: We agree that the abstract would benefit from greater specificity. The theoretical analysis establishes the polynomial dimensional dependence under assumptions on the GP length-scale, smoothness constant, noise variance, and privacy parameters (ε, δ), with the local descent step progressing when the total error remains below a derived threshold. In the revision we will update the abstract to reference these conditions explicitly and add a remark (or short corollary) in the theory section that relates the parameters to the progress threshold, allowing readers to assess relevance for practical ε ≪ 1. revision: yes
-
Referee: [Theoretical analysis] Theoretical analysis section (assumed to contain the proof): The manuscript provides no derivation details, explicit error bounds, or validation of the privacy mechanism implementation. The reader's report notes the absence of these elements, which are load-bearing for assessing whether the total error (approximation + privacy noise) stays below the local-convergence threshold.
Authors: The main text states the convergence theorem while the complete proof appears in the appendix; however, we acknowledge that the main body lacks intermediate derivation steps and an explicit combined error bound. We will expand the theoretical analysis section to include key proof steps and a bound showing that the sum of GP approximation error and privacy noise remains below the local-convergence threshold under the stated conditions. The privacy mechanism follows the Gaussian mechanism applied to the GP gradient estimate; we will add a short paragraph clarifying this and referencing the standard privacy analysis to address the implementation validation point. revision: yes
Circularity Check
No circularity: convergence theorem is a self-contained proof under stated assumptions
full rationale
The paper's central claim is a convergence theorem for DP-GIBO to a locally optimal point (up to privacy error with polynomial dimension scaling). This is presented as a mathematical result relying on suitable conditions for the GP surrogate and noise. No quoted equations or steps reduce the claimed prediction or theorem to a fitted parameter, self-definition, or self-citation chain by construction. The derivation chain is independent of the target result itself and does not invoke load-bearing self-citations or ansatzes smuggled from prior author work. This is the normal case of a self-contained theoretical analysis.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Suitable conditions exist under which the private gradient approximation preserves local convergence
Reference graph
Works this paper leans on
-
[1]
Deep learning with differential privacy
Martin Abadi, Andy Chu, Ian Goodfellow, H Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security , pages 308--318, 2016
work page 2016
-
[2]
Differentially private inference via noisy optimization
Marco Avella-Medina, Casey Bradshaw, and Po-Ling Loh. Differentially private inference via noisy optimization. The Annals of Statistics , 51(5):2067--2092, 2023
work page 2067
-
[3]
Private empirical risk minimization: Efficient algorithms and tight error bounds
Raef Bassily, Adam Smith, and Abhradeep Thakurta. Private empirical risk minimization: Efficient algorithms and tight error bounds. In Foundations of Computer Science , 2014
work page 2014
-
[4]
The role of differential privacy in gdpr compliance
Rachel Cummings and Deven Desai. The role of differential privacy in gdpr compliance. In Proceedings of the Conference on Fairness, Accountability, and Transparency , 2018
work page 2018
-
[5]
Jinshuo Dong, Aaron Roth, and Weijie J Su. Gaussian differential privacy. Journal of the Royal Statistical Society: Series B , 84(1):3--37, 2022
work page 2022
-
[6]
Cynthia Dwork. Differential privacy. In International Colloquium on Automata, Languages, and Programming . Springer, 2006
work page 2006
-
[7]
High-dimensional bayesian optimization with sparse axis-aligned subspaces
David Eriksson and Martin Jankowiak. High-dimensional bayesian optimization with sparse axis-aligned subspaces. In Uncertainty in Artificial Intelligence , pages 493--503. PMLR, 2021
work page 2021
-
[8]
Scalable global optimization via local Bayesian optimization
David Eriksson, Michael Pearce, Jacob Gardner, Ryan D Turner, and Matthias Poloczek. Scalable global optimization via local Bayesian optimization. Advances in Neural Information Processing Systems , 2019
work page 2019
-
[9]
Private stochastic convex optimization: Optimal rates in linear time
Vitaly Feldman, Tomer Koren, and Kunal Talwar. Private stochastic convex optimization: Optimal rates in linear time. In Symposium on Theory of Computing , 2020
work page 2020
-
[10]
Roman Garnett. Bayesian optimization . Cambridge University Press, 2023
work page 2023
-
[11]
Faster differentially private convex optimization via second-order methods
Arun Ganesh, Mahdi Haghifam, Thomas Steinke, and Abhradeep Guha Thakurta. Faster differentially private convex optimization via second-order methods. Advances in Neural Information Processing Systems , 36:79426--79438, 2023
work page 2023
-
[12]
Relative location of CT slices on axial axis
Franz Graf, Hans-Peter Kriegel, Matthias Schubert, Sebastian Poelsterl, and Alexander Cavallaro. Relative location of CT slices on axial axis . UCI Machine Learning Repository, 2011. DOI : https://doi.org/10.24432/C5CP6G
-
[13]
Vanilla bayesian optimization performs great in high dimensions
Carl Hvarfner, Erik Orm Hellsten, and Luigi Nardi. Vanilla bayesian optimization performs great in high dimensions. In International Conference on Machine Learning , pages 20793--20817. PMLR, 2024
work page 2024
-
[14]
Towards practical differentially private convex optimization
Roger Iyengar, Joseph P Near, Dawn Song, Om Thakkar, Abhradeep Thakurta, and Lun Wang. Towards practical differentially private convex optimization. In 2019 IEEE symposium on security and privacy (SP) , pages 299--316. IEEE, 2019
work page 2019
-
[15]
Machine learning applications in cancer prognosis and prediction
Konstantina Kourou, Themis P Exarchos, Konstantinos P Exarchos, Michalis V Karamouzis, and Dimitrios I Fotiadis. Machine learning applications in cancer prognosis and prediction. Computational and Structural Biotechnology Journal , 2015
work page 2015
-
[16]
Differentially private Bayesian optimization
Matt Kusner, Jacob Gardner, Roman Garnett, and Kilian Weinberger. Differentially private Bayesian optimization. In International Conference on Machine Learning , 2015
work page 2015
-
[17]
Private selection from private candidates
Jingcheng Liu and Kunal Talwar. Private selection from private candidates. In Symposium on Theory of Computing , 2019
work page 2019
-
[18]
Local latent space bayesian optimization over structured inputs
Natalie Maus, Haydn Jones, Juston Moore, Matt J Kusner, John Bradshaw, and Jacob Gardner. Local latent space bayesian optimization over structured inputs. In Advances in Neural Information Processing Systems , 2022
work page 2022
-
[19]
The role of adaptive optimizers for honest private hyperparameter selection
Shubhankar Mohapatra, Sajin Sasy, Xi He, Gautam Kamath, and Om Thakkar. The role of adaptive optimizers for honest private hyperparameter selection. In AAAI Conference on Artificial Intelligence , 2022
work page 2022
-
[20]
Local policy search with bayesian optimization
Sarah M \"u ller, Alexander von Rohr, and Sebastian Trimpe. Local policy search with bayesian optimization. Advances in Neural Information Processing Systems , 2021
work page 2021
-
[21]
Lectures on convex optimization , volume 137
Yurii Nesterov. Lectures on convex optimization , volume 137. Springer, 2018
work page 2018
-
[22]
Hyperparameter tuning with renyi differential privacy
Nicolas Papernot and Thomas Steinke. Hyperparameter tuning with renyi differential privacy. International Conference on Learning Representations , 2022
work page 2022
-
[23]
Philippe Rigollet and Jan-Christian H \"u tter. High-dimensional statistics. Lecture notes for Course 18S997 , 2023
work page 2023
-
[24]
Shuang Song, Kamalika Chaudhuri, and Anand D. Sarwate. Stochastic gradient descent with differentially private updates. In 2013 IEEE Global Conference on Signal and Information Processing , 2013
work page 2013
-
[25]
Practical Bayesian optimization of machine learning algorithms
Jasper Snoek, Hugo Larochelle, and Ryan P Adams. Practical Bayesian optimization of machine learning algorithms. Advances in Neural Information Processing Systems , 2012
work page 2012
-
[26]
Evading the curse of dimensionality in unconstrained private GLM s
Shuang Song, Thomas Steinke, Om Thakkar, and Abhradeep Thakurta. Evading the curse of dimensionality in unconstrained private GLM s. In International Conference on Artificial Intelligence and Statistics , pages 2638--2646. PMLR, 2021
work page 2021
-
[27]
Taking the human out of the loop: A review of Bayesian optimization
Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando De Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE , 2015
work page 2015
-
[28]
Dp-hypo: An adaptive private framework for hyperparameter optimization
Hua Wang, Sheng Gao, Huanyu Zhang, Weijie Su, and Milan Shen. Dp-hypo: An adaptive private framework for hyperparameter optimization. Advances in Neural Information Processing Systems , 2023
work page 2023
-
[29]
The behavior and convergence of local Bayesian optimization
Kaiwen Wu, Kyurae Kim, Roman Garnett, and Jacob Gardner. The behavior and convergence of local Bayesian optimization. Advances in Neural Information Processing Systems , 2024
work page 2024
-
[30]
A statistical framework for differential privacy
Larry Wasserman and Shuheng Zhou. A statistical framework for differential privacy. Journal of the American Statistical Association , 105(489):375--389, 2010
work page 2010
-
[31]
Gaussian differentially private robust mean estimation and inference
Myeonghun Yu, Zhao Ren, and Wen-Xin Zhou. Gaussian differentially private robust mean estimation and inference. Bernoulli , 30(4):3059--3088, 2024
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.