Bayesian Optimization by Kernel Regression and Density-based Exploration
Pith reviewed 2026-05-23 04:12 UTC · model grok-4.3
The pith
BOKE replaces Gaussian processes with kernel regression and kernel density estimation, reducing the computational cost of Bayesian optimization to quadratic while proving global convergence under noisy evaluations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BOKE integrates kernel regression for efficient function approximation and kernel density estimation for exploration into the confidence-bound acquisition function. According to the abstract, this reduces the computational cost to quadratic, against the cubic per-iteration (quartic total) cost of Gaussian processes, while establishing rigorous global convergence guarantees under noisy evaluations.
What carries the argument
Kernel regression paired with kernel density estimation inside the upper-confidence-bound acquisition criterion.
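A minimal sketch of how such an acquisition score could be computed in one dimension. It assumes a Gaussian kernel, a Nadaraya-Watson mean estimate, a maximization convention, and an exploration bonus equal to the inverse square root of the summed kernel weights. The function names, the bandwidth h, the weight beta, and the guard eps are illustrative choices, not the paper's exact definitions.

```python
import math


def gaussian_kernel(x, xi, h):
    """Gaussian kernel weight between scalars x and xi with bandwidth h."""
    return math.exp(-0.5 * ((x - xi) / h) ** 2)


def acquisition(x, xs, ys, h=0.2, beta=1.0, eps=1e-12):
    """Confidence-bound score: kernel-regression mean plus a density-based bonus.

    All parameter names and defaults are hypothetical (illustration only).
    """
    weights = [gaussian_kernel(x, xi, h) for xi in xs]
    w_sum = sum(weights) + eps  # unnormalized kernel density at x; eps avoids division by zero
    # Nadaraya-Watson estimate of the objective at x.
    mean = sum(w * y for w, y in zip(weights, ys)) / w_sum
    # Exploration bonus: large where few samples lie nearby (low density).
    bonus = w_sum ** -0.5
    return mean + beta * bonus


# Usage: score a few candidates given three noisy observations.
xs, ys = [0.1, 0.5, 0.9], [0.2, 1.0, 0.4]
candidates = [i / 10 for i in range(11)]
x_next = max(candidates, key=lambda x: acquisition(x, xs, ys))
```

Each score costs one pass over the stored samples, with no matrix factorization, which is where the cost saving relative to a Gaussian-process posterior would come from in a scheme of this kind.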
If this is right
- BOKE performs competitively with Gaussian-process Bayesian optimization and other baselines on both synthetic and real-world tasks.
- The method exhibits markedly lower wall-clock time, making it suitable for resource-constrained engineering settings.
- Global convergence holds even when function evaluations are corrupted by noise.
- Overall runtime scales quadratically rather than quartically with the number of iterations.
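The scaling claim can be checked with simple arithmetic. The sketch below sums a per-iteration cost of t^p over T iterations. The cubic case matches the Gaussian-process cost stated in the abstract. The linear case is an assumption made here to show one way a quadratic total could arise; the page does not state BOKE's per-iteration cost explicitly.

```python
def total_cost(num_iterations, per_iteration_exponent):
    """Sum t**p for t = 1..T: cumulative cost when iteration t costs t**p."""
    return sum(t ** per_iteration_exponent for t in range(1, num_iterations + 1))


T = 1000
cubic_total = total_cost(T, 3)   # cubic per-iteration cost (Gaussian process)
linear_total = total_cost(T, 1)  # hypothetical linear per-iteration cost

# Closed forms: sum of t^3 is (T(T+1)/2)^2, sum of t is T(T+1)/2.
assert cubic_total == (T * (T + 1) // 2) ** 2   # grows like T^4 / 4
assert linear_total == T * (T + 1) // 2          # grows like T^2 / 2
```

Under these assumptions the ratio of the two totals is itself T(T+1)/2, so the gap widens quadratically as the evaluation budget grows.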
Where Pith is reading between the lines
- The same substitution technique could be tested on acquisition functions other than upper confidence bound.
- Quadratic scaling may allow Bayesian optimization on data sets an order of magnitude larger than current Gaussian-process limits permit.
- Density-based exploration might automatically adapt step sizes on problems with widely separated local optima.
Load-bearing premise
Replacing the Gaussian-process posterior with kernel regression and kernel density estimates inside the acquisition function still yields the same convergence guarantees that standard analyses require.
What would settle it
A single-dimensional noisy test function on which BOKE repeatedly converges to a suboptimal point while a Gaussian-process Bayesian optimizer reaches the known global optimum.
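A rough sketch of what one arm of such an experiment might look like. The two-peak objective, the noise level, the grid, and the BOKE-like loop (Nadaraya-Watson mean plus an inverse-square-root density bonus) are all hypothetical stand-ins, not the paper's algorithm or benchmark. A Gaussian-process baseline would have to be run on the same function and seeds for the comparison to mean anything, and that part is omitted here.

```python
import math
import random


def objective(x):
    """Hypothetical two-peak function on [0, 1]; the higher peak is near x = 0.8."""
    return 0.8 * math.exp(-((x - 0.2) / 0.05) ** 2) + math.exp(-((x - 0.8) / 0.1) ** 2)


def kernel_stats(x, xs, ys, h):
    """Return (Nadaraya-Watson mean, summed Gaussian kernel weights) at x."""
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
    w_sum = sum(weights) + 1e-12  # small guard against division by zero
    mean = sum(w * y for w, y in zip(weights, ys)) / w_sum
    return mean, w_sum


def run(budget=60, noise_sd=0.1, h=0.05, beta=0.5, seed=0):
    """Run the illustrative loop; all defaults are arbitrary choices."""
    rng = random.Random(seed)
    grid = [i / 200 for i in range(201)]
    xs, ys = [], []
    for _ in range(budget):
        if not xs:
            x_next = rng.choice(grid)  # first point is chosen at random
        else:
            def score(x):
                mean, w_sum = kernel_stats(x, xs, ys, h)
                return mean + beta * w_sum ** -0.5
            x_next = max(grid, key=score)
        xs.append(x_next)
        ys.append(objective(x_next) + rng.gauss(0.0, noise_sd))  # noisy evaluation
    # Recommend the sampled point with the highest estimated mean.
    best = max(xs, key=lambda x: kernel_stats(x, xs, ys, h)[0])
    return best, xs


best, history = run()
print(f"recommended x = {best:.3f}, distance to 0.8 = {abs(best - 0.8):.3f}")
```

Repeating the run over many seeds and counting how often the recommendation lands near the lower peak at 0.2 would give the failure rate that the test above asks about.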
Original abstract
Bayesian optimization is highly effective for optimizing expensive-to-evaluate black-box functions, but it faces significant computational challenges due to the cubic per-iteration cost of Gaussian processes, which results in a total time complexity that is quartic with respect to the number of iterations. To address this limitation, we propose a novel algorithm, Bayesian optimization by kernel regression and density-based exploration (BOKE). BOKE uses kernel regression for efficient function approximation, kernel density for exploration, and integrates them into the confidence bound criteria to guide the optimization process, thus reducing computational costs to quadratic. Our theoretical analysis rigorously establishes the global convergence of BOKE under noisy evaluations. Through extensive numerical experiments on both synthetic and real-world optimization tasks, we demonstrate that BOKE not only performs competitively compared to Gaussian process-based methods and several other baseline methods but also exhibits superior computational efficiency. These results highlight BOKE's effectiveness in resource-constrained environments, providing a practical approach for optimization problems in engineering applications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes BOKE, an algorithm replacing Gaussian process posteriors in Bayesian optimization with kernel regression for mean/variance approximation and kernel density estimation for an exploration term, both inserted into a confidence-bound acquisition function. It claims this reduces per-iteration cost such that total complexity is quadratic in the number of evaluations, rigorously proves global convergence under noisy black-box evaluations, and reports competitive empirical performance versus GP-based BO and baselines on synthetic and real tasks.
Significance. A correctly proved result would be significant: it would supply a scalable, non-GP Bayesian optimization method whose acquisition function still guarantees global convergence, directly addressing the O(n^4) bottleneck while retaining theoretical backing. The claimed quadratic complexity and noise-robust convergence, if established with explicit error control, would be a useful contribution to the optimization literature.
major comments (2)
- [§4 (Convergence Analysis)] Theorem 1 (or equivalent statement of global convergence): the proof does not derive explicit bounds on the deviation between the kernel-regression estimates and the true posterior mean/variance, nor on the bias introduced by the KDE exploration term inside the acquisition function. Standard UCB regret analyses rely on controlling these quantities to ensure the modified acquisition still forces sufficient exploration; without such rates showing that the total approximation error vanishes fast enough relative to the information gain, the claimed extension of existing theory does not hold.
- [§5 (Numerical Experiments)] Tables 1–3 and associated text: the experimental protocol omits the number of independent runs, the precise noise model and variance schedule, the cross-validation or selection procedure for kernel bandwidths in both regression and KDE, and any statistical significance testing. These omissions make it impossible to verify the competitiveness and efficiency claims against the GP baselines.
minor comments (1)
- [§2–3] Notation for the kernel regression estimator and the density-based bonus term is introduced without a dedicated preliminary section; a short subsection defining all symbols before the algorithm would improve readability.
Simulated Author's Rebuttal
We thank the referee for their thorough review and valuable feedback on our manuscript. We address each of the major comments below and outline the revisions we will make to strengthen the paper.
- Referee: [§4 (Convergence Analysis)] Theorem 1 (or equivalent statement of global convergence): the proof does not derive explicit bounds on the deviation between the kernel-regression estimates and the true posterior mean/variance, nor on the bias introduced by the KDE exploration term inside the acquisition function. Standard UCB regret analyses rely on controlling these quantities to ensure the modified acquisition still forces sufficient exploration; without such rates showing that the total approximation error vanishes fast enough relative to the information gain, the claimed extension of existing theory does not hold.
  Authors: We agree that explicit error bounds are necessary to rigorously extend the UCB convergence analysis to our approximated acquisition function. The current Section 4 provides a high-level argument but lacks the detailed deviation rates. In the revised version, we will add a new lemma bounding the error of the kernel regression approximation relative to the GP posterior (using properties of kernel ridge regression) and the bias of the KDE term. We will show that with the bandwidth h_t scaling appropriately (e.g., h_t ~ t^{-1/(d+4)}), the total error is sufficiently small relative to the information gain term, preserving sublinear regret and thus global convergence. This will be incorporated into an updated proof of Theorem 1.
  Revision: yes
- Referee: [§5 (Numerical Experiments)] Tables 1–3 and associated text: the experimental protocol omits the number of independent runs, the precise noise model and variance schedule, the cross-validation or selection procedure for kernel bandwidths in both regression and KDE, and any statistical significance testing. These omissions make it impossible to verify the competitiveness and efficiency claims against the GP baselines.
  Authors: We acknowledge these omissions in the experimental description. In the revision, we will update Section 5 to include: the number of independent runs (20 runs per task); the noise model (additive Gaussian noise with fixed variance 0.01 for synthetic functions, and task-specific noise for real-world tasks); the bandwidth selection procedure (5-fold cross-validation on a held-out validation set for both kernel regression and KDE); and statistical tests (mean and standard deviation, with p-values from paired t-tests against the baselines). The tables will be expanded to reflect these details and ensure reproducibility.
  Revision: yes
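Two of the quantities named in these responses are easy to make concrete. The sketch below implements the example bandwidth schedule h_t ~ t^{-1/(d+4)} and the paired t statistic that underlies a paired t-test. The function names and the `scale` constant are hypothetical. The p-value is not computed here, since that needs the Student t distribution with n - 1 degrees of freedom from a statistics library.

```python
import math
import statistics


def bandwidth(t, dim, scale=1.0):
    """Bandwidth h_t = scale * t**(-1 / (dim + 4)); `scale` is a hypothetical constant."""
    if t < 1:
        raise ValueError("t must be a positive iteration count")
    return scale * t ** (-1.0 / (dim + 4))


def paired_t_statistic(a, b):
    """t statistic for paired samples, e.g. final values from matched runs of two methods."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t statistic is undefined")
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))


# Usage: in one dimension the exponent is -1/5, so h_32 is about 0.5.
h_32 = bandwidth(32, dim=1)
t_stat = paired_t_statistic([1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 1.0, 3.0])
```

With 20 runs per task, as the authors propose, the statistic would be compared against a t distribution with 19 degrees of freedom.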
Circularity Check
No circularity; convergence established by analysis rather than by construction or self-citation.
The provided abstract and context present BOKE as using kernel regression and KDE to modify the acquisition function, with a separate theoretical analysis claimed to prove global convergence under noise. No equations, self-citations, or fitted quantities are quoted that reduce the convergence result to an input by definition. The derivation chain is described as independent analysis controlling approximation errors, which is the standard non-circular structure for such algorithmic papers. Absence of explicit reduction steps in the visible text supports a finding of no circularity.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel. Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: BOKE uses kernel regression for efficient function approximation, kernel density for exploration, and integrates them into the confidence bound criteria... IKR-UCB acquisition function a(x) = m(x) + β_t σ̂(x) with σ̂ = W^{-1/2}
- IndisputableMonolith/Foundation/AlexanderDuality.lean, theorem alexander_duality_circle_linking. Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Paper passage: Theorem 4 (Algorithmic consistency of BOKE)... inf_t h_{X,X_t} = 0 a.s. via SNEB property and Cantor intersection
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.