Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation

Calvin Tsay; Joel A. Paulson; Shiqiang Zhang; Yilin Xie

arxiv: 2410.16893 · v3 · submitted 2024-10-22 · 🧮 math.OC · cs.LG

Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation

Yilin Xie , Shiqiang Zhang , Joel A. Paulson , Calvin Tsay This is my paper

Pith reviewed 2026-05-23 19:05 UTC · model grok-4.3

classification 🧮 math.OC cs.LG

keywords Bayesian optimizationGaussian processesmixed-integer quadratic programmingacquisition functionspiecewise-linear approximationglobal optimizationregret bounds

0 comments

The pith

A piecewise-linear kernel approximation lets mixed-integer quadratic programming globally optimize Gaussian process acquisition functions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that acquisition function optimization in Bayesian optimization, a non-convex problem usually tackled with local or sampling methods, can instead be solved to global optimality by approximating Gaussian process kernels with piecewise-linear functions inside a mixed-integer quadratic program. This matters because global solutions would select better evaluation points when each function call is expensive. The formulation applies specifically to uncertainty-based acquisition functions using any stationary or dot-product kernel and includes an analysis of the resulting regret bounds. Empirical tests cover synthetic benchmarks, constrained problems, and hyperparameter tuning.

Core claim

The PK-MIQP formulation introduces a piecewise-linear approximation for Gaussian process kernels and admits a corresponding MIQP representation for acquisition functions. The proposed method is applicable to uncertainty-based acquisition functions for any stationary or dot-product kernel. Theoretical regret bounds of the approximation are analyzed.

What carries the argument

The Piecewise-linear Kernel Mixed Integer Quadratic Programming (PK-MIQP) formulation, which replaces the kernel with a piecewise-linear model to produce an equivalent mixed-integer quadratic program for the acquisition function.

If this is right

Uncertainty-based acquisition functions for stationary and dot-product kernels become globally optimizable by off-the-shelf MIQP solvers.
Regret bounds can be stated for the approximated acquisition function.
The same kernel approximation works across multiple uncertainty-based acquisition functions without kernel-specific reformulation.
The approach extends to constrained Bayesian optimization problems as shown in the benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Tighter control of the piecewise-linear breakpoints could trade off solve time against approximation quality in higher dimensions.
The MIQP encoding might be combined with warm-starting from previous iterations to reduce solve times across Bayesian optimization steps.
If the approximation error can be bounded uniformly, the method could support theoretical sample-complexity guarantees that match non-approximated Bayesian optimization.

Load-bearing premise

The piecewise-linear kernel approximation error stays small enough that the derived regret bounds remain meaningful and the MIQP instances can be solved to optimality within practical time limits.

What would settle it

Solve a low-dimensional acquisition function whose true global maximum is known analytically; if the PK-MIQP optimum deviates beyond the approximation error bound or the solver times out without proving optimality, the claim does not hold.

Figures

Figures reproduced from arXiv: 2410.16893 by Calvin Tsay, Joel A. Paulson, Shiqiang Zhang, Yilin Xie.

**Figure 1.** Figure 1: (left) Illustration of piecewise linear approximation of kernel function. (right) Visualization of the effect of kernel approximation on LCB acquisition function. The solution from gradient-based method (orange square) may end up at a local minimum, a sampling-based solution can miss the global minimum, and optimizing approximated LCB using global model (red star) will provide the global solution. An alter… view at source ↗

**Figure 2.** Figure 2: (top) Matérn 3/2 kernel function divided into 3 parts. (bottom) The second-order derivative of Matérn 3/2 kernel function. Parts within threshold are considered as linear. Since the curvature of the kernel is proportional to the second-order derivative of k(·), we set a threshold ϵk to define “near-linear” parts, i.e., segments of the kernel with |k ′′(r)| ≤ ϵk. We empirically choose ϵk = 1.5 exp(−2)σ 2 f … view at source ↗

**Figure 3.** Figure 3: Numerical results on Bayesian optimization using PK-MIQP and the state-of-the-art minimizers. The mean [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗

read the original abstract

Bayesian optimization relies on iteratively constructing and optimizing an acquisition function. The latter turns out to be a challenging, non-convex optimization problem itself. Despite the relative importance of this step, most algorithms employ sampling- or gradient-based methods, which do not provably converge to global optima. This work investigates mixed-integer programming (MIP) as a paradigm for global acquisition function optimization. Specifically, our Piecewise-linear Kernel Mixed Integer Quadratic Programming (PK-MIQP) formulation introduces a piecewise-linear approximation for Gaussian process kernels and admits a corresponding MIQP representation for acquisition functions. The proposed method is applicable to uncertainty-based acquisition functions for any stationary or dot-product kernel. We analyze the theoretical regret bounds of the proposed approximation, and empirically demonstrate the framework on synthetic functions, constrained benchmarks, and a hyperparameter tuning task.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a PK-MIQP formulation that turns piecewise-linear kernel approximations into mixed-integer quadratic programs for global acquisition optimization, but the regret bounds rest on unquantified approximation error.

read the letter

The main takeaway is that this work supplies a concrete way to recast acquisition function maximization as an MIQP by approximating the GP kernel with piecewise linear segments. The claim is that this works for any stationary or dot-product kernel and yields global solutions where sampling or gradient methods do not guarantee them. They also state regret bounds for the approximation and run tests on synthetic functions, constrained problems, and hyperparameter tuning. That combination is new enough to stand out from prior MIP or approximation work in BO. The formulation itself looks like a clean technical step that could be implemented and checked. The experiments cover a reasonable range of settings, which is better than abstract-only claims. The soft spot sits in the regret analysis. The abstract says bounds are derived but supplies no explicit control on how the piecewise-linear error scales with pieces, dimension, or length-scale. If that error term is not shown to remain small relative to the usual sublinear regret, the guarantees become formal rather than informative. The stress-test note correctly flags this gap from the abstract alone. Scalability of the resulting MIQPs is another open question not addressed in the provided text. This paper is aimed at people working on Bayesian optimization who care about deterministic global inner optimization rather than local search. Readers already comfortable with MIP solvers and kernel approximations will get the most out of it. The idea is grounded enough and the claims are specific enough that it deserves a serious referee to examine the derivations and full experimental details. I would send it to peer review.

Referee Report

1 major / 1 minor

Summary. The paper proposes a Piecewise-linear Kernel Mixed Integer Quadratic Programming (PK-MIQP) formulation that approximates Gaussian process kernels via piecewise-linear functions, enabling a mixed-integer quadratic programming representation for global optimization of uncertainty-based acquisition functions. The method applies to any stationary or dot-product kernel. It claims to analyze theoretical regret bounds of the approximation and provides empirical demonstrations on synthetic functions, constrained benchmarks, and hyperparameter tuning.

Significance. If the approximation error is controlled such that the regret bounds remain informative and the MIQP instances are tractable, the work would offer a rigorous global optimization paradigm for acquisition functions in Bayesian optimization, moving beyond local or sampling-based methods. The explicit MIQP encoding for a range of kernels is a concrete technical contribution.

major comments (1)

[Regret bounds analysis] Regret bounds analysis: the central claim that theoretical regret bounds are analyzed for the proposed approximation is load-bearing, yet the manuscript provides no explicit bound on the piecewise-linear kernel approximation error, no Lipschitz constant for the approximation, and no scaling of the error with the number of pieces, input dimension, or length-scale parameters. Without such quantification it is impossible to verify whether the derived bounds remain sublinear or are dominated by the approximation error.

minor comments (1)

The abstract states that the method 'admits a corresponding MIQP representation' but does not indicate the size of the resulting MIQP (number of binary variables, constraints) as a function of the number of pieces or data points; adding this scaling information would improve clarity.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their constructive comments. We address the single major comment below.

read point-by-point responses

Referee: Regret bounds analysis: the central claim that theoretical regret bounds are analyzed for the proposed approximation is load-bearing, yet the manuscript provides no explicit bound on the piecewise-linear kernel approximation error, no Lipschitz constant for the approximation, and no scaling of the error with the number of pieces, input dimension, or length-scale parameters. Without such quantification it is impossible to verify whether the derived bounds remain sublinear or are dominated by the approximation error.

Authors: We agree that the manuscript does not supply an explicit error bound, Lipschitz constant, or scaling law for the piecewise-linear kernel approximation. The current regret analysis assumes the approximation error can be controlled but does not quantify it. In the revision we will add a dedicated subsection deriving an explicit uniform approximation error bound for the piecewise-linear kernel (including its dependence on the number of pieces, input dimension, and length-scale), a corresponding Lipschitz constant, and the resulting condition under which the overall regret remains sublinear. This will make the theoretical claims fully verifiable. revision: yes

Circularity Check

0 steps flagged

No circularity: independent formulation of PK-MIQP approximation and regret analysis

full rationale

The paper introduces a piecewise-linear kernel approximation leading to an MIQP formulation for acquisition functions. This is constructed from standard MIP techniques applied to stationary/dot-product kernels and is not defined in terms of its own outputs or fitted parameters. The regret bound analysis is presented as a separate theoretical result without reducing to self-referential definitions, self-citations, or renaming of known empirical patterns. No load-bearing steps reduce by construction to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that kernels are stationary or dot-product type and that a piecewise-linear approximation can be constructed whose error permits meaningful regret bounds; no free parameters or invented entities are identifiable from the abstract.

axioms (1)

domain assumption Gaussian process kernels are stationary or dot-product kernels
Abstract states applicability to any stationary or dot-product kernel for uncertainty-based acquisition functions.

pith-pipeline@v0.9.0 · 5675 in / 1299 out tokens · 41498 ms · 2026-05-23T19:05:36.293821+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 1 internal anchor

[1]

Ament, S

S. Ament, S. Daulton, D. Eriksson, M. Balandat, and E. Bakshy. Unexpected improvements to expected improvement for B ayesian optimization. In NeurIPS, 2024

work page 2024
[2]

Beale and J

E. Beale and J. Tomlin. Special facilities in a general mathematical programming system for nonconvex problems using ordered sets of variables. Operational Research, 69: 0 447--454, 1969

work page 1969
[3]

Belotti, C

P. Belotti, C. Kirches, S. Leyffer, J. Linderoth, J. Luedtke, and A. Mahajan. Mixed-integer nonlinear optimization. Acta Numerica, 22: 0 1--131, 2013

work page 2013
[4]

Bradford, A

E. Bradford, A. M. Schweidtmann, and A. Lapkin. Efficient multiobjective optimization employing G aussian processes, spectral sampling and a genetic algorithm. Journal of Global Optimization, 71 0 (2): 0 407--438, 2018

work page 2018
[5]

R. H. Byrd, M. E. Hribar, and J. Nocedal. An interior point algorithm for large-scale nonlinear programming. SIAM Journal on Optimization, 9: 0 877--900, 1999

work page 1999
[6]

Cartis, E

C. Cartis, E. Massart, and A. Otemissov. Global optimization using random embeddings. Mathematical Programming, 200 0 (2): 0 781--829, 2023

work page 2023
[7]

Colliandre and C

L. Colliandre and C. Muller. B ayesian optimization in drug discovery. Methods in Molecular Biology, 2716: 0 101--136, 2024

work page 2024
[8]

Daulton, X

S. Daulton, X. Wan, D. Eriksson, M. Balandat, M. A. Osborne, and E. Bakshy. Bayesian optimization over discrete and mixed spaces via probabilistic reparameterization. In NeurIPS, 2022

work page 2022
[9]

De Ath, R

G. De Ath, R. M. Everson, A. A. M. Rahat, and J. E. Fieldsend. Greed is good: Exploration and exploitation trade-offs in B ayesian optimisation. ACM Transactions on Evolutionary Learning and Optimization, 1: 0 1–22, 2021

work page 2021
[10]

D. K. Duvenaud, H. Nickisch, and C. Rasmussen. Additive G aussian processes. Advances in neural information processing systems, 24, 2011

work page 2011
[11]

Eriksson and M

D. Eriksson and M. Jankowiak. High-dimensional B ayesian optimization with sparse axis-aligned subspaces. In UAI, 2021

work page 2021
[12]

Eriksson, M

D. Eriksson, M. Pearce, J. Gardner, R. D. Turner, and M. Poloczek. Scalable global optimization via local B ayesian optimization. In NeurIPS, 2019

work page 2019
[13]

J. J. H. Forrest, J. Hirst, and J. A. Tomlin. Practical solution of large mixed integer programming problems with UMPIRE . Management Science, 20 0 (5): 0 736--773, 1974

work page 1974
[14]

P. I. Frazier. A tutorial on B ayesian optimization. arXiv preprint arXiv:1807.02811, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[15]

Gurobi optimizer reference manual , 2024

Gurobi Optimization, LLC . Gurobi optimizer reference manual , 2024. URL https://www.gurobi.com

work page 2024
[16]

D. R. Jones, M. Schonlau, and W. J. Welch. Efficient global optimization of expensive black-box functions. Journal of Global Optimization, 13: 0 455--492, 1998

work page 1998
[17]

Kandasamy, J

K. Kandasamy, J. Schneider, and B. P \'o czos. High dimensional B ayesian optimisation and bandits via additive models. In ICML, 2015

work page 2015
[18]

Kim and S

J. Kim and S. Choi. On local optimizers of acquisition functions in B ayesian optimization. In Machine Learning and Knowledge Discovery in Databases, 2021

work page 2021
[19]

D. P. Kingma and J. Ba. Adam: a method for stochastic optimization. In ICLR, 2015

work page 2015
[20]

D. Kraft. A software package for sequential quadratic programming. Wiss. Berichtswesen d. DFVLR, 1988

work page 1988
[21]

K. Lang. Newsweeder: Learning to filter netnews. In Proceedings of the Twelfth International Conference on Machine Learning, pages 331--339, 1995

work page 1995
[22]

Mathesen, G

L. Mathesen, G. Pedrielli, S. H. Ng, and Z. B. Zabinsky. Stochastic optimization with adaptive restart: A framework for integrated local and global learning. Journal of Global Optimization, 79: 0 87–110, 2021

work page 2021
[23]

A. G. d. G. Matthews, M. van der Wilk , T. Nickson, K. Fujii, A. Boukouvalas , P. Le \'o n-Villagr \'a , Z. Ghahramani, and J. Hensman. GP flow: A G aussian process library using T ensor F low . Journal of Machine Learning Research, 18 0 (40): 0 1--6, 2017

work page 2017
[24]

A. M. K. Nambiar, C. P. Breen, T. Hart, T. Kulesza, T. F. Jamison, and K. F. Jensen. B ayesian optimization of computer-proposed multistep synthetic routes on an automated robotic flow platform. ACS Central Science, 8 0 (6): 0 825--836, 2022

work page 2022
[25]

J. A. Nelder and R. Mead. A simplex method for function minimization . Comput. J., 7: 0 308--313, 1965

work page 1965
[26]

C. Oh, E. Gavves, and M. Welling. BOCK : B ayesian optimization with cylindrical kernels. In ICML, 2018

work page 2018
[27]

T. P. Papalexopoulos, C. Tjandraatmadja, R. Anderson, J. P. Vielma, and D. Belanger. Constrained discrete black-box optimization using mixed-integer programming. In ICML, 2022

work page 2022
[28]

J. A. Paulson and C. Tsay. Bayesian optimization as a flexible and efficient design framework for sustainable process systems. arXiv preprint arXiv:2401.16373, 2024

work page arXiv 2024
[29]

Pedregosa, G

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: machine learning in P ython. Journal of Machine Learning Research, 12, 2011

work page 2011
[30]

M. J. Powell. A direct search optimization method that models the objective and constraint functions by linear interpolation. Springer, 1994

work page 1994
[31]

M. P. Ranjit, G. Ganapathy, K. Sridhar, and V. Arumugham. Efficient deep learning hyperparameter tuning using cloud infrastructure: intelligent distributed hyperparameter tuning with B ayesian optimization in the cloud. In IEEE 12th International Conference on Cloud Computing (CLOUD), 2019

work page 2019
[32]

Schittkowski

K. Schittkowski. 306 test problems for nonlinear programming with optimal solutions - U ser's guide, 2008. URL https://klaus-schittkowski.de/tpnp.htm

work page 2008
[33]

Schulz, M

E. Schulz, M. Speekenbrink, and A. Krause. A tutorial on G aussian process regression: Modelling, exploring, and exploiting functions. Journal of Mathematical Psychology, 85: 0 1--16, 2018

work page 2018
[34]

A. M. Schweidtmann, D. Bongartz, D. Grothe, T. Kerkenhoff, X. Lin, J. Najman, and A. Mitsos. Deterministic global optimization with G aussian processes embedded. Mathematical Programming Computation, 13 0 (3): 0 553--581, 2021

work page 2021
[35]

X. Song, Q. Zhang, C. Lee, E. Fertig, T.-K. Huang, L. Belenki, G. Kochanski, S. Ariafar, S. Vasudevan, S. Perel, and D. Golovin. The V izier G aussian process bandit algorithm, 2024. URL https://arxiv.org/abs/2408.11527

work page arXiv 2024
[36]

Spagnol, R

A. Spagnol, R. L. Riche, and S. D. Veiga. Global sensitivity analysis for optimization with variable selection. SIAM/ASA Journal on Uncertainty Quantification, 7: 0 417–443, 2019

work page 2019
[37]

Srinivas, A

N. Srinivas, A. Krause, S. Kakade, and M. Seeger. G aussian process optimization in the bandit setting: no regret and experimental design. In ICML, 2010

work page 2010
[38]

Srinivas, A

N. Srinivas, A. Krause, S. M. Kakade, and M. W. Seeger. Information-theoretic regret bounds for G aussian process optimization in the bandit setting. IEEE Transactions on Information Theory, 58: 0 3250–3265, 2012

work page 2012
[39]

Thebelt, C

A. Thebelt, C. Tsay, R. Lee, N. Sudermann-Merx, D. Walz, B. Shafei, and R. Misener. Tree ensemble kernels for bayesian optimization with known constraints over mixed-feature spaces. Advances in Neural Information Processing Systems, 35: 0 37401--37415, 2022 a

work page 2022
[40]

Thebelt, C

A. Thebelt, C. Tsay, R. M. Lee, N. Sudermann-Merx, D. Walz, T. Tranter, and R. Misener. Multi-objective constrained optimization for energy applications via tree ensembles. Applied Energy, 306: 0 118061, 2022 b

work page 2022
[41]

Virtanen, R

P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, S. J. van der Walt , M. Brett, J. Wilson, K. J. Millman, N. Mayorov, A. R. J. Nelson, E. Jones, R. Kern, E. Larson, C. J. Carey, \.I . Polat, Y. Feng, E. W. Moore, J. VanderPlas , D. Laxalde, J. Perktold, R. Cimrman, I. Henrik...

work page 2020
[42]

Wilson, F

J. Wilson, F. Hutter, and M. Deisenroth. Maximizing acquisition functions for B ayesian optimization. In NeurIPS, 2018

work page 2018
[43]

C. Zhu, R. H. Byrd, P. Lu, and J. Nocedal. Algorithm 778: L-BFGS-B : Fortran subroutines for large-scale bound-constrained optimization. ACM Transactions on mathematical software (TOMS), 23: 0 550–560, 1997

work page 1997

[1] [1]

Ament, S

S. Ament, S. Daulton, D. Eriksson, M. Balandat, and E. Bakshy. Unexpected improvements to expected improvement for B ayesian optimization. In NeurIPS, 2024

work page 2024

[2] [2]

Beale and J

E. Beale and J. Tomlin. Special facilities in a general mathematical programming system for nonconvex problems using ordered sets of variables. Operational Research, 69: 0 447--454, 1969

work page 1969

[3] [3]

Belotti, C

P. Belotti, C. Kirches, S. Leyffer, J. Linderoth, J. Luedtke, and A. Mahajan. Mixed-integer nonlinear optimization. Acta Numerica, 22: 0 1--131, 2013

work page 2013

[4] [4]

Bradford, A

E. Bradford, A. M. Schweidtmann, and A. Lapkin. Efficient multiobjective optimization employing G aussian processes, spectral sampling and a genetic algorithm. Journal of Global Optimization, 71 0 (2): 0 407--438, 2018

work page 2018

[5] [5]

R. H. Byrd, M. E. Hribar, and J. Nocedal. An interior point algorithm for large-scale nonlinear programming. SIAM Journal on Optimization, 9: 0 877--900, 1999

work page 1999

[6] [6]

Cartis, E

C. Cartis, E. Massart, and A. Otemissov. Global optimization using random embeddings. Mathematical Programming, 200 0 (2): 0 781--829, 2023

work page 2023

[7] [7]

Colliandre and C

L. Colliandre and C. Muller. B ayesian optimization in drug discovery. Methods in Molecular Biology, 2716: 0 101--136, 2024

work page 2024

[8] [8]

Daulton, X

S. Daulton, X. Wan, D. Eriksson, M. Balandat, M. A. Osborne, and E. Bakshy. Bayesian optimization over discrete and mixed spaces via probabilistic reparameterization. In NeurIPS, 2022

work page 2022

[9] [9]

De Ath, R

G. De Ath, R. M. Everson, A. A. M. Rahat, and J. E. Fieldsend. Greed is good: Exploration and exploitation trade-offs in B ayesian optimisation. ACM Transactions on Evolutionary Learning and Optimization, 1: 0 1–22, 2021

work page 2021

[10] [10]

D. K. Duvenaud, H. Nickisch, and C. Rasmussen. Additive G aussian processes. Advances in neural information processing systems, 24, 2011

work page 2011

[11] [11]

Eriksson and M

D. Eriksson and M. Jankowiak. High-dimensional B ayesian optimization with sparse axis-aligned subspaces. In UAI, 2021

work page 2021

[12] [12]

Eriksson, M

D. Eriksson, M. Pearce, J. Gardner, R. D. Turner, and M. Poloczek. Scalable global optimization via local B ayesian optimization. In NeurIPS, 2019

work page 2019

[13] [13]

J. J. H. Forrest, J. Hirst, and J. A. Tomlin. Practical solution of large mixed integer programming problems with UMPIRE . Management Science, 20 0 (5): 0 736--773, 1974

work page 1974

[14] [14]

P. I. Frazier. A tutorial on B ayesian optimization. arXiv preprint arXiv:1807.02811, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[15] [15]

Gurobi optimizer reference manual , 2024

Gurobi Optimization, LLC . Gurobi optimizer reference manual , 2024. URL https://www.gurobi.com

work page 2024

[16] [16]

D. R. Jones, M. Schonlau, and W. J. Welch. Efficient global optimization of expensive black-box functions. Journal of Global Optimization, 13: 0 455--492, 1998

work page 1998

[17] [17]

Kandasamy, J

K. Kandasamy, J. Schneider, and B. P \'o czos. High dimensional B ayesian optimisation and bandits via additive models. In ICML, 2015

work page 2015

[18] [18]

Kim and S

J. Kim and S. Choi. On local optimizers of acquisition functions in B ayesian optimization. In Machine Learning and Knowledge Discovery in Databases, 2021

work page 2021

[19] [19]

D. P. Kingma and J. Ba. Adam: a method for stochastic optimization. In ICLR, 2015

work page 2015

[20] [20]

D. Kraft. A software package for sequential quadratic programming. Wiss. Berichtswesen d. DFVLR, 1988

work page 1988

[21] [21]

K. Lang. Newsweeder: Learning to filter netnews. In Proceedings of the Twelfth International Conference on Machine Learning, pages 331--339, 1995

work page 1995

[22] [22]

Mathesen, G

L. Mathesen, G. Pedrielli, S. H. Ng, and Z. B. Zabinsky. Stochastic optimization with adaptive restart: A framework for integrated local and global learning. Journal of Global Optimization, 79: 0 87–110, 2021

work page 2021

[23] [23]

A. G. d. G. Matthews, M. van der Wilk , T. Nickson, K. Fujii, A. Boukouvalas , P. Le \'o n-Villagr \'a , Z. Ghahramani, and J. Hensman. GP flow: A G aussian process library using T ensor F low . Journal of Machine Learning Research, 18 0 (40): 0 1--6, 2017

work page 2017

[24] [24]

A. M. K. Nambiar, C. P. Breen, T. Hart, T. Kulesza, T. F. Jamison, and K. F. Jensen. B ayesian optimization of computer-proposed multistep synthetic routes on an automated robotic flow platform. ACS Central Science, 8 0 (6): 0 825--836, 2022

work page 2022

[25] [25]

J. A. Nelder and R. Mead. A simplex method for function minimization . Comput. J., 7: 0 308--313, 1965

work page 1965

[26] [26]

C. Oh, E. Gavves, and M. Welling. BOCK : B ayesian optimization with cylindrical kernels. In ICML, 2018

work page 2018

[27] [27]

T. P. Papalexopoulos, C. Tjandraatmadja, R. Anderson, J. P. Vielma, and D. Belanger. Constrained discrete black-box optimization using mixed-integer programming. In ICML, 2022

work page 2022

[28] [28]

J. A. Paulson and C. Tsay. Bayesian optimization as a flexible and efficient design framework for sustainable process systems. arXiv preprint arXiv:2401.16373, 2024

work page arXiv 2024

[29] [29]

Pedregosa, G

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: machine learning in P ython. Journal of Machine Learning Research, 12, 2011

work page 2011

[30] [30]

M. J. Powell. A direct search optimization method that models the objective and constraint functions by linear interpolation. Springer, 1994

work page 1994

[31] [31]

M. P. Ranjit, G. Ganapathy, K. Sridhar, and V. Arumugham. Efficient deep learning hyperparameter tuning using cloud infrastructure: intelligent distributed hyperparameter tuning with B ayesian optimization in the cloud. In IEEE 12th International Conference on Cloud Computing (CLOUD), 2019

work page 2019

[32] [32]

Schittkowski

K. Schittkowski. 306 test problems for nonlinear programming with optimal solutions - U ser's guide, 2008. URL https://klaus-schittkowski.de/tpnp.htm

work page 2008

[33] [33]

Schulz, M

E. Schulz, M. Speekenbrink, and A. Krause. A tutorial on G aussian process regression: Modelling, exploring, and exploiting functions. Journal of Mathematical Psychology, 85: 0 1--16, 2018

work page 2018

[34] [34]

A. M. Schweidtmann, D. Bongartz, D. Grothe, T. Kerkenhoff, X. Lin, J. Najman, and A. Mitsos. Deterministic global optimization with G aussian processes embedded. Mathematical Programming Computation, 13 0 (3): 0 553--581, 2021

work page 2021

[35] [35]

X. Song, Q. Zhang, C. Lee, E. Fertig, T.-K. Huang, L. Belenki, G. Kochanski, S. Ariafar, S. Vasudevan, S. Perel, and D. Golovin. The V izier G aussian process bandit algorithm, 2024. URL https://arxiv.org/abs/2408.11527

work page arXiv 2024

[36] [36]

Spagnol, R

A. Spagnol, R. L. Riche, and S. D. Veiga. Global sensitivity analysis for optimization with variable selection. SIAM/ASA Journal on Uncertainty Quantification, 7: 0 417–443, 2019

work page 2019

[37] [37]

Srinivas, A

N. Srinivas, A. Krause, S. Kakade, and M. Seeger. G aussian process optimization in the bandit setting: no regret and experimental design. In ICML, 2010

work page 2010

[38] [38]

Srinivas, A

N. Srinivas, A. Krause, S. M. Kakade, and M. W. Seeger. Information-theoretic regret bounds for G aussian process optimization in the bandit setting. IEEE Transactions on Information Theory, 58: 0 3250–3265, 2012

work page 2012

[39] [39]

Thebelt, C

A. Thebelt, C. Tsay, R. Lee, N. Sudermann-Merx, D. Walz, B. Shafei, and R. Misener. Tree ensemble kernels for bayesian optimization with known constraints over mixed-feature spaces. Advances in Neural Information Processing Systems, 35: 0 37401--37415, 2022 a

work page 2022

[40] [40]

Thebelt, C

A. Thebelt, C. Tsay, R. M. Lee, N. Sudermann-Merx, D. Walz, T. Tranter, and R. Misener. Multi-objective constrained optimization for energy applications via tree ensembles. Applied Energy, 306: 0 118061, 2022 b

work page 2022

[41] [41]

Virtanen, R

P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, S. J. van der Walt , M. Brett, J. Wilson, K. J. Millman, N. Mayorov, A. R. J. Nelson, E. Jones, R. Kern, E. Larson, C. J. Carey, \.I . Polat, Y. Feng, E. W. Moore, J. VanderPlas , D. Laxalde, J. Perktold, R. Cimrman, I. Henrik...

work page 2020

[42] [42]

Wilson, F

J. Wilson, F. Hutter, and M. Deisenroth. Maximizing acquisition functions for B ayesian optimization. In NeurIPS, 2018

work page 2018

[43] [43]

C. Zhu, R. H. Byrd, P. Lu, and J. Nocedal. Algorithm 778: L-BFGS-B : Fortran subroutines for large-scale bound-constrained optimization. ACM Transactions on mathematical software (TOMS), 23: 0 550–560, 1997

work page 1997