pith. machine review for the scientific record.

arxiv: 2605.00855 · v1 · submitted 2026-04-21 · 🧮 math.OC · cs.LG · stat.ML

Recognition: unknown

An Efficient Spatial Branch-and-Bound Algorithm for Global Optimization of Gaussian Process Posterior Mean Functions

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 02:43 UTC · model grok-4.3

classification 🧮 math.OC · cs.LG · stat.ML
keywords Gaussian process · global optimization · branch-and-bound · posterior mean · piecewise-linear relaxation · deterministic optimization · spatial branch-and-bound

The pith

The PALM-Mean algorithm delivers valid lower bounds and ε-global convergence for optimizing Gaussian process posterior mean functions via hybrid relaxations in spatial branch-and-bound.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an exact method to globally minimize the posterior mean of a trained Gaussian process over a hyperrectangular domain. It embeds a piecewise-analytic lower-bounding strategy into reduced-space spatial branch-and-bound, where at each node the kernel terms are split so that locally important ones are relaxed with sign-aware piecewise-linear functions in a scalar distance variable and the rest receive closed-form analytic bounds. This hybrid relaxation produces a valid underestimator while keeping the size of each relaxed subproblem manageable. The authors prove that the lower bounds are valid and that the overall algorithm converges to within any prescribed tolerance of the global minimum. The method is demonstrated to scale better than general-purpose deterministic solvers when the number of training points is large.
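The reduced-space search loop can be sketched in a few lines. The `node_lower_bound` below is a crude sign-aware interval bound used as a stand-in for the paper's hybrid PWL/analytic relaxation (the actual PALM-Mean bound is tighter and structured differently); the branching and pruning logic is the standard spatial B&B skeleton the paper builds on.

```python
import heapq
import numpy as np

def posterior_mean(x, X_train, alpha, ell=0.5):
    # mu(x) = sum_i alpha_i * k(x, x_i) for an RBF kernel (zero prior mean assumed).
    d2 = np.sum((X_train - x) ** 2, axis=1)
    return float(alpha @ np.exp(-d2 / (2 * ell**2)))

def node_lower_bound(lo, hi, X_train, alpha, ell=0.5):
    # Stand-in sign-aware interval bound over the box [lo, hi]:
    # positive weights use the kernel's minimum over the box, negative its maximum.
    d2_min = np.sum(np.maximum(0.0, np.maximum(lo - X_train, X_train - hi)) ** 2, axis=1)
    d2_max = np.sum(np.maximum(np.abs(X_train - lo), np.abs(X_train - hi)) ** 2, axis=1)
    k_hi = np.exp(-d2_min / (2 * ell**2))  # RBF decreases with distance
    k_lo = np.exp(-d2_max / (2 * ell**2))
    return float(np.sum(np.where(alpha > 0, alpha * k_lo, alpha * k_hi)))

def branch_and_bound(X_train, alpha, lo, hi, eps=1e-4, max_nodes=20000):
    best_x = (lo + hi) / 2
    best_f = posterior_mean(best_x, X_train, alpha)
    heap = [(node_lower_bound(lo, hi, X_train, alpha), tuple(lo), tuple(hi))]
    nodes = 0
    while heap and nodes < max_nodes:
        nodes += 1
        lb, tlo, thi = heapq.heappop(heap)
        if lb >= best_f - eps:          # node cannot improve the incumbent
            continue
        nlo, nhi = np.array(tlo), np.array(thi)
        mid = (nlo + nhi) / 2
        f_mid = posterior_mean(mid, X_train, alpha)
        if f_mid < best_f:
            best_f, best_x = f_mid, mid
        j = int(np.argmax(nhi - nlo))   # bisect the widest coordinate
        left_hi, right_lo = nhi.copy(), nlo.copy()
        left_hi[j] = right_lo[j] = mid[j]
        for clo, chi in ((nlo, left_hi), (right_lo, nhi)):
            clb = node_lower_bound(clo, chi, X_train, alpha)
            if clb < best_f - eps:
                heapq.heappush(heap, (clb, tuple(clo), tuple(chi)))
    return best_x, best_f
```

When the heap empties, every discarded node had a lower bound within eps of the incumbent, which is exactly the ε-global termination certificate the paper's convergence theorem formalizes.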

Core claim

We propose PALM-Mean, a piecewise-analytic lower-bounding framework embedded in reduced-space spatial branch-and-bound. At each node, kernel terms that are locally important are replaced by a sign-aware piecewise-linear relaxation in an appropriate scalar distance variable, while the remaining terms are bounded analytically in closed form. We show this hybrid approach yields a valid lower bound for the posterior mean, while limiting the size of the branch-and-bound subproblems. We establish validity of the node lower bounds and ε-global convergence of the resulting algorithm.

What carries the argument

PALM-Mean hybrid lower-bounding scheme, which partitions kernel terms into sign-aware piecewise-linear relaxations in a scalar distance variable for locally important terms and analytic bounds for the remainder.
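One way to read "sign-aware piecewise-linear relaxation in a scalar distance variable" (an illustrative guess at the construction, not the paper's exact one): in the squared distance s, an RBF kernel g(s) = exp(-s/(2ℓ²)) is convex, so tangent lines give a piecewise-linear underestimator and the chord an overestimator; the sign of the weight αᵢ determines which side yields a lower bound on the weighted term.

```python
import numpy as np

def rbf_of_s(s, ell=1.0):
    # RBF kernel as a function of squared distance s = ||x - x_i||^2.
    # g(s) = exp(-s / (2 ell^2)) is convex and decreasing for s >= 0.
    return np.exp(-np.asarray(s) / (2 * ell**2))

def tangent_underestimator(s, s_pts, ell=1.0):
    # Pointwise max of tangent lines at breakpoints s_pts: a piecewise-linear
    # UNDERestimator of the convex function g (tangents lie below it).
    g = rbf_of_s(s_pts, ell)
    dg = -g / (2 * ell**2)  # g'(s)
    lines = g + dg * (np.asarray(s)[..., None] - np.asarray(s_pts))
    return np.max(lines, axis=-1)

def secant_overestimator(s, s_lo, s_hi, ell=1.0):
    # Chord over [s_lo, s_hi]: lies above the convex function g.
    g_lo, g_hi = rbf_of_s(s_lo, ell), rbf_of_s(s_hi, ell)
    t = (np.asarray(s) - s_lo) / (s_hi - s_lo)
    return (1 - t) * g_lo + t * g_hi

def sign_aware_bound(alpha_i, s, s_lo, s_hi, ell=1.0, n_pts=4):
    # Lower bound on alpha_i * g(s) for s in [s_lo, s_hi]:
    # positive weight -> need a lower bound on g -> tangent PWL;
    # negative weight -> need an upper bound on g -> secant.
    if alpha_i >= 0:
        pts = np.linspace(s_lo, s_hi, n_pts)
        return alpha_i * tangent_underestimator(s, pts, ell)
    return alpha_i * secant_overestimator(s, s_lo, s_hi, ell)
```

Refining the breakpoints (or the interval) as the node shrinks tightens both sides, which is what makes a relaxation of this shape usable inside branch-and-bound.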

If this is right

  • The node lower bounds remain valid throughout the search tree.
  • The algorithm converges to an ε-global optimum of the posterior mean function.
  • The approach improves scalability with respect to the number of training data points compared to general solvers.
  • Exact global optimization becomes practical for larger Gaussian process models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar term-partitioning ideas could apply to global optimization of other additive nonconvex functions common in machine learning.
  • The reduced-space formulation may help in problems with many input dimensions by focusing branching on relevant variables.
  • Integration with warm-starting or local refinement steps could further improve practical performance on real applications.

Load-bearing premise

Kernel terms can be partitioned into locally important ones that admit tight sign-aware piecewise-linear relaxations in a scalar distance variable while the remainder can be bounded analytically without destroying the overall lower-bound property.
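For the analytic half of that premise, the RBF case is concrete: the exact range of a kernel term over a box is available in closed form from the nearest and farthest points of the box to the training point (an illustrative bound; the paper's analytic bounds may be constructed differently).

```python
import numpy as np

def kernel_range_over_box(lo, hi, x_i, ell=1.0):
    # Exact range of k(x, x_i) = exp(-||x - x_i||^2 / (2 ell^2)) for x in [lo, hi].
    # The RBF kernel is decreasing in distance, so its extremes come from the
    # nearest and farthest points of the box to x_i (computed coordinatewise).
    d2_min = np.sum(np.maximum(0.0, np.maximum(lo - x_i, x_i - hi)) ** 2)
    d2_max = np.sum(np.maximum(np.abs(x_i - lo), np.abs(x_i - hi)) ** 2)
    return np.exp(-d2_max / (2 * ell**2)), np.exp(-d2_min / (2 * ell**2))
```

Because the range is exact, its gap shrinks to zero as the box contracts to a point, which is the consistency property the convergence argument needs from the analytically bounded terms.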

What would settle it

A case where the proposed relaxation produces a lower bound that is strictly greater than the true posterior mean value at some point in the domain.
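That test is mechanical to run: sample the box and look for any point where the claimed node lower bound exceeds the true objective value. The harness below is hypothetical (`lower_bound_fn` stands in for any node bound, including PALM-Mean's); a single returned violation would settle the validity claim negatively.

```python
import numpy as np

def check_underestimator(lower_bound_fn, f, lo, hi, n_samples=2000, seed=0):
    # Falsification probe: return the claimed node lower bound over [lo, hi]
    # together with every sampled point at which it exceeds the true value.
    rng = np.random.default_rng(seed)
    lb = lower_bound_fn(lo, hi)
    X = rng.uniform(lo, hi, size=(n_samples, len(lo)))
    vals = np.array([f(x) for x in X])
    return lb, X[vals < lb - 1e-9]
```

An empty violation set is of course only evidence, not a proof; the proof burden stays with the validity theorem.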

read the original abstract

We study the deterministic global optimization of trained Gaussian process posterior mean functions over hyperrectangular domains. Although the posterior mean function has a compact closed-form representation, its global optimization is challenging because it remains nonlinear and nonconvex. Existing exact deterministic approaches become increasingly difficult to scale as the number of training data points grows, leading to approximation-based methods that improve tractability by optimizing a modified (inexact) objective. In this work, we propose PALM-Mean, a piecewise-analytic lower-bounding framework embedded in reduced-space spatial branch-and-bound. At each node, kernel terms that are locally important are replaced by a sign-aware piecewise-linear relaxation in an appropriate scalar distance variable, while the remaining terms are bounded analytically in closed form. We show this hybrid approach yields a valid lower bound for the posterior mean, while limiting the size of the branch-and-bound subproblems. We establish validity of the node lower bounds and $\varepsilon$-global convergence of the resulting algorithm. Computational results on synthetic benchmarks and real-world application problems show that PALM-Mean improves scalability relative to representative general-purpose deterministic global solvers, particularly as the number of training data points increases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes PALM-Mean, a reduced-space spatial branch-and-bound algorithm for deterministic global optimization of trained Gaussian process posterior mean functions over hyperrectangles. At each node the kernel terms are partitioned into locally important ones, which receive a sign-aware piecewise-linear relaxation in a scalar distance variable, and the remainder, which receive closed-form analytical bounds; the resulting hybrid underestimator is embedded in spatial B&B. The authors claim to prove validity of the node lower bounds and ε-global convergence of the algorithm, and report computational results on synthetic benchmarks and real-world problems showing improved scalability relative to general-purpose deterministic solvers as the number of training points grows.

Significance. If the validity and convergence claims are rigorously established, the work supplies a structurally exploiting deterministic global optimizer for GP posterior means that scales better than black-box solvers while retaining exactness guarantees. The hybrid bounding strategy that limits subproblem size while preserving a consistent underestimator is a concrete technical contribution, and the reported experiments on both synthetic and real instances provide useful evidence of practical improvement.

major comments (1)
  1. [Convergence theorem / proof of ε-global convergence] The ε-global convergence argument (stated in the abstract and presumably detailed in the convergence theorem) requires that the hybrid lower-bound gap vanishes as the node diameter tends to zero. While the sign-aware PWL relaxations on the locally important kernels are consistent by construction, the closed-form analytical bounds on the remaining kernel terms are described only as "closed form," without an explicit demonstration that they become exact (or that their gap tends to zero) when the current hyperrectangle shrinks to a point. If the dynamic partitioning into "locally important" terms does not guarantee that every kernel eventually receives a consistent relaxation, or if the analytical bounds retain a positive gap independent of domain size, the standard B&B convergence theorem does not apply. Please supply the missing limit argument or the precise assumption that ensures consistency.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful and constructive review. We are pleased that the referee recognizes the technical contribution of the hybrid bounding strategy and the practical improvements shown in the experiments. Below we respond to the single major comment.

read point-by-point responses
  1. Referee: The ε-global convergence argument (stated in the abstract and presumably detailed in the convergence theorem) requires that the hybrid lower-bound gap vanishes as the node diameter tends to zero. While the sign-aware PWL relaxations on the locally important kernels are consistent by construction, the closed-form analytical bounds on the remaining kernel terms are described only as "closed form," without an explicit demonstration that they become exact (or that their gap tends to zero) when the current hyperrectangle shrinks to a point. If the dynamic partitioning into "locally important" terms does not guarantee that every kernel eventually receives a consistent relaxation, or if the analytical bounds retain a positive gap independent of domain size, the standard B&B convergence theorem does not apply. Please supply the missing limit argument or the precise assumption that ensures consistency.

    Authors: We agree that an explicit consistency argument for the closed-form analytical bounds is required for a fully rigorous presentation. The manuscript establishes validity of the node lower bounds and states ε-global convergence, but the limit behavior of the analytical bounds on the non-locally-important terms is only implicit. In the revised version we will insert a short lemma immediately before the convergence theorem that proves the gap of these bounds tends to zero as the node diameter approaches zero. The proof relies on the continuity of the kernel functions together with the fact that the closed-form bounds are derived from the exact range of each kernel term over the current hyperrectangle (which contracts to a point). We will also add a clarifying remark on the dynamic partitioning criterion, showing that it does not interfere with overall consistency because the analytical bounds themselves are consistent for any partition. These additions will make the application of the standard spatial B&B convergence theorem fully explicit. revision: yes
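The promised lemma can be sketched in one step from the rebuttal's stated ingredients, continuity of the kernel plus exact-range bounds over the node (a hedged reconstruction; the authors' actual statement may differ):

```latex
\begin{lemma}[Consistency of the analytic bounds; sketch]
Let $k(\cdot, x_i)$ be continuous on the compact domain, and let the analytic bound
on a kernel term over a hyperrectangle $B$ be its exact range
$\left[\min_{x \in B} k(x, x_i),\; \max_{x \in B} k(x, x_i)\right]$.
Then the bound gap satisfies
\[
  \max_{x \in B} k(x, x_i) - \min_{x \in B} k(x, x_i)
  \;\le\; \omega_k\!\big(\operatorname{diam}(B)\big)
  \;\longrightarrow\; 0
  \quad \text{as } \operatorname{diam}(B) \to 0,
\]
where $\omega_k$ is the modulus of continuity of $k(\cdot, x_i)$, which exists
uniformly by compactness. Summing over the finitely many analytically bounded terms,
and combining with the PWL relaxations (consistent by construction), the hybrid node
lower bound converges to the posterior mean value, so the standard spatial
branch-and-bound convergence argument applies.
\end{lemma}
```

Note the key dependency: the argument needs the analytic bounds to be derived from (or dominated by) the exact range over the node, exactly the assumption the rebuttal commits to making explicit.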

Circularity Check

0 steps flagged

No circularity: validity and convergence follow from standard relaxation properties and B&B theory

full rationale

The paper constructs a hybrid lower bound by partitioning kernel terms into locally important ones (relaxed via sign-aware PWL in a scalar distance variable) and remainder terms (bounded analytically in closed form). Validity is shown by verifying each piece is a valid underestimator, and ε-global convergence follows from the standard spatial B&B argument once the lower bound is consistent (gap vanishes as node diameter → 0). No step reduces by definition to a fitted quantity, self-citation chain, or renamed input; the central claims rest on explicit analytic properties of the GP posterior mean and established B&B convergence theorems that are independent of the specific hybrid construction. The derivation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The method rests on standard properties of positive-definite kernels and the geometry of hyperrectangular domains; no new free parameters or invented entities are introduced beyond the algorithmic construction itself.

axioms (2)
  • standard math The posterior mean is a finite sum of kernel evaluations and is therefore continuous and differentiable on the compact domain.
    Invoked implicitly when claiming that analytic bounds exist for non-selected kernel terms.
  • domain assumption A sign-aware piecewise-linear function constructed from the scalar distance variable provides a valid lower bound for each selected kernel term over a sub-rectangle.
    Central to the hybrid relaxation; validity is asserted but the explicit construction is not shown in the abstract.
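The first axiom is just the standard GP regression identity: with weights α = (K + σ²I)⁻¹y, the posterior mean is the finite kernel sum μ(x) = Σᵢ αᵢ k(x, xᵢ). A minimal sketch, assuming a zero prior mean and an RBF kernel:

```python
import numpy as np

def fit_gp_weights(X, y, ell=0.5, noise=1e-6):
    # Zero-mean GP regression: alpha = (K + noise * I)^{-1} y, so the posterior
    # mean is the finite kernel sum mu(x) = sum_i alpha_i * k(x, x_i).
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-d2 / (2 * ell**2))
    return np.linalg.solve(K + noise * np.eye(len(X)), y)

def posterior_mean(x, X, alpha, ell=0.5):
    # Continuity and differentiability follow term by term from the kernel.
    d2 = np.sum((X - x) ** 2, axis=1)
    return float(alpha @ np.exp(-d2 / (2 * ell**2)))
```

With near-zero noise the posterior mean interpolates the training targets, which is the compact closed-form objective the paper sets out to minimize globally.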

pith-pipeline@v0.9.0 · 5519 in / 1527 out tokens · 27909 ms · 2026-05-10T02:43:02.831000+00:00 · methodology

discussion (0)

