Bayesian Optimization in Linear Time

Jesse Schneider; William J. Welch

arxiv: 2605.00237 · v1 · submitted 2026-04-30 · 💻 cs.LG · stat.ML

Bayesian Optimization in Linear Time

Jesse Schneider , William J. Welch This is my paper

Pith reviewed 2026-05-09 20:16 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords Bayesian optimizationlinear complexitybinary partitioningGaussian processesacquisition functionsblack-box optimizationhigh-dimensional optimization

0 comments

The pith

Recursive binary partitioning adapts Bayesian optimization to linear time and better performance on high-dimensional tests.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard Bayesian optimization trains a Gaussian process on all prior evaluations and uses an acquisition function to pick the next point by mixing exploration and exploitation. This produces cubic scaling in the number of points and applies a single global model even though the minimum is typically local. The paper shows that recursively splitting the domain into binary partitions lets both the model and the acquisition function operate inside each subregion. The resulting method runs in linear time and reaches lower objective values than a standard library on seven benchmark functions whose dimensions range from 6 to 124.

Core claim

Using flexible and recursive binary partitioning of the search space, both the modeling and acquisitive aspects of standard Bayesian optimization are adapted to work harmoniously with the partitioning scheme. This change removes the cubic computational bottleneck and aligns the search process more closely with the local character of minimization.

What carries the argument

Flexible recursive binary partitioning of the search space, which localizes both the Gaussian process surrogate and the acquisition function within each subregion.

If this is right

The approach scales to optimization budgets far larger than those feasible with cubic methods.
Better final objective values are obtained on high-dimensional black-box problems.
Computational cost grows linearly rather than cubically with the number of observations collected.
The local nature of minimization is respected while still using uncertainty to guide sampling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Partitioning ideas of this kind could be applied to other surrogate-based sequential methods that currently face cubic scaling limits.
Real engineering or hyperparameter tasks with costly evaluations might now support longer search runs without computational collapse.
Adaptive rules for choosing split locations could further improve performance on functions with sharp local features.

Load-bearing premise

That recursive binary partitioning integrates with Gaussian process modeling and acquisition functions without introducing biases that degrade the quality of the uncertainty-aware search.

What would settle it

Runtime measurements showing super-linear growth as the number of evaluations increases, or the method returning higher final values than standard Bayesian optimization on new test functions with known optima.

Figures

Figures reproduced from arXiv: 2605.00237 by Jesse Schneider, William J. Welch.

**Figure 1.** Figure 1: The binary partitioning in TreeBO proceeds via clustering and non-linear classification, enabling the optimization method to [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗

**Figure 2.** Figure 2: The recursive partitioning of TreeBO implicitly creates a [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Side-by-side boxplots of the results for the Ackley test [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗

**Figure 4.** Figure 4: Side-by-side boxplots of the results for the Automotive [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗

**Figure 5.** Figure 5: The smallest value of the Automotive function observed for (a) DiceOptim and (b) TreeBO. The faint lines represent individual [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗

**Figure 6.** Figure 6: The delta between the smallest value of the Automotive function observed for DiceOptim and TreeBO. For each pair of runs, [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗

**Figure 7.** Figure 7: The smallest value of the Ackley function observed for (a) DiceOptim and (b) TreeBO. The faint lines represent individual runs [PITH_FULL_IMAGE:figures/full_fig_p016_7.png] view at source ↗

**Figure 8.** Figure 8: The delta between the smallest value of the Ackley function observed for DiceOptim and TreeBO. For each pair of runs, the [PITH_FULL_IMAGE:figures/full_fig_p017_8.png] view at source ↗

**Figure 9.** Figure 9: The smallest value of the Hartmann function observed for (a) DiceOptim and (b) TreeBO. The faint lines represent individual [PITH_FULL_IMAGE:figures/full_fig_p017_9.png] view at source ↗

**Figure 10.** Figure 10: The delta between the smallest value of the Hartmann function observed for DiceOptim and TreeBO. For each pair of runs, the [PITH_FULL_IMAGE:figures/full_fig_p018_10.png] view at source ↗

**Figure 11.** Figure 11: The smallest value of the Rastrigin function observed for (a) DiceOptim and (b) TreeBO. The vertical axis is on a logarithmic [PITH_FULL_IMAGE:figures/full_fig_p018_11.png] view at source ↗

**Figure 12.** Figure 12: The delta between the smallest value of the Rastrigin function observed for DiceOptim and TreeBO. The vertical axis is on [PITH_FULL_IMAGE:figures/full_fig_p019_12.png] view at source ↗

**Figure 13.** Figure 13: The smallest value of the Schwefel function observed for (a) DiceOptim and (b) TreeBO. The faint lines represent individual [PITH_FULL_IMAGE:figures/full_fig_p019_13.png] view at source ↗

**Figure 14.** Figure 14: The delta between the smallest value of the Schwefel function observed for DiceOptim and TreeBO. For each pair of runs, the [PITH_FULL_IMAGE:figures/full_fig_p020_14.png] view at source ↗

**Figure 15.** Figure 15: The smallest value of the Levy function observed for (a) DiceOptim and (b) TreeBO. The vertical axis is on a logarithmic scale. [PITH_FULL_IMAGE:figures/full_fig_p020_15.png] view at source ↗

**Figure 16.** Figure 16: The delta between the smallest value of the Levy function observed for DiceOptim and TreeBO. The vertical axis is on a [PITH_FULL_IMAGE:figures/full_fig_p021_16.png] view at source ↗

**Figure 17.** Figure 17: The smallest value of the Michalewicz function observed for (a) DiceOptim and (b) TreeBO. The faint lines represent individual [PITH_FULL_IMAGE:figures/full_fig_p021_17.png] view at source ↗

**Figure 18.** Figure 18: The delta between the smallest value of the Michalewicz function observed for DiceOptim and TreeBO. For each pair of [PITH_FULL_IMAGE:figures/full_fig_p022_18.png] view at source ↗

**Figure 19.** Figure 19: Side-by-side boxplots of the results for the Automotive test function in [PITH_FULL_IMAGE:figures/full_fig_p022_19.png] view at source ↗

**Figure 20.** Figure 20: Side-by-side boxplots of the results for the Ackley test function in [PITH_FULL_IMAGE:figures/full_fig_p023_20.png] view at source ↗

**Figure 21.** Figure 21: Side-by-side boxplots of the results for the Hartmann test function in [PITH_FULL_IMAGE:figures/full_fig_p023_21.png] view at source ↗

**Figure 22.** Figure 22: Side-by-side boxplots of the results for the Rastrigin test function in [PITH_FULL_IMAGE:figures/full_fig_p024_22.png] view at source ↗

**Figure 23.** Figure 23: Side-by-side boxplots of the results for the Schwefel test function in [PITH_FULL_IMAGE:figures/full_fig_p024_23.png] view at source ↗

**Figure 24.** Figure 24: Side-by-side boxplots of the results for the Levy test function in [PITH_FULL_IMAGE:figures/full_fig_p025_24.png] view at source ↗

**Figure 25.** Figure 25: Side-by-side boxplots of the results for the Michalewicz test function in [PITH_FULL_IMAGE:figures/full_fig_p025_25.png] view at source ↗

read the original abstract

Bayesian optimization is a sequential method for minimizing objective functions that are expensive to evaluate and about which few assumptions can be made. By using all gathered data to train a Gaussian process model for the function and adaptively employing a mixture of global exploration and local exploitation, this method has been used for optimization in many fields including machine learning, automotive engineering and reinforcement learning. However, the standard method suffers from two problems: 1) with cubic computational complexity in the training-set size it eventually becomes computationally infeasible to train the model, and 2) globally modeling the objective function is not necessarily optimal given the local nature of minimization. Using flexible and recursive binary partitioning of the search space, we adapt both the modeling and acquisitive aspects of standard Bayesian optimization to work harmoniously with the partitioning scheme, thereby ameliorating both standard shortcomings. We compare our method against a commonly used Bayesian optimization library on seven challenging test functions, ranging in dimensionality from $6$ to $124$, and show that our method achieves superior optimization performance in all tests. In addition our method has linear computational complexity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adapts Bayesian optimization to linear time by recursively partitioning the space and syncing both the GP surrogate and acquisition function to it, with reported wins over a standard library on tests up to 124 dimensions.

read the letter

The central claim is that recursive binary partitioning lets them drop the usual cubic cost while also fixing the global-model versus local-minimization mismatch. They do this by changing how the surrogate is built and how the acquisition is computed so both respect the partitions. That integrated change is the concrete new piece, and the empirical side shows better final values than a common BO library across seven functions from 6 to 124 dimensions plus the linear scaling they advertise. Those results are the strongest part of what is here; high-dimensional scaling is a real pain point and any method that stays competitive while dropping to linear time is worth looking at. The soft spots are mostly around verification. The abstract gives no derivation or complexity breakdown, so it is not yet clear whether the partitioning preserves the uncertainty estimates that make BO useful or whether the adapted acquisition still balances exploration properly. The test functions are standard synthetic ones, which is fine for a first check but leaves open how the method behaves on the noisy, constrained, or multi-fidelity problems that actually drive most BO use. If the full paper supplies the missing equations, a reproducible implementation, and at least a sketch of why the complexity stays linear end-to-end, those gaps close quickly. This is for readers who already work on scalable surrogates or tree-structured optimization and want to see one concrete way the pieces can be made to fit. It is worth sending to a serious referee because the problem is well-motivated, the proposed fix is specific, and the reported gains are large enough to justify checking the details rather than desk-rejecting.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a Bayesian optimization algorithm that uses flexible recursive binary partitioning of the search space to adapt both Gaussian process modeling and acquisition functions. This is claimed to resolve the cubic complexity of standard BO and the mismatch between global modeling and local minimization, yielding linear computational complexity and superior empirical performance against a standard library on seven test functions with input dimensions ranging from 6 to 124.

Significance. If the integration of partitioning with GPs and acquisition functions is shown to preserve uncertainty quantification without introducing bias, the work would be significant for scaling Bayesian optimization to larger problems and higher dimensions where standard methods become infeasible. The reported empirical results across a range of dimensions provide a starting point for assessing practical utility, though the absence of explicit complexity derivations or theoretical guarantees in the provided text limits assessment of the core contribution.

major comments (2)

[Abstract / Methods] The abstract asserts linear computational complexity and harmonious adaptation of modeling and acquisition to the partitioning scheme, but no derivation, complexity analysis, or pseudocode detailing the integration with Gaussian processes is supplied. A dedicated methods section with explicit complexity bounds (e.g., showing O(n) scaling in training-set size) is required to substantiate the central claim.
[Experiments] The empirical comparison reports superior performance on all seven test functions, but without details on the specific acquisition function adaptation, partitioning recursion depth, or how local vs. global search is balanced within partitions, it is impossible to verify that the method avoids new biases while retaining the benefits of uncertainty-aware optimization. Include the exact experimental protocol, hyperparameter settings, and any ablation studies in §4 or §5.

minor comments (2)

[Abstract] The abstract mentions 'a commonly used Bayesian optimization library' without naming it or providing version/citation; add the specific reference (e.g., GPyOpt or BoTorch) for reproducibility.
[Experiments] Test function dimensions are given as a range (6 to 124) but individual dimensions and function names are not listed; include a table in the experiments section enumerating each function, its dimension, and the number of evaluations used.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the paper accordingly to improve clarity and completeness.

read point-by-point responses

Referee: [Abstract / Methods] The abstract asserts linear computational complexity and harmonious adaptation of modeling and acquisition to the partitioning scheme, but no derivation, complexity analysis, or pseudocode detailing the integration with Gaussian processes is supplied. A dedicated methods section with explicit complexity bounds (e.g., showing O(n) scaling in training-set size) is required to substantiate the central claim.

Authors: We agree that the current version does not provide a self-contained derivation or pseudocode. In the revised manuscript we will insert a new dedicated Methods section that formally describes the recursive binary partitioning procedure, its integration with the Gaussian process posterior and the adapted acquisition function, the full algorithm in pseudocode, and a complexity analysis establishing O(n) scaling in the number of observations (training-set size). This will directly substantiate the linear-complexity claim. revision: yes
Referee: [Experiments] The empirical comparison reports superior performance on all seven test functions, but without details on the specific acquisition function adaptation, partitioning recursion depth, or how local vs. global search is balanced within partitions, it is impossible to verify that the method avoids new biases while retaining the benefits of uncertainty-aware optimization. Include the exact experimental protocol, hyperparameter settings, and any ablation studies in §4 or §5.

Authors: We acknowledge that the experimental section currently lacks sufficient implementation detail for full reproducibility and bias assessment. We will expand Sections 4 and 5 to report the precise experimental protocol, all hyperparameter values, the exact form of acquisition-function adaptation inside each partition, the recursion-depth schedule, the mechanism used to balance local versus global search, and ablation studies that isolate the contribution of the partitioning scheme. These additions will allow readers to verify that uncertainty quantification is preserved. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper describes an algorithmic adaptation of Bayesian optimization via recursive binary partitioning of the search space to achieve linear complexity and address global-vs-local modeling issues. No equations, fitted parameters, or derivation steps are presented in the abstract or claims that reduce any result to a self-referential definition, a renamed fit, or a load-bearing self-citation chain. Empirical superiority is asserted via direct comparison on seven test functions (dimensions 6-124), which constitutes independent validation rather than a constructed prediction. The central claims remain self-contained algorithmic descriptions without the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The method implicitly assumes that binary partitioning preserves the uncertainty quantification properties of Gaussian processes and that local modeling within partitions remains beneficial for global minimization.

pith-pipeline@v0.9.0 · 5477 in / 1287 out tokens · 33961 ms · 2026-05-09T20:16:11.883540+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

[1]

Jiang, Samuel Daulton, Benjamin Letham, Andrew Gordon Wilson, and Ey- tan Bakshy

Maximilian Balandat, Brian Karrer, Daniel R. Jiang, Samuel Daulton, Benjamin Letham, Andrew Gordon Wilson, and Ey- tan Bakshy. BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. InAdvances in Neural Information Processing Systems 33, 2020. 7

work page 2020
[2]

R package version 1.0.4

Claus Bendtsen.pso: Particle Swarm Optimization, 2022. R package version 1.0.4. 7, 11

work page 2022
[3]

Rob Carnell.lhs: Latin Hypercube Samples, 2024. 6, 8

work page 2024
[4]

Standard particle swarm optimisation.HAL preprint hal-00764996, 2012

Maurice Clerc. Standard particle swarm optimisation.HAL preprint hal-00764996, 2012. 11

work page 2012
[5]

Scalable global optimization via local Bayesian optimization

David Eriksson, Michael Pearce, Jacob Gardner, Ryan D Turner, and Matthias Poloczek. Scalable global optimization via local Bayesian optimization. InAdvances in Neural Infor- mation Processing Systems, 2019. 9, 10

work page 2019
[6]

CambridgeUniversity Press, 2023

RomanGarnett.BayesianOptimization. CambridgeUniversity Press, 2023. 1, 2, 3

work page 2023
[7]

Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Karro, and D. Sculley. Google Vizier: A service for black-box optimization. InProceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017. 1

work page 2017
[8]

Scalable Bayesian opti- mization using Vecchia approximations of Gaussian processes

Felix Jimenez and Matthias Katzfuss. Scalable Bayesian opti- mization using Vecchia approximations of Gaussian processes. InProceedings of The 26th International Conference on Artifi- cial Intelligence and Statistics, 2023. 9

work page 2023
[9]

Mopta 2008 benchmark, 2008

Donald R Jones. Mopta 2008 benchmark, 2008. 7

work page 2008
[10]

Benchmark problems for constrained global optimizationwithhigh-dimensionalblack-boxfunctions, 2025

Donald R Jones. Benchmark problems for constrained global optimizationwithhigh-dimensionalblack-boxfunctions, 2025. 7, 14

work page 2025
[11]

Efficient global optimization of expensive black-box functions

Donald R Jones, Matthias Schonlau, and William J Welch. Efficient global optimization of expensive black-box functions. Journal of Global optimization, 1998. 3

work page 1998
[12]

Welch.EGO: Efficient Global Optimization, 2023

Xiaomeng Ju and William J. Welch.EGO: Efficient Global Optimization, 2023. 8

work page 2023
[13]

Rousseeuw.Finding Groups in Data: An Introduction to Cluster Analysis

Leonard Kaufman and Peter J. Rousseeuw.Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Inc., 1990. 5, 9, 13 11

work page 1990
[14]

Navigating in high-dimensional search space: A hierarchical Bayesian opti- mization approach.arXiv preprint arXiv:2410.23148, 2024

Wenxuan Li, Taiyi Wang, and Eiko Yoneki. Navigating in high-dimensional search space: A hierarchical Bayesian opti- mization approach.arXiv preprint arXiv:2410.23148, 2024. 9, 10

work page arXiv 2024
[15]

When Gaussian process meets big data: A review of scalable gps.IEEE transactions on neural networks and learning sys- tems, 2020

Haitao Liu, Yew-Soon Ong, Xiaobo Shen, and Jianfei Cai. When Gaussian process meets big data: A review of scalable gps.IEEE transactions on neural networks and learning sys- tems, 2020. 9

work page 2020
[16]

Choos- ing the sample size of a computer experiment: A practical guide.Technometrics, 2009

Jason L Loeppky, Jerome Sacks, and William J Welch. Choos- ing the sample size of a computer experiment: A practical guide.Technometrics, 2009. 8

work page 2009
[17]

Geneticoptimiza- tion using derivatives: The rgenoud package for R.Journal of Statistical Software, 2011

WalterR.Mebane,Jr.andJasjeetS.Sekhon. Geneticoptimiza- tion using derivatives: The rgenoud package for R.Journal of Statistical Software, 2011. 7, 10

work page 2011
[18]

Foundations of Machine Learning

Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. MIT Press, 2 edition, 2018. 7

work page 2018
[19]

Number153inAppliedMathematical Sciences

Stanley Osher and Ronald Fedkiw.Level Set Methods and Dy- namicImplicitSurfaces. Number153inAppliedMathematical Sciences. Springer, 2003. 10

work page 2003
[20]

DiceOptim: Kriging-Based Optimization for Computer Ex- periments, 2025

Victor Picheny, David Ginsbourger, and Olivier Roustant. DiceOptim: Kriging-Based Optimization for Computer Ex- periments, 2025. 1, 3, 7, 10

work page 2025
[21]

R Foundation for Statistical Computing, 2024

R Core Team.R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, 2024. 3, 7, 8

work page 2024
[22]

DiceK- riging, DiceOptim: Two R packages for the analysis of com- puter experiments by kriging-based metamodeling and opti- mization.Journal of Statistical Software, 2012

OlivierRoustant,DavidGinsbourger,andYvesDeville. DiceK- riging, DiceOptim: Two R packages for the analysis of com- puter experiments by kriging-based metamodeling and opti- mization.Journal of Statistical Software, 2012. 3, 10

work page 2012
[23]

Virtual library of sim- ulation experiments: Test functions and datasets, 2013

Sonja Surjanovic and Derek Bingham. Virtual library of sim- ulation experiments: Test functions and datasets, 2013. 7, 14

work page 2013
[24]

The MathWorks Inc.MATLAB, 2024. 8

work page 2024
[25]

Blaschko

Sinnu Susan Thomas, Jacopo Palandri, Mohsen Lakehal- Ayat, Punarjay Chakravarty, Friedrich Wolf-Monheim, and Matthew B. Blaschko. Kinematics design of a MacPherson suspensionarchitecturebasedonBayesianoptimization.IEEE Transactions on Cybernetics, 2023. 1

work page 2023
[26]

Bayesian op- timization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimiza- tion challenge 2020

Ryan Turner, David Eriksson, Michael McCourt, Juha Kiili, Eero Laaksonen, Zhen Xu, and Isabelle Guyon. Bayesian op- timization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimiza- tion challenge 2020. InProceedings of the NeurIPS 2020 Competition and Demonstration Track, 2021. 1

work page 2020
[27]

Learning search space partition for black-box optimization using Monte Carlo tree search.Advances in Neural Information Processing Systems, 2020

LinnanWang,RodrigoFonseca,andYuandongTian. Learning search space partition for black-box optimization using Monte Carlo tree search.Advances in Neural Information Processing Systems, 2020. 9, 10

work page 2020
[28]

Bayesianoptimizationinabilliondimen- sionsviarandomembeddings.JournalofArtificialIntelligence Research, 2016

Ziyu Wang, Frank Hutter, Masrour Zoghi, David Matheson, andNandoDeFeitas. Bayesianoptimizationinabilliondimen- sionsviarandomembeddings.JournalofArtificialIntelligence Research, 2016. 9

work page 2016
[29]

ScalableBayesianoptimizationviafocalizedsparse Gaussian processes

Yunyue Wei, Vincent Zhuang, Saraswati Soedarmadji, and YananSui. ScalableBayesianoptimizationviafocalizedsparse Gaussian processes. InAdvances in Neural Information Pro- cessing Systems, 2024. 9

work page 2024
[30]

Using trajec- tory data to improve Bayesian optimization for reinforcement learning.Journal of Machine Learning Research, 2014

Aaron Wilson, Alan Fern, and Prasad Tadepalli. Using trajec- tory data to improve Bayesian optimization for reinforcement learning.Journal of Machine Learning Research, 2014. 1

work page 2014
[31]

Are ran- domdecompositionsallweneedinhighdimensionalBayesian optimisation? InInternational Conference on Machine Learn- ing, 2023

Juliusz Krzysztof Ziomek and Haitham Bou Ammar. Are ran- domdecompositionsallweneedinhighdimensionalBayesian optimisation? InInternational Conference on Machine Learn- ing, 2023. 9 Jesse Schneiderreceived the B.Sc. in Data Science and Business Analytics from the University of London in 2021, and the M.Sc. in Statistics from Simon Fraser University in 2023...

work page 2023

[1] [1]

Jiang, Samuel Daulton, Benjamin Letham, Andrew Gordon Wilson, and Ey- tan Bakshy

Maximilian Balandat, Brian Karrer, Daniel R. Jiang, Samuel Daulton, Benjamin Letham, Andrew Gordon Wilson, and Ey- tan Bakshy. BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization. InAdvances in Neural Information Processing Systems 33, 2020. 7

work page 2020

[2] [2]

R package version 1.0.4

Claus Bendtsen.pso: Particle Swarm Optimization, 2022. R package version 1.0.4. 7, 11

work page 2022

[3] [3]

Rob Carnell.lhs: Latin Hypercube Samples, 2024. 6, 8

work page 2024

[4] [4]

Standard particle swarm optimisation.HAL preprint hal-00764996, 2012

Maurice Clerc. Standard particle swarm optimisation.HAL preprint hal-00764996, 2012. 11

work page 2012

[5] [5]

Scalable global optimization via local Bayesian optimization

David Eriksson, Michael Pearce, Jacob Gardner, Ryan D Turner, and Matthias Poloczek. Scalable global optimization via local Bayesian optimization. InAdvances in Neural Infor- mation Processing Systems, 2019. 9, 10

work page 2019

[6] [6]

CambridgeUniversity Press, 2023

RomanGarnett.BayesianOptimization. CambridgeUniversity Press, 2023. 1, 2, 3

work page 2023

[7] [7]

Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Karro, and D. Sculley. Google Vizier: A service for black-box optimization. InProceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017. 1

work page 2017

[8] [8]

Scalable Bayesian opti- mization using Vecchia approximations of Gaussian processes

Felix Jimenez and Matthias Katzfuss. Scalable Bayesian opti- mization using Vecchia approximations of Gaussian processes. InProceedings of The 26th International Conference on Artifi- cial Intelligence and Statistics, 2023. 9

work page 2023

[9] [9]

Mopta 2008 benchmark, 2008

Donald R Jones. Mopta 2008 benchmark, 2008. 7

work page 2008

[10] [10]

Benchmark problems for constrained global optimizationwithhigh-dimensionalblack-boxfunctions, 2025

Donald R Jones. Benchmark problems for constrained global optimizationwithhigh-dimensionalblack-boxfunctions, 2025. 7, 14

work page 2025

[11] [11]

Efficient global optimization of expensive black-box functions

Donald R Jones, Matthias Schonlau, and William J Welch. Efficient global optimization of expensive black-box functions. Journal of Global optimization, 1998. 3

work page 1998

[12] [12]

Welch.EGO: Efficient Global Optimization, 2023

Xiaomeng Ju and William J. Welch.EGO: Efficient Global Optimization, 2023. 8

work page 2023

[13] [13]

Rousseeuw.Finding Groups in Data: An Introduction to Cluster Analysis

Leonard Kaufman and Peter J. Rousseeuw.Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Inc., 1990. 5, 9, 13 11

work page 1990

[14] [14]

Navigating in high-dimensional search space: A hierarchical Bayesian opti- mization approach.arXiv preprint arXiv:2410.23148, 2024

Wenxuan Li, Taiyi Wang, and Eiko Yoneki. Navigating in high-dimensional search space: A hierarchical Bayesian opti- mization approach.arXiv preprint arXiv:2410.23148, 2024. 9, 10

work page arXiv 2024

[15] [15]

When Gaussian process meets big data: A review of scalable gps.IEEE transactions on neural networks and learning sys- tems, 2020

Haitao Liu, Yew-Soon Ong, Xiaobo Shen, and Jianfei Cai. When Gaussian process meets big data: A review of scalable gps.IEEE transactions on neural networks and learning sys- tems, 2020. 9

work page 2020

[16] [16]

Choos- ing the sample size of a computer experiment: A practical guide.Technometrics, 2009

Jason L Loeppky, Jerome Sacks, and William J Welch. Choos- ing the sample size of a computer experiment: A practical guide.Technometrics, 2009. 8

work page 2009

[17] [17]

Geneticoptimiza- tion using derivatives: The rgenoud package for R.Journal of Statistical Software, 2011

WalterR.Mebane,Jr.andJasjeetS.Sekhon. Geneticoptimiza- tion using derivatives: The rgenoud package for R.Journal of Statistical Software, 2011. 7, 10

work page 2011

[18] [18]

Foundations of Machine Learning

Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. MIT Press, 2 edition, 2018. 7

work page 2018

[19] [19]

Number153inAppliedMathematical Sciences

Stanley Osher and Ronald Fedkiw.Level Set Methods and Dy- namicImplicitSurfaces. Number153inAppliedMathematical Sciences. Springer, 2003. 10

work page 2003

[20] [20]

DiceOptim: Kriging-Based Optimization for Computer Ex- periments, 2025

Victor Picheny, David Ginsbourger, and Olivier Roustant. DiceOptim: Kriging-Based Optimization for Computer Ex- periments, 2025. 1, 3, 7, 10

work page 2025

[21] [21]

R Foundation for Statistical Computing, 2024

R Core Team.R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, 2024. 3, 7, 8

work page 2024

[22] [22]

DiceK- riging, DiceOptim: Two R packages for the analysis of com- puter experiments by kriging-based metamodeling and opti- mization.Journal of Statistical Software, 2012

OlivierRoustant,DavidGinsbourger,andYvesDeville. DiceK- riging, DiceOptim: Two R packages for the analysis of com- puter experiments by kriging-based metamodeling and opti- mization.Journal of Statistical Software, 2012. 3, 10

work page 2012

[23] [23]

Virtual library of sim- ulation experiments: Test functions and datasets, 2013

Sonja Surjanovic and Derek Bingham. Virtual library of sim- ulation experiments: Test functions and datasets, 2013. 7, 14

work page 2013

[24] [24]

The MathWorks Inc.MATLAB, 2024. 8

work page 2024

[25] [25]

Blaschko

Sinnu Susan Thomas, Jacopo Palandri, Mohsen Lakehal- Ayat, Punarjay Chakravarty, Friedrich Wolf-Monheim, and Matthew B. Blaschko. Kinematics design of a MacPherson suspensionarchitecturebasedonBayesianoptimization.IEEE Transactions on Cybernetics, 2023. 1

work page 2023

[26] [26]

Bayesian op- timization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimiza- tion challenge 2020

Ryan Turner, David Eriksson, Michael McCourt, Juha Kiili, Eero Laaksonen, Zhen Xu, and Isabelle Guyon. Bayesian op- timization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimiza- tion challenge 2020. InProceedings of the NeurIPS 2020 Competition and Demonstration Track, 2021. 1

work page 2020

[27] [27]

Learning search space partition for black-box optimization using Monte Carlo tree search.Advances in Neural Information Processing Systems, 2020

LinnanWang,RodrigoFonseca,andYuandongTian. Learning search space partition for black-box optimization using Monte Carlo tree search.Advances in Neural Information Processing Systems, 2020. 9, 10

work page 2020

[28] [28]

Bayesianoptimizationinabilliondimen- sionsviarandomembeddings.JournalofArtificialIntelligence Research, 2016

Ziyu Wang, Frank Hutter, Masrour Zoghi, David Matheson, andNandoDeFeitas. Bayesianoptimizationinabilliondimen- sionsviarandomembeddings.JournalofArtificialIntelligence Research, 2016. 9

work page 2016

[29] [29]

ScalableBayesianoptimizationviafocalizedsparse Gaussian processes

Yunyue Wei, Vincent Zhuang, Saraswati Soedarmadji, and YananSui. ScalableBayesianoptimizationviafocalizedsparse Gaussian processes. InAdvances in Neural Information Pro- cessing Systems, 2024. 9

work page 2024

[30] [30]

Using trajec- tory data to improve Bayesian optimization for reinforcement learning.Journal of Machine Learning Research, 2014

Aaron Wilson, Alan Fern, and Prasad Tadepalli. Using trajec- tory data to improve Bayesian optimization for reinforcement learning.Journal of Machine Learning Research, 2014. 1

work page 2014

[31] [31]

Are ran- domdecompositionsallweneedinhighdimensionalBayesian optimisation? InInternational Conference on Machine Learn- ing, 2023

Juliusz Krzysztof Ziomek and Haitham Bou Ammar. Are ran- domdecompositionsallweneedinhighdimensionalBayesian optimisation? InInternational Conference on Machine Learn- ing, 2023. 9 Jesse Schneiderreceived the B.Sc. in Data Science and Business Analytics from the University of London in 2021, and the M.Sc. in Statistics from Simon Fraser University in 2023...

work page 2023