Understanding High-Dimensional Bayesian Optimization

Leonard Papenmeier; Luigi Nardi; Matthias Poloczek

arxiv: 2502.09198 · v2 · pith:HNBUC3PNnew · submitted 2025-02-13 · 💻 cs.LG

Pith reviewed 2026-05-23 03:25 UTC · model grok-4.3

classification 💻 cs.LG

keywords high-dimensional Bayesian optimization · Gaussian processes · vanishing gradients · maximum likelihood estimation · length scales · local search · real-world applications

The pith

Vanishing gradients from Gaussian process initialization schemes cause most high-dimensional Bayesian optimization failures, while maximum likelihood estimation of length scales suffices for state-of-the-art performance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines why simple Bayesian optimization methods succeed on high-dimensional real-world tasks despite prior expectations that they would fail. Empirical tests identify vanishing gradients triggered by standard GP initialization as a central obstacle to effective search. Approaches that encourage local search outperform those focused on global exploration, and the authors show that maximum likelihood estimation of GP length scales alone reaches top performance levels. They introduce a straightforward MSR variant of MLE that exploits these observations to deliver state-of-the-art results on a broad collection of real applications.

Core claim

Our empirical analysis shows that vanishing gradients caused by Gaussian process (GP) initialization schemes play a major role in the failures of high-dimensional Bayesian optimization (HDBO) and that methods that promote local search behaviors are better suited for the task. We find that maximum likelihood estimation (MLE) of GP length scales suffices for state-of-the-art performance. Based on this, we propose a simple variant of MLE called MSR that leverages these findings to achieve state-of-the-art performance on a comprehensive set of real-world applications.

What carries the argument

Vanishing gradients induced by common Gaussian process initialization schemes, countered by maximum likelihood estimation of length scales that favors local search behavior.
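The mechanism named above can be illustrated with a small numerical sketch. It evaluates the gradient of the GP log marginal likelihood with respect to the log length scale, once for a fixed short initial length scale and once for the same value multiplied by sqrt(d). The isotropic RBF kernel, the fixed noise level, the random data, and the sqrt(d) scaling rule are all assumptions made for illustration; they are not taken from the paper's setup.

```python
import numpy as np

def mll_grad_wrt_log_lengthscale(X, y, lengthscale, noise=1e-2):
    """Gradient of the GP log marginal likelihood w.r.t. log(lengthscale).

    Assumes a zero-mean GP with an isotropic RBF kernel and fixed noise
    (illustrative choices, not the paper's configuration).
    """
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)   # clip round-off negatives
    np.fill_diagonal(sq_dists, 0.0)
    k = np.exp(-0.5 * sq_dists / lengthscale**2)
    K_inv = np.linalg.inv(k + noise * np.eye(len(X)))
    alpha = K_inv @ y
    dK = k * sq_dists / lengthscale**2     # dK / d log(lengthscale)
    # 0.5 * trace((alpha alpha^T - K^{-1}) dK); dK is symmetric
    return 0.5 * np.sum((np.outer(alpha, alpha) - K_inv) * dK)

rng = np.random.default_rng(0)
base = 0.1                                  # hypothetical short initial length scale
grads = {}
for d in (2, 10, 100):
    X = rng.uniform(size=(20, d))           # random points in the unit cube
    y = rng.standard_normal(20)             # placeholder observations
    grads[d] = {
        "fixed": mll_grad_wrt_log_lengthscale(X, y, base),
        "scaled": mll_grad_wrt_log_lengthscale(X, y, base * np.sqrt(d)),
    }
    print(d, grads[d])
```

With the fixed length scale, pairwise kernel values in 100 dimensions are so small that the gradient is numerically negligible, so a gradient-based MLE fit has nothing to follow. With the dimension-scaled value the gradient stays at a usable magnitude.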

If this is right

Maximum likelihood estimation of GP length scales alone reaches state-of-the-art results in high-dimensional settings.
Methods that promote local search outperform those that emphasize global exploration.
The MSR variant of MLE attains state-of-the-art performance on diverse real-world applications.
Targeted experiments can isolate and confirm the contribution of vanishing gradients to HDBO failures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

High-dimensional problems may reward focused local exploitation more than broad exploration strategies.
Similar initialization and length-scale adjustments could simplify other Gaussian-process-based optimizers.
MSR might be tested on synthetic high-dimensional functions with known global optima to separate local-search benefits from benchmark-specific effects.
The results suggest that many reported failures of high-dimensional BO may be fixable with standard tools rather than requiring entirely new algorithms.
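For the suggestion of testing on synthetic functions with known global optima, the following sketch gives the textbook definitions of two of the functions named in the figure captions (Griewank and Levy) in arbitrary dimension. The exact variants, domains, and shifts used in the paper are not stated here, so treat these as standard reference forms rather than the authors' benchmark code.

```python
import numpy as np

def griewank(x):
    """Textbook Griewank function; global minimum 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    return 1.0 + np.sum(x**2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i)))

def levy(x):
    """Textbook Levy function; global minimum 0 at x = 1."""
    x = np.asarray(x, dtype=float)
    w = 1.0 + (x - 1.0) / 4.0
    head = np.sin(np.pi * w[0]) ** 2
    body = np.sum((w[:-1] - 1.0) ** 2
                  * (1.0 + 10.0 * np.sin(np.pi * w[:-1] + 1.0) ** 2))
    tail = (w[-1] - 1.0) ** 2 * (1.0 + np.sin(2.0 * np.pi * w[-1]) ** 2)
    return head + body + tail

def simple_regret(best_observed, optimum=0.0):
    """Gap between the best value found and the known optimum (minimization)."""
    return best_observed - optimum

print(griewank(np.zeros(100)), levy(np.ones(100)))
```

Because the optimum is known, the simple regret of any optimizer run can be reported directly, which helps separate local-search benefits from benchmark-specific effects.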

Load-bearing premise

The performance gaps observed across the tested real-world applications arise primarily from the vanishing-gradient mechanism rather than from other unexamined elements of the experimental design or benchmark selection.

What would settle it

A controlled trial that alters only the GP initialization to eliminate vanishing gradients while holding all other factors fixed, after which the performance advantage of MLE-based local-search methods disappears.
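One way to organize such a controlled trial is to generate paired run configurations that share every setting except the initialization scheme. The sketch below is a hypothetical harness: the class name, field names, default values, and scheme labels are illustrative assumptions, and only the benchmark names and the count of 15 repetitions come from the figure captions on this page.

```python
from dataclasses import asdict, dataclass, replace
from itertools import product

@dataclass(frozen=True)
class RunConfig:
    benchmark: str
    seed: int
    init_scheme: str                              # the only factor that is toggled
    acquisition: str = "LogEI"                    # held fixed
    af_optimizer: str = "multi-start gradient"    # held fixed (hypothetical label)
    lengthscale_bounds: tuple = (1e-4, 1e3)       # held fixed (hypothetical values)
    budget: int = 1000                            # held fixed (hypothetical value)

def paired_runs(benchmarks, seeds, init_schemes):
    """Yield groups of configs that are identical except for init_scheme."""
    for benchmark, seed in product(benchmarks, seeds):
        base = RunConfig(benchmark=benchmark, seed=seed,
                         init_scheme=init_schemes[0])
        yield [replace(base, init_scheme=s) for s in init_schemes]

def differs_only_in_init(group):
    """Check that a group of configs agrees on every field but init_scheme."""
    stripped = []
    for cfg in group:
        fields = asdict(cfg)
        fields.pop("init_scheme")
        stripped.append(fields)
    return all(s == stripped[0] for s in stripped)

groups = list(paired_runs(
    benchmarks=["Mopta08", "Lasso-DNA", "Ant", "Humanoid"],
    seeds=range(15),
    init_schemes=["constant", "scaled"],
))
print(len(groups), all(differs_only_in_init(g) for g in groups))
```

Pairing by benchmark and seed means any performance gap within a group can only come from the toggled factor, which is the isolation the premise above depends on.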

Figures

Figures reproduced from arXiv: 2502.09198 by Leonard Papenmeier, Luigi Nardi, Matthias Poloczek.

**Figure 1.** Maximum MLE gradient magnitude for the first 50 gradient steps initialized with different initial length scales (y-axis) and problem dimensionalities (x-axis). With short initial length scales, the gradients vanish even for low dimensions.

**Figure 3.** Average distances between the initial and the final candidates of LogEI for various model length scales and dimensionalities without RAASP sampling. Values in the gray region are numerically zero. In high dimensions, the gradient of the AF vanishes, causing no movement of the gradient-based optimizer.

**Figure 4.** Left: Average distances between the initial and the final candidates of LogEI with RAASP sampling. The vanishing gradient issue decreases. Right: Fraction of multi-start GD candidates originating from the RAASP samples when evaluating LogEI on random samples. In high dimensions, RAASP samples are increasingly more likely to get picked, even for longer length scales.

**Figure 6.** Average length scales (y-axis) obtained by MLE (blue) and MAP (orange) for different numbers of randomly sampled observations (x-axis) for a 10- and for a 50-dimensional GP prior sample. The obtained length scales differ substantially for the higher dimensional function if few points have been observed.

**Figure 8.** BO with the ‘scaled’ initialization of MLE performs comparably to the state-of-the-art in HDBO. Adjacent paper text: "Compared to MLE, the MAP estimates vary less but exhibit significant bias."

**Figure 10.** DSP exhibits the least exploration. MLE with fixed initial length scales performs like random search on Ant and Humanoid. Adjacent paper text: "At the beginning of its execution, the BO algorithm that uses MLE with scaled initial length scales (‘MLE (scaled)’) uses longer length scales than all other methods."

**Figure 11.** Distribution of EI values for GPs in various dimensionalities. When conditioning on the same amount of data points and maintaining the length scale as the dimensionality grows, the distribution of EI values becomes more peaked. Adjacent paper text: "As discussed by (Ament et al., 2024), EI often suffers from vanishing gradients, which only worsens in high-dimensional spaces due to the plethora of flat regions."

**Figure 12.** Number of gradient updates for the AF optimization for MSR, and with and without RAASP sampling. RAASP sampling reduces the number of gradient updates.

Adjacent paper text (C. Additional Experiments, C.1. Ranking of Optimization Algorithms):

| Benchmark | MSR | DSP | Bounce | MLE (scaled) | MLE (ℓ = ln 2) |
| --- | --- | --- | --- | --- | --- |
| Mopta08 (d = 124) | 1 | 2 | 5 | 3 | 4 |
| Lasso-DNA (d = 180) | 4 | 1 | 5 | 3 | 2 |
| Ant (d = 888) | 2 | 4 | 3 | 1 | 5 |
| Humanoid (d = 6392) | 2 | 1 | - | 3 | 4 |

**Figure 13.** Mean absolute value of the gradients for the different MLE methods, including the proposed MSR. The constant length scale initialization exhibits vanishing gradients for the high-dimensional Ant and Humanoid problems.

**Figure 15.** OTSD (solid lines) and performance curves (dashed lines) of the 100-dimensional Levy function. Adjacent paper text: "Figs. 15 and 16 show the OTSD and performance plots for the 100-dimensional Levy and Griewank functions."

**Figure 14.** OTSD (solid lines) and performance curves (dashed lines) of the 100-dimensional Schwefel function. [Plot residue: panel "Griewank100"; axes: iteration, best value, OTSD; legend: CMA-ES, DSP, TuRBO.]

**Figure 16.** OTSD (solid lines) and performance curves (dashed lines) of the 100-dimensional Griewank function. [Plot residue: panels Levy, Griewank, Schwefel; axes: x1, x2.]

**Figure 17.** The two-dimensional versions of the Levy, Griewank, and Schwefel benchmark functions used above.

**Figure 18.** LogEI run on a two-dimensional GP prior sample for 100 evaluations. The right panel shows the posterior mean at the end of the optimization. For highly multimodal benchmarks, EI reverts to a local search behavior and does not obtain a global optimum (red cross).

**Figure 19.** The mean average length scales of “dominant” and “secondary” dimensions for the Mopta08 (left) and Lasso-DNA (right) benchmarks for DSP. [Plot residue: panels Mopta08, Lasso DNA; axes: Iteration, frac. params. at border.]

**Figure 20.** Fraction of dimensions set to a value at the border (0 or 1) by DSP. The shaded area shows the standard error of the mean across 15 repetitions. [Plot residue: panels Mopta08, Lasso-DNA; axes: Iteration, frac. params. at border.]

**Figure 21.** Fraction of dimensions set to a value at the border (0 or 1) by our MLE method. The shaded area shows the standard error of the mean across 15 repetitions. Adjacent paper text: "Figs. 20 and 21 further show that BO consistently evaluates a large share of the parameters at the border."
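The captions for Figures 3 and 11 refer to acquisition values and gradients that become numerically zero. A minimal sketch of why this happens for closed-form expected improvement (EI), and why working in log space helps, is given below. It uses the standard EI formula for maximization and a large-negative-z asymptotic for log EI. This is an illustrative approximation only and is not the LogEI implementation of Ament et al. or of BoTorch; the function names are hypothetical.

```python
import math

def ei_naive(mu, sigma, best):
    """Closed-form EI for maximization: sigma * (phi(z) + z * Phi(z))."""
    z = (mu - best) / sigma
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # normal pdf
    Phi = 0.5 * math.erfc(-z / math.sqrt(2.0))                # normal cdf
    return sigma * (phi + z * Phi)

def log_ei_tail(mu, sigma, best):
    """Asymptotic log EI for z << 0, using phi(z) + z * Phi(z) ~ phi(z) / z**2."""
    z = (mu - best) / sigma
    return (math.log(sigma) - 0.5 * z * z
            - 0.5 * math.log(2.0 * math.pi) - 2.0 * math.log(-z))

# Far below the incumbent, EI underflows to exactly zero in float64,
# so its gradient carries no information for the optimizer.
print(ei_naive(-40.0, 1.0, 0.0))
# The log-space value stays finite and still varies with z.
print(log_ei_tail(-40.0, 1.0, 0.0))
```

In flat regions the naive value is identical everywhere, so a gradient-based acquisition optimizer cannot move, whereas a log-space formulation retains a usable slope.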

Original abstract

Recent work reported that simple Bayesian optimization (BO) methods perform well for high-dimensional real-world tasks, seemingly contradicting prior work and tribal knowledge. This paper investigates why. We identify underlying challenges that arise in high-dimensional BO and explain why recent methods succeed. Our empirical analysis shows that vanishing gradients caused by Gaussian process (GP) initialization schemes play a major role in the failures of high-dimensional Bayesian optimization (HDBO) and that methods that promote local search behaviors are better suited for the task. We find that maximum likelihood estimation (MLE) of GP length scales suffices for state-of-the-art performance. Based on this, we propose a simple variant of MLE called MSR that leverages these findings to achieve state-of-the-art performance on a comprehensive set of real-world applications. We present targeted experiments to illustrate and confirm our findings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper traces HDBO failures to GP initialization gradients and shows MLE plus a simple MSR tweak works on their real-world benchmarks.

Two things stand out right away. The work pins the recent surprise that plain BO holds up in high dimensions on vanishing gradients during GP length-scale fitting, and it shows that standard MLE, or the authors' minor MSR adjustment, delivers competitive results without extra machinery. The targeted experiments that link initialization choices to gradient behavior and local-search performance are the clearest new piece; they give a concrete reason why some recent simple methods succeeded where older expectations predicted collapse. The authors deserve credit for testing on actual applications instead of only synthetic functions and for keeping the proposed fix minimal. The empirical focus makes the diagnosis actionable for people who just need to run the thing.

The soft spot is the causal isolation. The stress-test concern holds: performance differences could stem from unablated choices in acquisition optimization, length-scale bounds, seed handling, or benchmark selection rather than from the gradient mechanism alone. Without experiments that flip only the initialization scheme while freezing everything else, the attribution stays plausible but not fully pinned down. The real-world results are still useful even if the exact mechanism needs tighter controls.

This paper is for practitioners and researchers who implement high-dimensional BO and want straightforward GP-fitting advice. Anyone already running these methods would get immediate takeaways on what to change first. It deserves a serious referee because the question is live, the fix is cheap to try, and confirmation of or pushback on the experiments would help the field sort out the current confusion around high-dimensional performance.

Referee Report

1 major / 0 minor

Summary. The paper investigates why simple Bayesian optimization methods succeed on high-dimensional real-world tasks despite prior expectations. It identifies vanishing gradients from Gaussian process initialization schemes as a primary cause of high-dimensional BO failures, argues that methods promoting local search are better suited, shows that MLE of GP length scales suffices for strong performance, and proposes a simple MLE variant called MSR that achieves state-of-the-art results on real-world applications, supported by targeted experiments.

Significance. If the empirical findings hold after improved controls, the work offers a clear mechanistic explanation for recent HDBO observations and a practical, low-complexity method (MSR) that matches or exceeds more elaborate approaches. The emphasis on initialization effects and local-search promotion provides a useful lens for diagnosing and designing future high-dimensional optimizers. The targeted experiments are a positive step toward reproducibility in this empirical domain.

major comments (1)

[section on targeted experiments and empirical analysis] The central attribution of performance gaps to vanishing gradients from GP initialization requires explicit isolation experiments that toggle only the initialization scheme (or length-scale handling) while fixing acquisition-function optimization, length-scale constraints, random seeds, and benchmark selection. The described targeted experiments do not appear to include such controls, leaving open the possibility that observed differences arise from other unablated factors (see skeptic note on weakest assumption).

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and the opportunity to clarify our experimental design. We respond to the single major comment below.

Referee: [section on targeted experiments and empirical analysis] The central attribution of performance gaps to vanishing gradients from GP initialization requires explicit isolation experiments that toggle only the initialization scheme (or length-scale handling) while fixing acquisition-function optimization, length-scale constraints, random seeds, and benchmark selection. The described targeted experiments do not appear to include such controls, leaving open the possibility that observed differences arise from other unablated factors (see skeptic note on weakest assumption).

Authors: We appreciate the referee's emphasis on rigorous isolation. Our targeted experiments (detailed in the section on empirical analysis) were designed to vary only the GP initialization scheme and length-scale handling: we compared standard initialization (which induces vanishing gradients) against MLE-based length-scale estimation while holding fixed the acquisition-function optimizer, length-scale constraints (e.g., bounds and positivity), random seeds, and the exact set of benchmark tasks. All other algorithmic components remained identical across runs. This isolates the contribution of initialization-induced gradient issues. We will revise the manuscript to add an explicit paragraph and table footnote enumerating these fixed factors, thereby making the isolation protocol unambiguous.

revision: partial

Circularity Check

0 steps flagged

No circularity: claims rest on targeted experiments, not self-referential definitions or fitted inputs

The paper presents an empirical investigation into HDBO failures, attributing them to vanishing gradients from GP initialization via targeted experiments, and proposes MSR as a simple MLE variant. No equations define quantities in terms of themselves, no predictions are fitted inputs renamed, and no load-bearing self-citations or uniqueness theorems reduce the central claims to prior author work by construction. The analysis is driven by experimental comparisons on real-world tasks, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review yields limited visibility into parameters or assumptions; no explicit free parameters or invented entities are stated.

axioms (1)

Domain assumption: Gaussian processes provide a suitable surrogate model for the unknown objective in Bayesian optimization.
This is a standard modeling assumption invoked throughout the Bayesian optimization literature.
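For reference, the surrogate assumption can be written out with the standard zero-mean GP regression equations. The symbols below are assumptions introduced here, not notation taken from the paper: $X$ and $y$ are the $n$ observed inputs and values, $k_\ell$ is a kernel with length scale(s) $\ell$, $K_\ell = k_\ell(X, X)$, and $\sigma_\varepsilon^2$ is the observation noise variance. The last line is the log marginal likelihood that MLE of the length scales maximizes.

```latex
\begin{align}
\mu_n(x) &= k_\ell(x, X)\,\bigl(K_\ell + \sigma_\varepsilon^2 I\bigr)^{-1} y, \\
s_n^2(x) &= k_\ell(x, x) - k_\ell(x, X)\,\bigl(K_\ell + \sigma_\varepsilon^2 I\bigr)^{-1} k_\ell(X, x), \\
\log p(y \mid X, \ell) &= -\tfrac{1}{2}\, y^\top \bigl(K_\ell + \sigma_\varepsilon^2 I\bigr)^{-1} y
  - \tfrac{1}{2} \log\det\bigl(K_\ell + \sigma_\varepsilon^2 I\bigr)
  - \tfrac{n}{2} \log 2\pi .
\end{align}
```

A MAP estimate would add a log prior term $\log p(\ell)$ to the last expression. That extra term is the source of the MLE versus MAP difference described in the Figure 6 caption.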

pith-pipeline@v0.9.0 · 5662 in / 1098 out tokens · 32896 ms · 2026-05-23T03:25:53.836530+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: Our empirical analysis shows that vanishing gradients caused by Gaussian process (GP) initialization schemes play a major role in the failures of high-dimensional Bayesian optimization (HDBO) and that methods that promote local search behaviors are better suited for the task. We find that maximum likelihood estimation (MLE) of GP length scales suffices for state-of-the-art performance.

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: We propose a simple variant of MLE called MSR that leverages these findings to achieve state-of-the-art performance on a comprehensive set of real-world applications.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Active Learning for Gaussian Process Regression Under Self-Induced Boltzmann Weights
cs.LG 2026-05 unverdicted novelty 7.0

AB-SID-iVAR enables Gaussian process active learning for self-induced Boltzmann distributions by closed-form approximation of the target, with high-probability error vanishing guarantees and empirical gains on PES and...

Do We Really Need to Approach the Entire Pareto Front in Many-Objective Bayesian Optimisation?
cs.AI 2026-04 unverdicted novelty 7.0

Proposes SPMO framework with ESPI acquisition function to find one high-quality single solution in many-objective BO under limited budgets instead of approximating the entire Pareto front.

Reference graph

Works this paper leans on

57 extracted references · 57 canonical work pages · cited by 2 Pith papers · 2 internal anchors

[2] Ament, S., Daulton, S., Eriksson, D., Balandat, M., and Bakshy, E. Unexpected improvements to expected improvement for Bayesian optimization. Advances in Neural Information Processing Systems, 36, 2024.
[3] Balandat, M., Karrer, B., Jiang, D., Daulton, S., Letham, B., Wilson, A. G., and Bakshy, E. BoTorch: A framework for efficient Monte-Carlo Bayesian optimization. Advances in Neural Information Processing Systems, 33:21524–21538, 2020. URL https://github.com/pytorch/botorch/tree/v0.12.0. Last access: Jan 16, 2025.
[4] Bardou, A., Thiran, P., and Begin, T. Relaxing the additivity constraints in decentralized no-regret high-dimensional Bayesian optimization. In The Twelfth International Conference on Learning Representations, 2024.
[5] Baudiš, P. and Pošík, P. Online Black-Box Algorithm Portfolios for Continuous Optimization. In Parallel Problem Solving from Nature – PPSN XIII, pp. 40–49, Cham, 2014. Springer International Publishing.
[6] Binois, M. and Wycoff, N. A Survey on High-dimensional Gaussian Process Modeling with Application to Bayesian Optimization. ACM Trans. Evol. Learn. Optim., 2(2), Aug 2022. doi:10.1145/3545611.
[7] Bouhlel, M. A., Bartoli, N., Regis, R. G., Otsmane, A., and Morlier, J. Efficient global optimization for high-dimensional constrained problems by using the Kriging models combined with the partial least squares method. Engineering Optimization, 50(12):2038–2053, 2018.
[8] Calandra, R., Seyfarth, A., Peters, J., and Deisenroth, M. P. Bayesian optimization for learning gaits under uncertainty. Annals of Mathematics and Artificial Intelligence, 76(1):5–23, 2016.
[9] Chen, J., Zhu, G., Yuan, C., and Huang, Y. Semi-supervised Embedding Learning for High-dimensional Bayesian Optimization. arXiv preprint arXiv:2005.14601, 2020.
[10] Deshwal, A., Ament, S., Balandat, M., Bakshy, E., Doppa, J. R., and Eriksson, D. Bayesian optimization over high-dimensional combinatorial spaces via dictionary-based embeddings. In International Conference on Artificial Intelligence and Statistics, pp. 7021–7039. PMLR, 2023.
[11] Duvenaud, D. K., Nickisch, H., and Rasmussen, C. Additive Gaussian processes. Advances in Neural Information Processing Systems, 24, 2011.
[12] Eriksson, D. and Jankowiak, M. High-dimensional Bayesian optimization with sparse axis-aligned subspaces. In Uncertainty in Artificial Intelligence, pp. 493–503. PMLR, 2021.
[13] Eriksson, D., Pearce, M., Gardner, J., Turner, R. D., and Poloczek, M. Scalable global optimization via local Bayesian optimization. Advances in Neural Information Processing Systems, 32, 2019.
[14] Frazier, P. I. A tutorial on Bayesian optimization. arXiv preprint arXiv:1807.02811, 2018.
[15] Gardner, J., Guo, C., Weinberger, K., Garnett, R., and Grosse, R. Discovering and exploiting additive structure for Bayesian optimization. In International Conference on Artificial Intelligence and Statistics, pp. 1311–1319, 2017.
[16] Han, E., Arora, I., and Scarlett, J. High-dimensional Bayesian optimization via tree-structured additive models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pp. 7630–7638, 2021.
[17] Hellsten, E. O., Hvarfner, C., Papenmeier, L., and Nardi, L. High-dimensional Bayesian Optimization with Group Testing. arXiv preprint arXiv:2310.03515, 2023.
[18] Herrmann, M., Lange, F. J. D., Eggensperger, K., Casalicchio, G., Wever, M., Feurer, M., Rügamer, D., Hüllermeier, E., Boulesteix, A.-L., and Bischl, B. Position: Why We Must Rethink Empirical Research in Machine Learning. In Forty-first International Conference on Machine Learning, 2024.
[19] Hoang, T. N., Hoang, Q. M., Ouyang, R., and Low, K. H. Decentralized high-dimensional Bayesian optimization with factor graphs. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.
[20] Hvarfner, C., Hellsten, E. O., and Nardi, L. Vanilla Bayesian optimization performs great in high dimensions. In Salakhutdinov, R., Kolter, Z., Heller, K., Weller, A., Oliver, N., Scarlett, J., and Berkenkamp, F. (eds.), Proceedings of the 41st International Conference on Machine Learning, volume 235 of Proceedings of Machine Learning Research, pp. 2079...
[21] Jones, D. R. A taxonomy of global optimization methods based on response surfaces. Journal of Global Optimization, 21:345–383, 2001.
[22] Jones, D. R. Large-Scale Multi-Disciplinary Mass Optimization in the Auto Industry. In MOPTA 2008 Conference (20 August 2008), 2008.
[23] Jones, D. R., Schonlau, M., and Welch, W. J. Efficient global optimization of expensive black-box functions. Journal of Global Optimization, 13:455–492, 1998.
[24] Kandasamy, K., Schneider, J., and Póczos, B. High dimensional Bayesian optimisation and bandits via additive models. In International Conference on Machine Learning, pp. 295–304. PMLR, 2015.
[25] Karvonen, T. and Oates, C. J. Maximum likelihood estimation in Gaussian process regression is ill-posed. Journal of Machine Learning Research, 24(120):1–47, 2023.
[26] Köppen, M. The curse of dimensionality. In 5th online world conference on soft computing in industrial applications (WSC5), volume 1, pp. 4–8, 2000.
[27] Lam, R., Poloczek, M., Frazier, P., and Willcox, K. E. Advances in Bayesian optimization with applications in aerospace engineering. In 2018 AIAA Non-Deterministic Approaches Conference, pp. 1656, 2018.
[28] Letham, B., Calandra, R., Rai, A., and Bakshy, E. Re-examining linear embeddings for high-dimensional Bayesian optimization. Advances in Neural Information Processing Systems, 33:1546–1558, 2020.
[29] Lizotte, D. J., Wang, T., Bowling, M. H., Schuurmans, D., et al. Automatic Gait Optimization With Gaussian Process Regression. In IJCAI, volume 7, pp. 944–949, 2007.
[30] Lukaczyk, T. W., Constantine, P., Palacios, F., and Alonso, J. J. Active subspaces for shape optimization. In 10th AIAA multidisciplinary design optimization conference, pp. 1171, 2014.
[31] Maus, N., Jones, H. T., Moore, J., Kusner, M., Bradshaw, J., and Gardner, J. R. Local Latent Space Bayesian Optimization over Structured Inputs. In Advances in Neural Information Processing Systems, 2022.
[32] Mayr, M., Ahmad, F., Chatzilygeroudis, K. I., Nardi, L., and Krüger, V. Skill-based Multi-objective Reinforcement Learning of Industrial Robot Tasks with Planning and Knowledge Integration. CoRR, abs/2203.10033, 2022.
[33] Mockus, J. The Bayesian approach to global optimization. In System Modeling and Optimization: Proceedings of the 10th IFIP Conference New York City, USA, August 31–September 4, 1981, pp. 473–481. Springer, 2005.
[34] Moriconi, R., Deisenroth, M. P., and Sesh Kumar, K. High-dimensional Bayesian optimization using low-dimensional feature spaces. Machine Learning, 109:1925–1943, 2020.
[35] Mutny, M. and Krause, A. Efficient high dimensional Bayesian optimization with additivity and quadrature Fourier features. Advances in Neural Information Processing Systems, 31, 2018.
[36] Nayebi, A., Munteanu, A., and Poloczek, M. A framework for Bayesian optimization in embedded subspaces. In International Conference on Machine Learning, pp. 4752–4761. PMLR, 2019.
[37] Negoescu, D. M., Frazier, P. I., and Powell, W. B. The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery. INFORMS Journal on Computing, 23(3):346–363, 2011.
[38] Oh, C., Tomczak, J., Gavves, E., and Welling, M. Combinatorial Bayesian optimization using the graph cartesian product. Advances in Neural Information Processing Systems, 32, 2019.
[39] Papenmeier, L., Nardi, L., and Poloczek, M. Increasing the scope as you learn: Adaptive Bayesian optimization in nested subspaces. Advances in Neural Information Processing Systems, 35:11586–11601, 2022.
[40] Papenmeier, L., Nardi, L., and Poloczek, M. Bounce: Reliable high-dimensional Bayesian optimization for combinatorial and mixed spaces. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
[41]

Exploring Exploration in Bayesian Optimization

Papenmeier, L., Cheng, N., Becker, S., and Nardi, L. Exploring exploration in bayesian optimization. arXiv preprint arXiv:2502.08208, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[42]

and Ng, S

Pedrielli, G. and Ng, S. H. G-STAR: A new kriging-based trust region method for global optimization . In 2016 Winter Simulation Conference (WSC), pp.\ 803--814. IEEE, 2016

[43]

Rai, A., Antonova, R., Song, S., Martin, W., Geyer, H., and Atkeson, C. Bayesian optimization using domain knowledge on the ATRIAS biped. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 1771--1778. IEEE, 2018

[44]

Rashidi, B., Johnstonbaugh, K., and Gao, C. Cylindrical Thompson Sampling for High-Dimensional Bayesian Optimization. In International Conference on Artificial Intelligence and Statistics, pp. 3502--3510. PMLR, 2024

[45]

Regis, R. G. Trust regions in Kriging-based optimization with expected improvement. Engineering Optimization, 48(6):1037--1059, 2016

[46]

Regis, R. G. and Shoemaker, C. A. Combining radial basis function surrogates and dynamic coordinate search in high-dimensional expensive black-box optimization. Engineering Optimization, 45(5):529--555, 2013

[47]

Šehić, K., Gramfort, A., Salmon, J., and Nardi, L. LassoBench: A high-dimensional hyperparameter optimization benchmark suite for Lasso. In International Conference on Automated Machine Learning, pp. 2--1. PMLR, 2022

[48]

Song, L., Xue, K., Huang, X., and Qian, C. Monte Carlo tree search based variable selection for high dimensional Bayesian optimization. Advances in Neural Information Processing Systems, 35:28488--28501, 2022

[49]

Srinivas, N., Krause, A., Kakade, S., and Seeger, M. Gaussian process optimization in the bandit setting: no regret and experimental design. In Proceedings of the 27th International Conference on Machine Learning, ICML'10, pp. 1015--1022, Madison, WI, USA, 2010. Omnipress. ISBN 9781605589077

[50]

Tripp, A., Daxberger, E., and Hernández-Lobato, J. M. Sample-Efficient Optimization in the Latent Space of Deep Generative Models via Weighted Retraining. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H. (eds.), Advances in Neural Information Processing Systems (NeurIPS), volume 33, pp. 11259--11272. Curran Associates, I...

[51]

Wang, L., Fonseca, R., and Tian, Y. Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search. Advances in Neural Information Processing Systems, 33:19511--19522, 2020

[52]

Wang, Z., Hutter, F., Zoghi, M., Matheson, D., and de Freitas, N. Bayesian optimization in a billion dimensions via random embeddings. Journal of Artificial Intelligence Research, 55:361--387, 2016

[53]

Wang, Z., Gehring, C., Kohli, P., and Jegelka, S. Batched large-scale Bayesian optimization in high-dimensional spaces. In International Conference on Artificial Intelligence and Statistics, pp. 745--754. PMLR, 2018

[54]

Williams, C. K. and Rasmussen, C. E. Gaussian processes for machine learning, volume 2. MIT Press, Cambridge, MA, 2006

[55]

Wolpert, D. H. and Macready, W. G. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1):67--82, 1997

[56]

Xu, Z. and Zhe, S. Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization. arXiv preprint arXiv:2402.02746v3, 2024

[57]

Ziomek, J. K. and Ammar, H. B. Are random decompositions all we need in high dimensional Bayesian optimisation? In International Conference on Machine Learning, pp. 43347--43368. PMLR, 2023


