arxiv: 2304.11468 · v1 · pith:PCU62FAUnew · submitted 2023-04-22 · 💻 cs.LG · stat.ML

Increasing the Scope as You Learn: Adaptive Bayesian Optimization in Nested Subspaces

Pith reviewed 2026-05-24 09:20 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords Bayesian optimization · high-dimensional optimization · nested subspaces · adaptive optimization · black-box functions · theoretical guarantees · machine learning

The pith

BAxUS adapts its search space through nested random subspaces, giving high-dimensional Bayesian optimization theoretical guarantees against failure.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents BAxUS, a Bayesian optimization method that uses a family of nested random subspaces to start in a low-dimensional space and expand it as information about the black-box function accumulates. It targets two weaknesses of existing high-dimensional BO methods: performance that degrades as the number of dimensions grows, and outright failure when unverifiable assumptions about the objective do not hold. The paper states that adapting the space in this way keeps performance high in application areas such as the life sciences, neural architecture search, and robotics, and that theoretical guarantees remove the risk of missing the relevant directions. The reported evaluation shows BAxUS outperforming state-of-the-art approaches on a broad set of problems.

Core claim

BAxUS leverages a novel family of nested random subspaces to adapt the space it optimizes over to the problem. This ensures high performance while removing the risk of failure, which we assert via theoretical guarantees. A comprehensive evaluation demonstrates that BAxUS achieves better results than the state-of-the-art methods for a broad set of applications.

What carries the argument

A novel family of nested random subspaces that the method uses to adaptively expand the search space as it learns.
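
As a rough illustration of what "nested" can mean here, the sketch below builds a sparse random embedding in which each input dimension reads from exactly one target dimension with a random sign, then splits each target dimension into several new ones while copying the stored coordinates. Every earlier observation then maps to the same input-space point, which is the property stated in the Figure 1 caption. The function names (`make_embedding`, `project_up`, `split_bins`), the balanced assignment, and the way bins are divided are assumptions made for this example, not the authors' implementation; the parameter `b` borrows the "number of new bins per latent dimension" wording from the algorithm excerpt in the figure list.

```python
import random


def make_embedding(D, d, rng):
    """Assign each of D input dims to one of d target dims, with a random sign.

    Hypothetical helper: bins are balanced (sizes differ by at most one).
    """
    order = list(range(D))
    rng.shuffle(order)
    target_of = [0] * D
    for pos, i in enumerate(order):
        target_of[i] = pos % d
    sign = [rng.choice((-1, 1)) for _ in range(D)]
    return target_of, sign


def project_up(y, target_of, sign):
    """Map a target-space point y to the D-dimensional input space."""
    return [s * y[t] for t, s in zip(target_of, sign)]


def split_bins(target_of, Y, d, b, rng):
    """Split each target dim into up to b + 1 bins, copying observed coordinates."""
    new_target_of = list(target_of)
    new_Y = [list(row) for row in Y]
    next_dim = d
    for j in range(d):
        members = [i for i, t in enumerate(target_of) if t == j]
        rng.shuffle(members)
        n_groups = min(b + 1, len(members))
        for g in range(1, n_groups):
            # Inputs in group g move to a fresh target dim; group 0 stays in j.
            for i in members[g::n_groups]:
                new_target_of[i] = next_dim
            # The fresh dim inherits column j, so old points project identically.
            for old_row, new_row in zip(Y, new_Y):
                new_row.append(old_row[j])
            next_dim += 1
    return new_target_of, new_Y, next_dim


rng = random.Random(0)
D, d = 12, 2
target_of, sign = make_embedding(D, d, rng)
# Three stored observations in the initial 2-dimensional target space.
Y = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(3)]
new_target_of, new_Y, new_d = split_bins(target_of, Y, d, b=2, rng=rng)

before = [project_up(y, target_of, sign) for y in Y]
after = [project_up(y, new_target_of, sign) for y in new_Y]
print(new_d, before == after)
```

With these settings each of the two initial bins holds six input dimensions and is split three ways, so the target dimensionality grows from 2 to 6 while the projected observations stay identical. The old subspace is contained in the new one because setting each copied coordinate equal to its parent recovers every point that was reachable before the split.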

If this is right

  • Outperforms state-of-the-art high-dimensional BO methods on various applications
  • Provides theoretical guarantees that eliminate the risk of failure present in prior methods
  • Maintains performance as the number of dimensions increases
  • Applicable to expensive black-box functions in life sciences, neural architecture search, and robotics

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The adaptive subspace approach could be extended to other sequential decision-making problems beyond optimization.
  • If the nested structure preserves optimality conditions, similar techniques might improve scalability in related fields like reinforcement learning.
  • Testing on functions with known low effective dimensionality could validate the adaptation speed.
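
One way to set up the kind of test suggested in the last bullet is to hide a low-dimensional function inside a high-dimensional box; the Figure 3 caption mentions a 500D BRANIN benchmark of this flavor. The sketch below is only an assumed construction: the choice of active coordinates, the `[-1, 1]` input box, and the name `embedded_branin` are illustrative and may differ from the benchmark used in the paper.

```python
import math
import random


def branin(u1, u2):
    """Standard 2D Branin function on u1 in [-5, 10], u2 in [0, 15]."""
    a = 1.0
    b = 5.1 / (4.0 * math.pi ** 2)
    c = 5.0 / math.pi
    r = 6.0
    s = 10.0
    t = 1.0 / (8.0 * math.pi)
    return a * (u2 - b * u1 ** 2 + c * u1 - r) ** 2 + s * (1.0 - t) * math.cos(u1) + s


def embedded_branin(x, active=(0, 1)):
    """Evaluate Branin on two active coordinates of x in [-1, 1]^D.

    All other coordinates are ignored, so the effective dimensionality is 2.
    The active indices are an assumption for this example.
    """
    u1 = 7.5 * x[active[0]] + 2.5   # rescale [-1, 1] -> [-5, 10]
    u2 = 7.5 * x[active[1]] + 7.5   # rescale [-1, 1] -> [0, 15]
    return branin(u1, u2)


D = 500
rng = random.Random(1)

# Place one known Branin minimizer, (pi, 2.275), on the active coordinates.
x_opt = [rng.uniform(-1, 1) for _ in range(D)]
x_opt[0] = (math.pi - 2.5) / 7.5
x_opt[1] = (2.275 - 7.5) / 7.5

# Perturbing only inactive coordinates must leave the value unchanged.
x_other = list(x_opt)
for i in range(2, D):
    x_other[i] = rng.uniform(-1, 1)

print(embedded_branin(x_opt), embedded_branin(x_other))
```

Because the location of the active coordinates is known to the experimenter but not to the optimizer, one can record at which target dimensionality the optimizer first reaches values near the known minimum, which gives a direct reading of how quickly the adaptation finds the relevant directions.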

Load-bearing premise

The family of nested random subspaces can be constructed and adapted to preserve both performance and theoretical guarantees without adding unverifiable assumptions about the objective function.

What would settle it

Finding a high-dimensional objective function where BAxUS fails to locate the optimum despite the subspaces being nested and random, while other methods succeed.

Figures

Figures reproduced from arXiv: 2304.11468 by Leonard Papenmeier, Luigi Nardi, Matthias Poloczek.

Figure 1. Observations are kept when increasing the target dimensionality. We give an example …
Figure 2. The worst-case guarantees for the success probabilities …
Figure 3. Top row: 124D MOPTA08 (l): BAXUS obtains the best solutions, followed by TURBO and CMA-ES. 388D SVM (c): BAXUS outperforms the other methods from the start. 500D BRANIN (r): SAASBO, BAXUS, ALEBO, and HESBO find an optimum; SAASBO and ALEBO converge fastest. Bottom row: 100D LASSO-HARD (l) and 300D LASSO-HIGH (c): BAXUS outperforms the baselines. SAASBO, ALEBO, and HESBO struggle. 500D HARTMANN6 (r): SAASBO…
Figure 4. BAXUS outperforms the SOTA and in particular proves to be robust to observational noise on the 1000D LASSO-HARD (l) and the 300D LASSO-HIGH (r). Note that CMA-ES performs considerably worse than on the noise-free versions of the benchmarks.
Figure 5. Left: the BAXUS embedding gives better optimization performance on the shifted ACKLEY10 function: TURBO in embedded subspaces of the BAXUS and HESBO embeddings. The BAXUS embedding has a higher probability to contain the optimum. Right: the distribution of the final incumbents (lower the better). The horizontal bars show the median. The left side of …
Figure 6. An evaluation of BAXUS and TURBO with BAXUS embeddings of different target dimensionalities on LASSO-HARD: We run TURBO with the BAXUS embedding for fixed target dimensionalities d = 2, 10, 20, …, 100 and compare to BAXUS. Summing up, we observe that BAXUS achieves a better performance than TURBO with a fixed embedding dimensionality.
Figure 7. BAXUS and baselines on LASSO-DNA. As before, BAXUS makes considerable progress in the beginning and converges faster than TURBO and CMA-ES.
Figure 8. An evaluation of BAXUS and other methods on high-dimensional test problems of MuJoCo.
… again using the BAXUS embedding. This gives a smaller (in terms of number of rows) projection matrix S̃^T, which we finally use to update S^T. Algorithm 2 (Observation-preserving embedding increase). Input: transposed embedding matrix S^T, number of new bins per latent dimension b, observed points Y ∈ [−1, 1]^(n×d). Output: up…
Original abstract

Recent advances have extended the scope of Bayesian optimization (BO) to expensive-to-evaluate black-box functions with dozens of dimensions, aspiring to unlock impactful applications, for example, in the life sciences, neural architecture search, and robotics. However, a closer examination reveals that the state-of-the-art methods for high-dimensional Bayesian optimization (HDBO) suffer from degrading performance as the number of dimensions increases or even risk failure if certain unverifiable assumptions are not met. This paper proposes BAxUS that leverages a novel family of nested random subspaces to adapt the space it optimizes over to the problem. This ensures high performance while removing the risk of failure, which we assert via theoretical guarantees. A comprehensive evaluation demonstrates that BAxUS achieves better results than the state-of-the-art methods for a broad set of applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes BAxUS, a high-dimensional Bayesian optimization method that employs a novel family of nested random subspaces to adaptively restrict the optimization space to the problem at hand. It claims that this construction yields both improved empirical performance over existing HDBO methods across a range of applications and theoretical guarantees that eliminate the risk of failure associated with unverifiable assumptions in prior work.

Significance. If the theoretical guarantees are shown to hold unconditionally for arbitrary black-box objectives and the empirical gains prove robust, the work would address a key limitation in scaling BO to dozens of dimensions without introducing new structural assumptions, with potential impact on life-sciences, NAS, and robotics applications.

major comments (2)
  1. [Abstract] Abstract: the central claim that the nested-subspace family 'removes the risk of failure' via theoretical guarantees is load-bearing, yet the abstract supplies no statement of the probability space, the precise failure event being bounded, or the conditions (e.g., effective dimension, alignment probability) under which the guarantee is derived; without these, it is impossible to determine whether the guarantee is unconditional or merely high-probability conditional on the objective satisfying an unverifiable structural property.
  2. [Abstract] Abstract / Theoretical Analysis (implied): the adaptation rule must discover alignment with relevant directions while preserving both the empirical gains and the claimed failure-resistance; the manuscript must demonstrate that this rule does not re-introduce the very unverifiable assumptions the method is advertised to avoid.
minor comments (1)
  1. [Abstract] Abstract: the evaluation is described only as 'comprehensive' with no mention of the benchmark suite, dimensionality range, number of repetitions, or statistical testing protocol.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments on the abstract and theoretical claims. We address each point below and indicate planned revisions.

  1. Referee: [Abstract] Abstract: the central claim that the nested-subspace family 'removes the risk of failure' via theoretical guarantees is load-bearing, yet the abstract supplies no statement of the probability space, the precise failure event being bounded, or the conditions (e.g., effective dimension, alignment probability) under which the guarantee is derived; without these, it is impossible to determine whether the guarantee is unconditional or merely high-probability conditional on the objective satisfying an unverifiable structural property.

    Authors: We agree the abstract should state the setting more precisely. The guarantees are high-probability statements (over the random nested subspace construction) that the relevant directions of an objective with low effective dimension are included in the active subspace; the failure event is the event that these directions are missed. The probability depends only on the sampling of the nested family and the effective dimension, not on further unverifiable properties of the objective. We will revise the abstract to include a concise statement of the probability space and failure event. revision: yes

  2. Referee: [Abstract] Abstract / Theoretical Analysis (implied): the adaptation rule must discover alignment with relevant directions while preserving both the empirical gains and the claimed failure-resistance; the manuscript must demonstrate that this rule does not re-introduce the very unverifiable assumptions the method is advertised to avoid.

    Authors: The adaptation rule expands the subspace using only observed function values and does not presuppose any fixed alignment or structural property of the objective. The theoretical analysis shows that the high-probability inclusion guarantee continues to hold under the same random-subspace measure even after adaptive expansion; no additional unverifiable assumptions on the objective are required. We will add a clarifying sentence to the abstract and introduction to make this explicit. revision: partial
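
The failure event named in the first response can be made concrete under a simplified model. Suppose each of the D input dimensions is assigned to exactly one of d target dimensions, and call the embedding successful when the d_e active input dimensions all land in distinct target dimensions. The sketch below compares two assignment schemes under that model: independent uniform assignment, and a random balanced partition into d bins of equal size D/d. Both formulas follow from counting arguments under these stated assumptions; the function names are hypothetical, and nothing here is claimed to reproduce the paper's theorem or its exact conditions.

```python
from math import comb


def p_distinct_uniform(d, d_e):
    """P(success) when each active dim picks a target dim uniformly at random."""
    p = 1.0
    for i in range(d_e):
        # The (i+1)-th active dim must avoid the i target dims already taken.
        p *= (d - i) / d
    return p


def p_distinct_balanced(D, d, d_e):
    """P(success) for a uniformly random partition of D inputs into d equal bins.

    Counts the ways to place d_e active dims in distinct bins
    (choose the bins, then one slot per bin) over all placements.
    """
    if D % d != 0:
        raise ValueError("this sketch assumes d divides D")
    beta = D // d  # inputs per bin
    return comb(d, d_e) * beta ** d_e / comb(D, d_e)


D, d_e = 120, 6
for d in (6, 12, 24, 60, 120):
    print(d, round(p_distinct_uniform(d, d_e), 4), round(p_distinct_balanced(D, d, d_e), 4))
```

Under this model the balanced scheme reaches probability 1 when d equals D, since every input dimension then has its own target dimension. This is one way to read the claim that expanding the nested family removes the risk of permanently missing a relevant direction; the probability still depends on the effective dimension d_e, which is the point the referee asks the abstract to state explicitly.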

Circularity Check

0 steps flagged

No significant circularity; derivation chain is self-contained

The paper introduces BAxUS with a family of nested random subspaces and asserts theoretical guarantees that remove failure risk. No equations or steps in the provided abstract or description reduce a claimed prediction or guarantee to a fitted input by construction, nor do they rely on load-bearing self-citations whose content is unverified within the paper. The central claims rest on the proposed adaptation mechanism and external theoretical assertions rather than renaming or self-defining the result. Empirical evaluation is presented separately from the guarantees. This is the normal case of an independent derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities; the method is described only at the level of 'novel family of nested random subspaces' without further decomposition.

pith-pipeline@v0.9.0 · 5669 in / 1037 out tokens · 18489 ms · 2026-05-24T09:20:04.341435+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Global Convergence of Sampling-Based Nonconvex Optimization through Diffusion-Style Smoothing

    cs.LG 2026-05 unverdicted novelty 6.0

    Recasts sampling-based nonconvex optimization as smoothed gradient descent to obtain non-asymptotic convergence guarantees and introduces the DIDA annealed algorithm that converges to the global optimum.
