
arxiv: 2506.09894 · v1 · submitted 2025-06-11 · ⚛️ physics.comp-ph · cond-mat.mtrl-sci

Choosing a Suitable Acquisition Function for Batch Bayesian Optimization: Comparison of Serial and Monte Carlo Approaches

Pith reviewed 2026-05-19 09:53 UTC · model grok-4.3

keywords Bayesian optimization · batch acquisition functions · black-box optimization · materials synthesis · perovskite solar cells · Monte Carlo acquisition

The pith

qUCB is the recommended default acquisition function for batch Bayesian optimization of black-box functions in up to six dimensions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares serial and Monte Carlo batch acquisition functions to decide how to select groups of points for testing when optimizing an unknown costly function. It evaluates them on the Ackley and Hartmann test problems, which stand in for difficult search landscapes in materials work, plus a model fitted to real perovskite solar cell data. The tests show that q-upper confidence bound delivers more reliable optima with fewer evaluations when nothing is known ahead of time about the shape or noise level of the target function. This choice matters because each experimental run is expensive, so a better default strategy can reduce the total number of trials needed to reach a strong result.

Core claim

Tests on the six-dimensional Ackley and Hartmann functions and on an empirical model of perovskite solar cell power conversion efficiency show that qUCB and serial UCB with local penalization (UCB/LP) both perform well in noiseless settings, while qlogEI lags. Once noise is added to the Hartmann function, the Monte Carlo methods converge faster and depend less on starting conditions than UCB/LP. The overall recommendation is therefore to use qUCB as the default when optimizing a black-box function in six or fewer dimensions without prior knowledge of the landscape.
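To make the benchmark concrete, here is a minimal sketch of the Ackley function in its common textbook form (constants a = 20, b = 0.2, c = 2π), with an optional Gaussian noise wrapper. The helper name `noisy`, the noise model, and the sign convention are assumptions for illustration; the paper may scale, negate, or bound the function differently.

```python
import math
import random


def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Textbook Ackley function; global minimum f(0, ..., 0) = 0."""
    d = len(x)
    mean_sq = sum(xi * xi for xi in x) / d
    mean_cos = sum(math.cos(c * xi) for xi in x) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e


def noisy(f, noise_sd, rng):
    """Hypothetical helper: wrap f with additive zero-mean Gaussian noise."""
    def wrapped(x):
        return f(x) + rng.gauss(0.0, noise_sd)
    return wrapped


# Six-dimensional evaluation, as in the benchmark dimensionality.
rng = random.Random(0)
noisy_ackley = noisy(ackley, noise_sd=0.1, rng=rng)
clean_value = ackley([0.5] * 6)
noisy_value = noisy_ackley([0.5] * 6)
```

The narrow basin around the origin, surrounded by a nearly flat and rippled plateau, is what makes this function a "needle-in-haystack" search.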

What carries the argument

The q-upper confidence bound (qUCB): a Monte Carlo batch acquisition function that selects parallel evaluation points by balancing predicted value against uncertainty.
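As a rough illustration of how such a batch score can be estimated, the sketch below uses a commonly cited Monte Carlo form of qUCB: draw joint posterior samples for the q candidate points, inflate each deviation from the mean by sqrt(βπ/2), take the maximum over the batch, and average over samples. Whether the paper uses exactly this form is an assumption, and the function names, the default β, and the sample count are illustrative only.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def q_ucb(mean, cov, beta=2.0, n_samples=4096, seed=0):
    """Monte Carlo estimate of E[max_j (mu_j + sqrt(beta*pi/2) * |gamma_j|)],
    where gamma ~ N(0, cov) is the joint posterior deviation of the batch."""
    rng = random.Random(seed)
    low = cholesky(cov)
    q = len(mean)
    scale = math.sqrt(beta * math.pi / 2.0)
    total = 0.0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(q)]
        # Correlated zero-mean deviation: gamma = L z
        gamma = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(q)]
        total += max(mean[i] + scale * abs(gamma[i]) for i in range(q))
    return total / n_samples


# Two candidate batches with equal means and variances:
# nearly duplicated points versus uncorrelated points.
clustered = q_ucb([0.0, 0.0], [[1.0, 0.99], [0.99, 1.0]])
spread = q_ucb([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

For q = 1 the estimate approaches μ + sqrt(β)·σ, the ordinary UCB. For q > 1 a batch of nearly identical points scores lower than a batch of uncorrelated points with the same marginals, which is how the joint posterior discourages redundant evaluations within a batch.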

If this is right

  • qUCB reaches a confident modeled optimum while using fewer expensive samples than the alternatives in low-dimensional noisy settings.
  • All Monte Carlo approaches show faster convergence and lower sensitivity to initial conditions than serial UCB/LP when noise is present.
  • In noiseless conditions both UCB/LP and qUCB outperform qlogEI on the Ackley and Hartmann proxies.
  • The same ordering of performance holds when the acquisition functions are applied to an empirical model built from real perovskite solar cell measurements.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Experimental groups working on new materials optimization tasks could adopt qUCB as their starting acquisition function to reduce the number of synthesis or processing runs required.
  • The same comparison framework could be repeated on functions with more than six dimensions or on other classes of experimental data to test whether the preference for qUCB persists.
  • Software tools for Bayesian optimization could expose qUCB as the pre-selected default for users who supply no prior information about their objective.

Load-bearing premise

The Ackley and Hartmann functions together with the regression model from perovskite data are representative proxies for the optimization landscapes encountered in typical materials synthesis experiments.

What would settle it

A new comparison on a different test function or on fresh experimental data in which qlogEI or UCB/LP reaches a high-confidence optimum using strictly fewer evaluations than qUCB would falsify the default recommendation.

Original abstract

Batch Bayesian optimization is widely used for optimizing expensive experimental processes when several samples can be tested together to save time or cost. A central decision in designing a Bayesian optimization campaign to guide experiments is the choice of a batch acquisition function when little or nothing is known about the landscape of the "black box" function to be optimized. To inform this decision, we first compare the performance of serial and Monte Carlo batch acquisition functions on two mathematical functions that serve as proxies for typical materials synthesis and processing experiments. The two functions, both in six dimensions, are the Ackley function, which epitomizes a "needle-in-haystack" search, and the Hartmann function, which exemplifies a "false optimum" problem. Our study evaluates the serial upper confidence bound with local penalization (UCB/LP) batch acquisition policy against Monte Carlo-based parallel approaches: q-log expected improvement (qlogEI) and q-upper confidence bound (qUCB), where q is the batch size. Tests on Ackley and Hartmann show that UCB/LP and qUCB perform well in noiseless conditions, both outperforming qlogEI. For the Hartmann function with noise, all Monte Carlo functions achieve faster convergence with less sensitivity to initial conditions compared to UCB/LP. We then confirm the findings on an empirical regression model built from experimental data in maximizing power conversion efficiency of flexible perovskite solar cells. Our results suggest that when empirically optimizing a "black-box" function in less than or equal to six dimensions with no prior knowledge of the landscape or noise characteristics, qUCB is best suited as the default to maximize confidence in the modeled optimum while minimizing the number of expensive samples needed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper compares serial UCB with local penalization (UCB/LP) against Monte Carlo batch acquisition functions q-log expected improvement (qlogEI) and q-upper confidence bound (qUCB) for batch Bayesian optimization. Tests are performed on the 6D Ackley (needle-in-haystack) and Hartmann (false-optimum) functions in both noiseless and noisy regimes, plus an empirical regression model derived from perovskite solar cell experimental data for maximizing power conversion efficiency. The central claim is that qUCB is the best default choice for empirically optimizing unknown black-box functions in ≤6 dimensions with no prior knowledge of the landscape or noise, as it maximizes confidence in the modeled optimum while minimizing expensive samples.

Significance. If the performance ordering generalizes beyond the tested cases, the work offers practical guidance for selecting batch acquisition functions in experimental optimization campaigns, particularly in materials science and chemistry where batch evaluations are common. The consistent trends across noiseless/noisy regimes and mathematical/empirical proxies are a strength, though the limited benchmark set constrains the scope of the recommendation.

major comments (1)
  1. [Abstract and results on test functions] The recommendation that qUCB should serve as the default acquisition function for arbitrary unknown 6D black-box functions rests on performance ordering observed only on Ackley, Hartmann, and a single fitted perovskite regression model. No additional test functions, variation in noise structure or modality count, or sensitivity analysis to the regression fitting procedure are reported, so an artifact in any of these three surfaces could invert the ranking. This directly affects the load-bearing generalization in the abstract and conclusion.
minor comments (2)
  1. [Abstract] The abstract states performance trends but provides no error bars, statistical significance tests, or full implementation details (e.g., number of runs, hyperparameter settings for the Gaussian process).
  2. [Empirical results section] Clarify whether the reported trends on the perovskite model include variation across different random seeds or initial designs, as sensitivity to initial conditions is mentioned for the Hartmann function.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive comments on the scope of our benchmarks and the strength of the generalization in the abstract and conclusion. We address the major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract and results on test functions] The recommendation that qUCB should serve as the default acquisition function for arbitrary unknown 6D black-box functions rests on performance ordering observed only on Ackley, Hartmann, and a single fitted perovskite regression model. No additional test functions, variation in noise structure or modality count, or sensitivity analysis to the regression fitting procedure are reported, so an artifact in any of these three surfaces could invert the ranking. This directly affects the load-bearing generalization in the abstract and conclusion.

    Authors: We selected the Ackley and Hartmann functions specifically because they represent two distinct and common challenges in materials optimization landscapes: Ackley as a high-dimensional needle-in-haystack problem with many local minima, and Hartmann as an example with deceptive false optima. The perovskite regression model was included to provide validation on a surface derived from real experimental data rather than purely synthetic functions. We agree, however, that the current set is limited and that an artifact in any of these surfaces could affect the observed ranking. In the revised manuscript we will qualify the language in the abstract and conclusion to state that qUCB is recommended as a default choice based on consistent performance across these representative 6D cases in both noiseless and noisy regimes, rather than claiming it for arbitrary unknown black-box functions. We will also add an explicit limitations paragraph discussing the restricted benchmark set, the lack of systematic variation in noise structure or modality, and the absence of sensitivity analysis on the regression fitting procedure, while suggesting these as directions for future work.
    revision: yes

Circularity Check

0 steps flagged

No circularity: empirical comparisons on external benchmarks and data-driven model

Full rationale

The paper conducts direct numerical evaluations of acquisition functions (UCB/LP, qlogEI, qUCB) on the Ackley and Hartmann functions plus a regression model fitted to perovskite solar cell data. These test surfaces and the empirical dataset are independent of the performance conclusions drawn; the ordering of methods is measured by observed convergence rates rather than by any equation that reduces to a fitted parameter, self-definition, or self-citation chain. No derivations, uniqueness theorems, or ansatzes are invoked that would make the recommendation tautological with the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on standard Bayesian optimization assumptions and uses established test functions as proxies without introducing new free parameters, axioms beyond domain conventions, or invented entities.

axioms (1)
  • domain assumption: Gaussian process surrogate models and standard acquisition function definitions remain valid for the chosen test functions and empirical regression model.
    Implicit in the use of UCB, EI, and their batch variants throughout the comparisons.
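
For reference, the standard single-point definitions this axiom refers to can be written as below for a maximization problem. The notation is an assumption introduced here, not taken from the paper: μ(x) and σ(x) are the Gaussian process posterior mean and standard deviation, f* is the best value observed so far, β is the exploration weight, and Φ and φ are the standard normal CDF and PDF.

```latex
\begin{align}
\mathrm{UCB}(x) &= \mu(x) + \sqrt{\beta}\,\sigma(x), \\
\mathrm{EI}(x) &= \mathbb{E}\big[\max\{f(x) - f^{*},\, 0\}\big]
  = \sigma(x)\,\big[z\,\Phi(z) + \varphi(z)\big],
  \qquad z = \frac{\mu(x) - f^{*}}{\sigma(x)}.
\end{align}
```

Both expressions depend on the objective only through the posterior μ(x) and σ(x), so they are meaningful only to the extent that the Gaussian process surrogate is a reasonable model of the surface being optimized.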

pith-pipeline@v0.9.0 · 5858 in / 1316 out tokens · 32554 ms · 2026-05-19T09:53:17.375182+00:00 · methodology


