Multi-Variable Batch Bayesian Optimization in Materials Research: Synthetic Data Analysis of Noise Sensitivity and Problem Landscape Effects
Pith reviewed 2026-05-22 20:55 UTC · model grok-4.3
The pith
Noise degrades batch Bayesian optimization far more on needle-in-a-haystack landscapes than on smooth ones with local optima.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The effects of noise depend on the problem landscape: noise degrades the optimization results of a needle-in-a-haystack search (Ackley) dramatically more. However, with increasing noise, we observe an increasing probability of landing on the local optimum in Hartmann. Therefore, prior knowledge of the problem domain structure and noise level is essential when designing BO for materials research experiments. Synthetic data studies with known ground truth and controlled noise levels enable isolation of the impact of different batch BO components before real experiments.
What carries the argument
Batch Bayesian optimization simulations on the Ackley and Hartmann test functions with controlled additive noise, tracked via learning curves, performance metrics, and visualizations.
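For readers who want to reproduce the test landscapes, the following is a minimal sketch of the two benchmarks with additive Gaussian noise. It uses the commonly published minimization forms and constants of the Ackley and 6D Hartmann functions. The helper name `noisy_eval`, the noise level `sigma`, and the sign convention are assumptions for illustration; the paper may negate the functions to pose a maximization problem and may scale its noise differently.

```python
import math
import random

# Commonly published constants of the 6D Hartmann function (domain [0, 1]^6).
ALPHA = (1.0, 1.2, 3.0, 3.2)
A = (
    (10.0, 3.0, 17.0, 3.5, 1.7, 8.0),
    (0.05, 10.0, 17.0, 0.1, 8.0, 14.0),
    (3.0, 3.5, 1.7, 10.0, 17.0, 8.0),
    (17.0, 8.0, 0.05, 10.0, 0.1, 14.0),
)
P = (
    (0.1312, 0.1696, 0.5569, 0.0124, 0.8283, 0.5886),
    (0.2329, 0.4135, 0.8307, 0.3736, 0.1004, 0.9991),
    (0.2348, 0.1451, 0.3522, 0.2883, 0.3047, 0.6650),
    (0.4047, 0.8828, 0.8732, 0.5743, 0.1091, 0.0381),
)


def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Ackley function (minimization form); global minimum 0 at the origin."""
    d = len(x)
    mean_sq = sum(xi * xi for xi in x) / d
    mean_cos = sum(math.cos(c * xi) for xi in x) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e


def hartmann6(x):
    """6D Hartmann function (minimization form) with several local minima."""
    total = 0.0
    for alpha_i, a_row, p_row in zip(ALPHA, A, P):
        inner = sum(a_ij * (xj - p_ij) ** 2 for a_ij, xj, p_ij in zip(a_row, x, p_row))
        total += alpha_i * math.exp(-inner)
    return -total


def noisy_eval(f, x, sigma, rng):
    """Hypothetical helper: return f(x) plus zero-mean Gaussian noise with std sigma."""
    return f(x) + rng.gauss(0.0, sigma)


rng = random.Random(0)
# Commonly reported location of the Hartmann global minimum.
x_star = (0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573)
print("Ackley at origin:", ackley([0.0] * 6))
print("Hartmann at x_star:", hartmann6(x_star))
print("Noisy Hartmann at x_star:", noisy_eval(hartmann6, x_star, sigma=0.1, rng=rng))
```

Because the ground truth is known, any optimizer can be scored on the noise-free value `f(x)` at the points it recommends, even though it only ever observes `noisy_eval` outputs.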
If this is right
- Prior knowledge of problem domain structure and noise level is essential when designing BO for materials research experiments.
- Synthetic data studies enable isolation and evaluation of acquisition policy, objective metrics, and hyperparameter values before real experiments.
- The results facilitate greater utilization of BO in guiding experimental materials research with large numbers of design variables.
Where Pith is reading between the lines
- Optimization strategies for materials problems may need to be chosen differently depending on whether the response surface is expected to be needle-like or to contain multiple competing optima.
- Real materials datasets could be screened for landscape features similar to Ackley or Hartmann to decide which acquisition functions to prioritize.
- The probability shift toward local optima under noise in Hartmann-like cases suggests that repeated runs or ensemble methods might be needed to improve the chance of reaching the global optimum.
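The repeated-runs point in the list above can be made concrete with a small calculation. This is a sketch under stated assumptions, not the paper's analysis: the final values are invented placeholders, the tolerance `tol` is arbitrary, runs are treated as independent, and the functions `success_rate` and `prob_at_least_one` are hypothetical helpers. The only grounded number is the commonly reported 6D Hartmann global minimum of about -3.32237 in the minimization form.

```python
def success_rate(final_values, f_star, tol):
    """Fraction of runs whose final noise-free value is within tol of f_star."""
    hits = sum(1 for v in final_values if v <= f_star + tol)
    return hits / len(final_values)


def prob_at_least_one(p, k):
    """Chance that at least one of k independent runs succeeds, given per-run rate p."""
    return 1.0 - (1.0 - p) ** k


# Placeholder final values from 10 hypothetical runs (NOT results from the paper).
# Values near -3.20 stand in for runs that ended at a worse local optimum.
finals = [-3.32, -3.20, -3.31, -3.20, -3.32, -3.19, -3.32, -3.20, -3.30, -3.32]

p = success_rate(finals, f_star=-3.32237, tol=0.05)
print("per-run success rate:", p)
print("chance with 3 restarts:", prob_at_least_one(p, 3))
```

With synthetic benchmarks the success test can use the noise-free function value at the recommended point; in a real experiment that value is not available, so the same estimate would itself be noisy.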
Load-bearing premise
That the Ackley and Hartmann functions plus the chosen noise model are sufficiently representative of the structure and noise statistics encountered in actual multi-variable materials optimization experiments.
What would settle it
Running identical batch BO procedures on real experimental materials datasets that have measured noise levels and known ground-truth optima, then checking whether the observed convergence patterns match the synthetic Ackley and Hartmann behaviors.
Original abstract
Bayesian Optimization (BO) machine learning method is increasingly used to guide experimental optimization tasks in materials science. To emulate the large number of input variables and noise-containing results in experimental materials research, we perform batch BO simulation of six design variables with a range of noise levels. Two test cases relevant for materials science problems are examined: a needle-in-a-haystack case (Ackley function) that may be encountered in, e.g., molecule optimizations, and a smooth landscape with a local optimum in addition to the global optimum (Hartmann function) that may be encountered in, e.g., material composition optimization. We show learning curves, performance metrics, and visualization to effectively track the optimization progression and evaluate how the optimization outcomes are affected by noise, batch-picking method, choice of acquisition function, and exploration hyperparameter values. We find that the effects of noise depend on the problem landscape: noise degrades the optimization results of a needle-in-a-haystack search (Ackley) dramatically more. However, with increasing noise, we observe an increasing probability of landing on the local optimum in Hartmann. Therefore, prior knowledge of the problem domain structure and noise level is essential when designing BO for materials research experiments. Synthetic data studies, with known ground truth and controlled noise levels, enable us to isolate and evaluate the impact of different batch BO components, e.g., acquisition policy, objective metrics, and hyperparameter values, before transitioning to the inherent uncertainties of real experimental systems. The results and methodology of this study will facilitate a greater utilization of BO in guiding experimental materials research, specifically in settings with a large number of design variables to optimize.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a synthetic simulation study of batch Bayesian optimization (BO) applied to 6-dimensional problems with additive noise, using the Ackley function as a needle-in-a-haystack proxy (e.g., molecule optimization) and the Hartmann function as a multi-modal proxy (e.g., material composition optimization). The authors track learning curves and performance metrics across noise levels, batch-picking methods, acquisition functions, and exploration hyperparameters, concluding that noise effects are landscape-dependent—dramatically degrading Ackley performance while increasing local-optimum trapping probability in Hartmann—and that prior knowledge of domain structure and noise is essential for BO in materials research. Synthetic data with known ground truth is used to isolate component impacts before real experiments.
Significance. If the reported differential noise sensitivity holds, the work usefully illustrates how problem landscape modulates BO robustness in high-variable-count settings common to materials science. The controlled synthetic design with known ground truth is a clear strength, permitting isolation of noise, batch policy, and hyperparameter effects without confounding experimental uncertainties. This could help practitioners anticipate failure modes when transitioning BO to noisy, multi-variable experiments. Significance is reduced by the unverified assumption that Ackley/Hartmann plus the chosen additive noise model adequately capture the modality, smoothness, and noise correlation structures of actual materials tasks.
major comments (2)
- [Abstract and results sections] The headline claim that noise impact is landscape-dependent (dramatic degradation on Ackley, rising local-optimum probability on Hartmann) and the recommendation that 'prior knowledge of the problem domain structure and noise level is essential' rest on only two 6-D benchmark functions with a simple additive noise model. The manuscript should add explicit discussion or supplementary experiments mapping the modality, constraint structure, and noise statistics of these functions to representative materials problems (e.g., composition optimization or molecular design) to support generalizability.
- [Methods or experimental setup] Without reported details on the number of independent runs, statistical tests for performance differences, or data-exclusion rules, it is difficult to assess whether the observed trends (e.g., increasing local-optimum trapping with noise in Hartmann) are robust or sensitive to post-hoc choices. Adding error bars or confidence intervals on learning curves and metrics would strengthen the evidence.
minor comments (2)
- [Figures] Figure captions and axis labels: ensure all learning-curve plots include run-to-run variability (e.g., shaded regions or error bars) and clearly label batch size and acquisition-function variants.
- [Notation] Notation: define the exact form of the additive noise model and the batch selection criterion (e.g., q-EI or other) at first use to improve readability for readers outside the immediate BO community.
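To illustrate what such definitions could look like, here is a sketch of the textbook forms. These are assumptions, not the manuscript's stated notation: the symbols $\sigma$, $\beta$, $\mu$, $s$, and $f^{+}$ are introduced here for illustration, and the paper's own noise scaling and batch criterion may differ.

```latex
% Additive Gaussian observation noise on the objective f at input x_i
\[
  y_i = f(\mathbf{x}_i) + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \ \text{i.i.d.}
\]

% Upper confidence bound (maximization): mu and s are the GP posterior mean
% and standard deviation; beta is the exploration hyperparameter
\[
  \alpha_{\mathrm{UCB}}(\mathbf{x}) = \mu(\mathbf{x}) + \sqrt{\beta}\, s(\mathbf{x})
\]

% Batch (q-point) expected improvement over the best observed value f^+
\[
  \alpha_{q\text{-EI}}(\mathbf{x}_1, \dots, \mathbf{x}_q)
  = \mathbb{E}\!\left[\max\!\left(0,\ \max_{1 \le j \le q} f(\mathbf{x}_j) - f^{+}\right)\right]
\]
```

Under noise, $f^{+}$ is itself uncertain, which is one reason the choice and definition of the batch criterion matters for the reported comparisons.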
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our synthetic study of batch Bayesian optimization. We address each major comment below and have revised the manuscript to incorporate the suggested improvements where feasible.
Point-by-point responses
- Referee: [Abstract and results sections] The headline claim that noise impact is landscape-dependent (dramatic degradation on Ackley, rising local-optimum probability on Hartmann) and the recommendation that 'prior knowledge of the problem domain structure and noise level is essential' rest on only two 6-D benchmark functions with a simple additive noise model. The manuscript should add explicit discussion or supplementary experiments mapping the modality, constraint structure, and noise statistics of these functions to representative materials problems (e.g., composition optimization or molecular design) to support generalizability.
  Authors: We acknowledge that the study relies on two specific 6D benchmark functions chosen as proxies for distinct materials optimization landscapes. In the revised manuscript, we have added explicit discussion in the Introduction and Conclusions sections mapping Ackley's needle-in-a-haystack structure (high-dimensional search with a single sharp global optimum) to molecular design tasks and Hartmann's multi-modality to composition optimization, including references to literature on typical noise characteristics in those domains. We have also clarified the relevance of the additive Gaussian noise model to experimental measurement uncertainty. However, new supplementary experiments using real materials datasets fall outside the scope of this controlled synthetic analysis, which prioritizes isolation of effects with known ground truth.
  Revision: partial
- Referee: [Methods or experimental setup] Without reported details on the number of independent runs, statistical tests for performance differences, or data-exclusion rules, it is difficult to assess whether the observed trends (e.g., increasing local-optimum trapping with noise in Hartmann) are robust or sensitive to post-hoc choices. Adding error bars or confidence intervals on learning curves and metrics would strengthen the evidence.
  Authors: We agree that these details were insufficiently reported. The revised manuscript now includes a new subsection in Methods specifying the number of independent runs (20 per configuration), the application of paired t-tests for assessing performance differences, and explicit data-handling rules. All learning curves and performance metrics have been updated to include error bars representing one standard deviation across runs.
  Revision: yes
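The reporting described in the second response (20 runs per configuration, one-standard-deviation error bars, paired t-tests) can be sketched as follows. This is an illustrative sketch only: the run data are random placeholders rather than BO results, the pairing of runs across two configurations is assumed to come from shared initial designs, and all function names are hypothetical. Only the t statistic and degrees of freedom are computed here; a p-value would need a t-distribution CDF, for example via `scipy.stats.ttest_rel`.

```python
import math
import random
import statistics


def best_so_far(values):
    """Learning curve: running minimum of the observed values (minimization)."""
    best, curve = math.inf, []
    for v in values:
        best = min(best, v)
        curve.append(best)
    return curve


def mean_std_curve(runs):
    """Pointwise mean and sample standard deviation of learning curves across runs."""
    curves = [best_so_far(r) for r in runs]
    means = [statistics.mean(col) for col in zip(*curves)]
    stds = [statistics.stdev(col) for col in zip(*curves)]
    return means, stds


def paired_t_statistic(a, b):
    """Paired t statistic for b - a and its degrees of freedom (n - 1)."""
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Placeholder data: 20 runs of 30 evaluations for two hypothetical configurations.
rng = random.Random(0)
n_runs, n_evals = 20, 30
runs_a = [[rng.gauss(0.0, 1.0) for _ in range(n_evals)] for _ in range(n_runs)]
runs_b = [[v + 0.5 + rng.uniform(-0.1, 0.1) for v in run] for run in runs_a]

means, stds = mean_std_curve(runs_a)  # curve and error-bar half-widths to plot
final_a = [best_so_far(r)[-1] for r in runs_a]
final_b = [best_so_far(r)[-1] for r in runs_b]
t_stat, df = paired_t_statistic(final_a, final_b)
print("final mean +/- std (config A):", means[-1], stds[-1])
print("paired t statistic:", t_stat, "with df =", df)
```

A paired test is only appropriate when run i of one configuration shares something with run i of the other, such as the same initial samples; otherwise an unpaired comparison would be the safer choice.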
Circularity Check
No circularity: forward simulation on external benchmarks
full rationale
The paper reports results from direct forward simulations of batch Bayesian optimization on two standard external benchmark functions (Ackley and Hartmann) with controlled additive noise. No parameters are fitted to the target outcomes and then re-used as predictions; no self-citations are invoked to justify uniqueness or ansatzes; the reported landscape-dependent noise effects are computed outputs rather than definitions or renamings of the inputs. The derivation chain is therefore self-contained against independent benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Ackley and Hartmann functions plus the chosen noise model are representative of materials optimization problems
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We apply batch BO to the 6D Ackley function, representative of needle-in-a-haystack search problems... and the 6D Hartmann function, representative of a problem with a false maximum...
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We find that the effects of noise depend on the problem landscape: noise degrades the optimization results of a needle-in-a-haystack search (Ackley) dramatically more.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Choosing a Suitable Acquisition Function for Batch Bayesian Optimization: Comparison of Serial and Monte Carlo Approaches
  Empirical comparison finds qUCB outperforms qlogEI and matches or exceeds UCB/LP for convergence in noiseless and noisy 6D optimization, recommending it as default for unknown landscapes.