Multi-Variable Batch Bayesian Optimization in Materials Research: Synthetic Data Analysis of Noise Sensitivity and Problem Landscape Effects

Anna Ernst; Anusha Srivastava; Armi Tiihonen; Imon Mia; Julia W.P. Hsu; Tonio Buonassisi; William Vandenberghe

arXiv: 2504.03943 · v3 · submitted 2025-04-04 · stat.ML · cond-mat.mtrl-sci · cs.LG


Imon Mia, Armi Tiihonen, Anna Ernst, Anusha Srivastava, Tonio Buonassisi, William Vandenberghe, Julia W.P. Hsu


keywords: noise, optimization, materials, experimental, research, batch, function, landscape



Bayesian optimization (BO), a machine learning method, is increasingly used to guide experimental optimization tasks in materials science. To emulate the large number of input variables and the noisy results typical of experimental materials research, we perform batch BO simulations with six design variables over a range of noise levels. Two test cases relevant to materials science problems are examined: a needle-in-a-haystack case (Ackley function) that may be encountered in, e.g., molecule optimization, and a smooth landscape with a local optimum in addition to the global optimum (Hartmann function) that may be encountered in, e.g., material composition optimization. We show learning curves, performance metrics, and visualizations to track the optimization progression and to evaluate how the optimization outcomes are affected by noise, batch-picking method, choice of acquisition function, and exploration hyperparameter values. We find that the effects of noise depend on the problem landscape: noise degrades the results of the needle-in-a-haystack search (Ackley) dramatically more than those of the smooth landscape (Hartmann). In Hartmann, however, increasing noise raises the probability of landing on the local optimum. Therefore, prior knowledge of the problem domain structure and noise level is essential when designing BO for materials research experiments. Synthetic data studies, with known ground truth and controlled noise levels, enable us to isolate and evaluate the impact of different batch BO components, e.g., acquisition policy, objective metrics, and hyperparameter values, before transitioning to the inherent uncertainties of real experimental systems. The results and methodology of this study will facilitate greater use of BO in guiding experimental materials research, specifically in settings with a large number of design variables to optimize.
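To make the two test landscapes concrete, here is a minimal Python sketch of the six-variable Ackley and Hartmann functions with additive noise. It rests on several assumptions not stated in the abstract: the functions are written in their standard textbook minimization forms with the commonly tabulated Hartmann-6 constants, the noise is zero-mean Gaussian added to the function value, and the helper names (`ackley`, `hartmann6`, `noisy`) are hypothetical. The paper's own scaling, sign convention, and noise model may differ.

```python
import math
import random

# Commonly tabulated Hartmann-6 constants (assumed here, not taken from the paper).
ALPHA = [1.0, 1.2, 3.0, 3.2]
A = [
    [10.0, 3.0, 17.0, 3.5, 1.7, 8.0],
    [0.05, 10.0, 17.0, 0.1, 8.0, 14.0],
    [3.0, 3.5, 1.7, 10.0, 17.0, 8.0],
    [17.0, 8.0, 0.05, 10.0, 0.1, 14.0],
]
P = [[v * 1e-4 for v in row] for row in [
    [1312, 1696, 5569, 124, 8283, 5886],
    [2329, 4135, 8307, 3736, 1004, 9991],
    [2348, 1451, 3522, 2883, 3047, 6650],
    [4047, 8828, 8732, 5743, 1091, 381],
]]

# Known location of the Hartmann-6 global minimum (value about -3.322).
HARTMANN6_ARGMIN = [0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573]


def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Standard Ackley function: a narrow global minimum of 0 at the origin
    surrounded by a nearly flat, rippled plateau (needle in a haystack)."""
    d = len(x)
    mean_sq = sum(v * v for v in x) / d
    mean_cos = sum(math.cos(c * v) for v in x) / d
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e


def hartmann6(x):
    """Standard six-dimensional Hartmann function on [0, 1]^6: a smooth
    landscape with local optima in addition to the global one."""
    total = 0.0
    for alpha_i, a_row, p_row in zip(ALPHA, A, P):
        inner = sum(a_ij * (x_j - p_ij) ** 2
                    for a_ij, x_j, p_ij in zip(a_row, x, p_row))
        total += alpha_i * math.exp(-inner)
    return -total


def noisy(f, sigma, rng):
    """Wrap f so each evaluation gains zero-mean Gaussian noise of standard
    deviation sigma (an assumed noise model for illustration)."""
    def wrapped(x):
        return f(x) + rng.gauss(0.0, sigma)
    return wrapped


if __name__ == "__main__":
    rng = random.Random(0)
    noisy_hartmann = noisy(hartmann6, sigma=0.1, rng=rng)
    # Repeated noisy measurements at the same point scatter around the true value.
    print([round(noisy_hartmann(HARTMANN6_ARGMIN), 3) for _ in range(3)])
```

Wrapping the ground-truth function rather than baking noise into it keeps the noiseless value available for scoring, which is what allows a synthetic study to compare the optimizer's picks against the known optimum.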


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Choosing a Suitable Acquisition Function for Batch Bayesian Optimization: Comparison of Serial and Monte Carlo Approaches
physics.comp-ph 2025-06 conditional novelty 5.0

Empirical comparison finds qUCB outperforms qlogEI and matches or exceeds UCB/LP for convergence in noiseless and noisy 6D optimization, recommending it as default for unknown landscapes.