Goal-Oriented Bayesian Optimal Experimental Design for Nonlinear Models using Markov Chain Monte Carlo

Shijie Zhong; Tommie Catanach; Wanggang Shen; Xun Huan

arXiv: 2403.18072 · v2 · submitted 2024-03-26 · stat.CO · cs.LG · stat.ME · stat.ML


Pith reviewed 2026-05-24 02:47 UTC · model grok-4.3


keywords: optimal experimental design · goal-oriented OED · expected information gain · Markov chain Monte Carlo · kernel density estimation · Bayesian optimization · nonlinear models · sensor placement


The pith

A nested Monte Carlo estimator with MCMC and kernel density estimation computes expected information gain directly on predictive quantities of interest for nonlinear optimal experimental design.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a goal-oriented Bayesian optimal experimental design framework that targets expected information gain on quantities of interest rather than on model parameters themselves. It introduces a nested Monte Carlo estimator that draws posterior samples via Markov chain Monte Carlo and uses kernel density estimation to evaluate the Kullback-Leibler divergence between the prior-predictive and posterior-predictive densities of the quantities of interest. The resulting estimator is then maximized over the space of possible experimental designs by Bayesian optimization. The approach applies to nonlinear observation and prediction models and is illustrated on test cases plus a sensor-placement problem for convection-diffusion source inversion. A sympathetic reader would care because many applications require uncertainty reduction in specific predictions, not in the underlying parameters.

Core claim

The authors present a predictive goal-oriented OED framework that seeks the experimental design providing the greatest expected information gain on the quantities of interest. They propose a nested Monte Carlo estimator for the QoI EIG featuring Markov chain Monte Carlo for posterior sampling and kernel density estimation for evaluating the posterior-predictive density and its Kullback-Leibler divergence from the prior-predictive. The GO-OED design is then found by maximizing the EIG over the design space using Bayesian optimization.
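
For orientation, the quantity described above can be written out as follows. The notation is an assumption made here for illustration (d for the design, y for the observation, z for the QoI, hats for kernel density estimates, N_out and N_in for the outer and inner sample sizes); the paper's own symbols may differ.

```latex
% Expected information gain on the QoI z for design d (notation assumed here)
U(d) = \mathbb{E}_{y \mid d}\!\left[ D_{\mathrm{KL}}\big( p(z \mid y, d) \,\|\, p(z) \big) \right]
     = \int\!\!\int p(y \mid d)\, p(z \mid y, d)\,
       \log \frac{p(z \mid y, d)}{p(z)} \, dz \, dy

% Nested Monte Carlo form: outer draws y^{(i)} \sim p(y \mid d),
% inner posterior-predictive draws z^{(i,j)} \sim p(z \mid y^{(i)}, d)
U(d) \approx \frac{1}{N_{\mathrm{out}}} \sum_{i=1}^{N_{\mathrm{out}}}
       \frac{1}{N_{\mathrm{in}}} \sum_{j=1}^{N_{\mathrm{in}}}
       \Big[ \log \hat{p}\big(z^{(i,j)} \mid y^{(i)}, d\big) - \log \hat{p}\big(z^{(i,j)}\big) \Big]
```

The GO-OED design is then the maximizer of U(d) over the design space.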

What carries the argument

Nested Monte Carlo estimator for QoI expected information gain, which uses MCMC posterior sampling and kernel density estimation to compute the KL divergence between prior-predictive and posterior-predictive densities.
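
The structure of such an estimator can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the observation model `observe`, the prediction model `predict`, the noise level `NOISE_SD`, the standard-normal prior, the random-walk Metropolis sampler with its hand-tuned step, and the Silverman rule-of-thumb bandwidth are all hypothetical choices made here.

```python
import math
import random
import statistics

NOISE_SD = 0.5  # assumed observation-noise standard deviation (illustrative)


def observe(theta, d):
    """Hypothetical nonlinear observation model G(theta, d)."""
    return d * (theta + 0.1 * theta ** 3)


def predict(theta):
    """Hypothetical nonlinear prediction model H(theta) that returns the QoI z."""
    return math.sin(theta)


def log_posterior(theta, y, d):
    """Unnormalized log posterior: N(0, 1) prior times Gaussian likelihood."""
    resid = (y - observe(theta, d)) / NOISE_SD
    return -0.5 * theta ** 2 - 0.5 * resid ** 2


def metropolis(theta0, y, d, n_samples, rng, thin=2):
    """Random-walk Metropolis chain targeting p(theta | y, d)."""
    step = 1.0 / (1.0 + abs(d))  # crude hand-tuned proposal scale (assumption)
    theta, logp = theta0, log_posterior(theta0, y, d)
    chain = []
    for k in range(n_samples * thin):
        prop = theta + rng.gauss(0.0, step)
        logp_prop = log_posterior(prop, y, d)
        # 1 - random() lies in (0, 1], so the logarithm is always defined
        if math.log(1.0 - rng.random()) < logp_prop - logp:
            theta, logp = prop, logp_prop
        if (k + 1) % thin == 0:
            chain.append(theta)
    return chain


def silverman(samples):
    """Rule-of-thumb Gaussian-KDE bandwidth, floored to avoid a zero width."""
    return max(1.06 * statistics.stdev(samples) * len(samples) ** -0.2, 1e-3)


def kde_logpdf(x, samples, h):
    """Log of a one-dimensional Gaussian kernel density estimate at x."""
    dens = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
    dens /= len(samples) * h * math.sqrt(2.0 * math.pi)
    return math.log(max(dens, 1e-300))


def qoi_eig(d, n_outer=16, n_inner=120, n_prior=250, seed=0):
    """Nested Monte Carlo estimate of the QoI EIG at design d."""
    rng = random.Random(seed)
    # Prior-predictive QoI samples and their KDE bandwidth (shared by all outer draws)
    z_prior = [predict(rng.gauss(0.0, 1.0)) for _ in range(n_prior)]
    h_prior = silverman(z_prior)
    total = 0.0
    for _ in range(n_outer):
        # Outer loop: simulate one data set from the prior and the likelihood
        theta_true = rng.gauss(0.0, 1.0)
        y = observe(theta_true, d) + rng.gauss(0.0, NOISE_SD)
        # Inner loop: MCMC posterior samples pushed through the prediction model.
        # The data-generating theta is itself a posterior draw, so no burn-in is used.
        z_post = [predict(t) for t in metropolis(theta_true, y, d, n_inner, rng)]
        h_post = silverman(z_post)
        # KL divergence from prior-predictive to posterior-predictive via KDE
        total += sum(
            kde_logpdf(z, z_post, h_post) - kde_logpdf(z, z_prior, h_prior)
            for z in z_post
        ) / n_inner
    return total / n_outer


if __name__ == "__main__":
    for design in (0.0, 1.0, 2.0):
        print(design, qoi_eig(design))
```

In this toy setup the design d = 0 makes the observation pure noise, so its estimate should sit near zero up to KDE and Monte Carlo error, while larger d should score higher.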

If this is right

GO-OED designs differ from conventional parameter-focused OED by allocating information specifically to the quantities of interest.
The method applies to nonlinear observation and prediction models where direct analytic expressions for EIG are unavailable.
Bayesian optimization over the design space locates the design that maximizes the approximated QoI EIG.
The framework is demonstrated on sensor placement for source inversion in a convection-diffusion field.
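
The Bayesian-optimization step in the list above can be illustrated with a small self-contained loop: a Gaussian-process surrogate fitted to noisy utility evaluations, with an upper-confidence-bound acquisition maximized over a grid. Everything specific here is an assumption for illustration: the kernel hyperparameters `AMP`, `LENGTH`, `JITTER`, the exploration weight `kappa`, the one-dimensional design interval [0, 1], and the stand-in objective `noisy_utility`, which replaces an actual EIG estimator and has its peak placed at d = 0.7 by construction.

```python
import math
import random

AMP, LENGTH, JITTER = 0.1, 0.2, 1e-3  # assumed GP hyperparameters (illustrative)


def kernel(a, b):
    """Squared-exponential covariance between two scalar designs."""
    return AMP * math.exp(-0.5 * ((a - b) / LENGTH) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def noisy_utility(d, rng):
    """Stand-in for a noisy EIG estimate; the peak at d = 0.7 is hypothetical."""
    return -(d - 0.7) ** 2 + rng.gauss(0.0, 0.01)


def bayes_opt(n_init=3, n_iter=12, kappa=2.0, seed=1):
    """GP-UCB search over a grid on [0, 1]; returns best design and all visits."""
    rng = random.Random(seed)
    grid = [i / 100 for i in range(101)]
    X = [rng.random() for _ in range(n_init)]  # random initial designs
    Y = [noisy_utility(x, rng) for x in X]
    for _ in range(n_iter):
        mean_y = sum(Y) / len(Y)  # constant GP mean set to the data average
        K = [[kernel(a, b) + (JITTER if i == j else 0.0)
              for j, b in enumerate(X)] for i, a in enumerate(X)]
        L = cholesky(K)
        alpha = chol_solve(L, [y - mean_y for y in Y])
        best_d, best_ucb = grid[0], -math.inf
        for d in grid:
            k_star = [kernel(a, d) for a in X]
            mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
            v = chol_solve(L, k_star)
            var = max(AMP - sum(k * w for k, w in zip(k_star, v)), 0.0)
            ucb = mu + kappa * math.sqrt(var)  # upper confidence bound
            if ucb > best_ucb:
                best_d, best_ucb = d, ucb
        X.append(best_d)
        Y.append(noisy_utility(best_d, rng))
    i_best = max(range(len(Y)), key=Y.__getitem__)
    return X[i_best], X


if __name__ == "__main__":
    d_star, visited = bayes_opt()
    print("best visited design:", d_star)
```

A surrogate-based search of this kind is attractive when each utility evaluation is itself a nested Monte Carlo run, since it spends evaluations where the surrogate is either promising or uncertain rather than on a full grid.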

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Replacing kernel density estimation with other density estimators could extend the method to higher-dimensional quantities of interest.
The same nested-sampling structure could be paired with sequential design strategies that update after each experiment.
Applications in domains such as medical imaging or environmental monitoring may gain from prioritizing predictive accuracy over parameter recovery.

Load-bearing premise

The nested Monte Carlo estimator with MCMC sampling and KDE accurately approximates the true QoI EIG for nonlinear models, and Bayesian optimization reliably identifies the global maximizing design.

What would settle it

In a low-dimensional nonlinear test model where the QoI EIG can be computed to high accuracy by direct integration or very large reference samples, compare the nested Monte Carlo estimate against that reference value and check whether the discrepancy shrinks with increasing sample size.
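
A simpler sanity check of the same kind, offered here as an illustration and not taken from the paper, uses a linear-Gaussian model where the QoI EIG has a closed form. The symbols below (prior variance, noise variance, affine QoI coefficients a and b) are assumptions introduced for this example.

```latex
% Assumed toy model: scalar parameter, linear observation, affine QoI
\theta \sim \mathcal{N}(0, \sigma_\theta^2), \qquad
y = d\,\theta + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma_\epsilon^2), \qquad
z = a\,\theta + b, \quad a \neq 0

% The QoI EIG is the mutual information between z and y. Because z is an
% invertible function of \theta, it equals the parameter EIG:
U(d) = I(z; y \mid d) = I(\theta; y \mid d)
     = \tfrac{1}{2} \log\!\left( 1 + \frac{d^2 \sigma_\theta^2}{\sigma_\epsilon^2} \right)
```

An estimator that fails to approach this value as the outer and inner sample sizes grow would be suspect before any nonlinear case is attempted. The check does not exercise the nonlinear regime itself, where a large-sample or quadrature reference is still needed.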

Figures

Figures reproduced from arXiv: 2403.18072 by Shijie Zhong, Tommie Catanach, Wanggang Shen, Xun Huan.

**Figure 2.** Case BM: expected utility (left) and optimized KDE bandwidth in ADBW across [PITH_FULL_IMAGE:figures/full_fig_p011_2.png]

**Figure 3.** Case T1: expected utility (left) and optimized KDE bandwidth in ADBW across [PITH_FULL_IMAGE:figures/full_fig_p012_3.png]

**Figure 4.** Case BM, T1, T2, T3: expected utility comparisons. The benchmark case (BM, GRID) has [PITH_FULL_IMAGE:figures/full_fig_p013_4.png]

**Figure 5.** Case T3: example posterior distributions. The left plot conditions on [PITH_FULL_IMAGE:figures/full_fig_p014_5.png]

**Figure 6.** Case T3: Prior-predictive distribution and example posterior-predictive distributions. The bottom [PITH_FULL_IMAGE:figures/full_fig_p015_6.png]

**Figure 7.** Expected utility comparisons for the 2D θ case.

Expanding further to 2D design and 2D observation spaces, the next case involves a multidimensional observation model:

$$
y = G(\theta, d) + \epsilon =
\begin{bmatrix}
\theta_1^3 d_1^2 + \theta_2 \exp\left(-|0.2 - d_2|\right) \\
\theta_2^3 d_1^2 + \theta_1 \exp\left(-|0.2 - d_2|\right)
\end{bmatrix} + \epsilon \tag{34}
$$

where $\epsilon \sim \mathcal{N}(0, 10^{-4} I)$. The prediction models are the same Easom and Rosenbrock equations from Eqn. (32) and Eqn. (33), respectively. The…

**Figure 8.** Expected utility contours for the 2D θ, d, and y case.

… dimensional d. An optimization algorithm, such as the BO algorithm presented in Sec. 3.3, would be more efficient to seek out d* directly. Here we assess the BO performance in searching for the optimal design via the Easom prediction subcase. As seen in Fig. 9a, BO first randomly selects 3 initial points (black) and then uses the UCB acquisition functi…

**Figure 9.** BO visited locations (left) and optimization convergence history (right) for the Easom prediction [PITH_FULL_IMAGE:figures/full_fig_p017_9.png]

**Figure 10.** Computational cost (left) of each outer loop iteration and each inner loop MCMC, and their ratio [PITH_FULL_IMAGE:figures/full_fig_p018_10.png]

**Figure 11.** Computational cost (left) of each outer loop iteration and each inner loop MCMC, and their ratio [PITH_FULL_IMAGE:figures/full_fig_p019_11.png]

**Figure 12.** Example numerical solutions of the concentration field at different time snapshots with [PITH_FULL_IMAGE:figures/full_fig_p020_12.png]

**Figure 13.** Example comparisons of the concentration field at [PITH_FULL_IMAGE:figures/full_fig_p020_13.png]

**Figure 14.** Convection-diffusion 1-sensor design: non-GO-OED expected utility contour. [PITH_FULL_IMAGE:figures/full_fig_p021_14.png]

**Figure 15.** Convection-diffusion 1-sensor design: GO-OED expected utility contours for subcases (a)–(d), where [PITH_FULL_IMAGE:figures/full_fig_p022_15.png]

**Figure 16.** Convection-diffusion 2-sensor design: GO-OED for subcases (a) and (b), where the predictive QoIs [PITH_FULL_IMAGE:figures/full_fig_p022_16.png]

**Figure 17.** Convection-diffusion 3-sensor design: GO-OED for subcases (a) and (b), where the predictive QoIs [PITH_FULL_IMAGE:figures/full_fig_p023_17.png]

**Figure 18.** Convection-diffusion 2-sensor design: GO-OED where the predictive QoI is the flux across the right [PITH_FULL_IMAGE:figures/full_fig_p023_18.png]

**Figure 19.** Convection-diffusion 3-sensor design: GO-OED where the predictive QoIs are the concentration at [PITH_FULL_IMAGE:figures/full_fig_p024_19.png]

Original abstract

Optimal experimental design (OED) provides a systematic approach to quantify and maximize the value of experimental data. Under a Bayesian approach, conventional OED maximizes the expected information gain (EIG) on model parameters. However, we are often interested in not the parameters themselves, but predictive quantities of interest (QoIs) that depend on the parameters in a nonlinear manner. We present a computational framework of predictive goal-oriented OED (GO-OED) suitable for nonlinear observation and prediction models, which seeks the experimental design providing the greatest EIG on the QoIs. In particular, we propose a nested Monte Carlo estimator for the QoI EIG, featuring Markov chain Monte Carlo for posterior sampling and kernel density estimation for evaluating the posterior-predictive density and its Kullback-Leibler divergence from the prior-predictive. The GO-OED design is then found by maximizing the EIG over the design space using Bayesian optimization. We demonstrate the effectiveness of the overall nonlinear GO-OED method, and illustrate its differences versus conventional non-GO-OED, through various test problems and an application of sensor placement for source inversion in a convection-diffusion field.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a workable nested Monte Carlo scheme with MCMC and KDE to compute goal-oriented EIG on nonlinear QoIs and optimizes the design via Bayesian optimization, with demonstrations on sensor placement.


The main thing here is a computational recipe for goal-oriented OED that targets predictive quantities rather than parameters, using a nested Monte Carlo estimator: MCMC draws from the posterior, KDE approximates the predictive densities, and the KL divergence gives the EIG on the QoI. They then maximize that over designs with Bayesian optimization. This is a direct extension of standard EIG methods to the nonlinear QoI case, and the sensor-placement example in convection-diffusion shows it can produce different designs than the usual parameter-focused version. The test problems illustrate the difference in practice, which is useful for people who care about downstream predictions rather than parameter recovery. The approach builds on established tools without obvious circularity.

The soft spot is the KDE step for the posterior-predictive density. In nonlinear settings the predictive distribution can be multimodal or have varying scale, and the paper does not appear to supply error bounds or convergence rates on the EIG estimator itself. That leaves open how much the approximation error could shift the argmax over designs, especially as QoI dimension grows. Their numerical examples work, but without those controls it is hard to judge robustness beyond the cases shown.

This is the kind of paper that belongs in a computational statistics or uncertainty-quantification venue. Readers working on inverse problems or sensor design will find the framework and the concrete comparison to non-GO-OED useful. It deserves a serious referee because the problem is real and the method is implementable, even if the approximation analysis could be tightened.

Referee Report

1 major / 1 minor

Summary. The paper presents a computational framework for goal-oriented Bayesian optimal experimental design (GO-OED) applicable to nonlinear observation and prediction models. It seeks designs that maximize the expected information gain (EIG) on quantities of interest (QoIs) rather than model parameters. The approach uses a nested Monte Carlo estimator combining Markov chain Monte Carlo for posterior sampling with kernel density estimation to compute the KL divergence between prior- and posterior-predictive densities on the QoIs; the resulting EIG is then maximized over the design space via Bayesian optimization. Effectiveness is illustrated on test problems and a convection-diffusion source inversion sensor placement application.

Significance. If the nested estimator is shown to control approximation error for nonlinear QoIs, the framework would enable predictive goal-oriented design in settings where conventional parameter-focused EIG is insufficient, with direct relevance to applications such as sensor placement. The combination of established MCMC, KDE, and Bayesian optimization tools into a GO-OED pipeline is a practical contribution, though its reliability hinges on the unanalyzed density estimation step.

major comments (1)

[nested Monte Carlo estimator description] Description of the nested Monte Carlo estimator (abstract and methods): the central claim that this estimator accurately recovers the QoI EIG for nonlinear models rests on KDE for the posterior-predictive density and its KL divergence, yet no error bounds, convergence rates, bandwidth sensitivity analysis, or numerical validation of the EIG approximation error are provided. This is load-bearing because, as noted in the skeptic's concern, multimodality or growing QoI dimension can systematically bias the estimated EIG and therefore distort the argmax over designs.
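
The kind of bandwidth sensitivity check this comment asks for can be sketched on a case where the answer is known. The example below is an illustration under stated assumptions, not an analysis from the paper: the "posterior-predictive" and "prior-predictive" are taken to be two one-dimensional Gaussians (parameters `MU`, `SD` chosen arbitrarily) so that the exact KL divergence is available, the baseline bandwidth is Silverman's rule of thumb, and only the posterior-predictive bandwidth is rescaled.

```python
import math
import random
import statistics

MU, SD = 0.5, 0.3  # assumed "posterior-predictive" N(MU, SD^2); "prior-predictive" is N(0, 1)

# Closed-form KL( N(MU, SD^2) || N(0, 1) ), used as the reference value
EXACT_KL = math.log(1.0 / SD) + (SD ** 2 + MU ** 2) / 2.0 - 0.5


def silverman(samples):
    """Rule-of-thumb bandwidth for a one-dimensional Gaussian KDE."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** -0.2


def kde_logpdf(x, samples, h):
    """Log of a one-dimensional Gaussian kernel density estimate at x."""
    dens = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
    dens /= len(samples) * h * math.sqrt(2.0 * math.pi)
    return math.log(max(dens, 1e-300))


def kl_estimate(post, prior, scale=1.0):
    """Sample-average KL estimate with the posterior bandwidth rescaled by `scale`."""
    h_post = scale * silverman(post)
    h_prior = silverman(prior)
    return sum(
        kde_logpdf(z, post, h_post) - kde_logpdf(z, prior, h_prior) for z in post
    ) / len(post)


def bandwidth_sweep(scales=(0.25, 1.0, 4.0), n=300, seed=0):
    """KL estimates for several bandwidth multipliers on one fixed sample set."""
    rng = random.Random(seed)
    post = [rng.gauss(MU, SD) for _ in range(n)]
    prior = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return {s: kl_estimate(post, prior, s) for s in scales}


if __name__ == "__main__":
    print("exact KL:", EXACT_KL)
    for scale, value in bandwidth_sweep().items():
        print("bandwidth x", scale, "->", value)
```

Undersmoothing inflates the in-sample log density and pushes the estimate up, while oversmoothing flattens the posterior-predictive density and pulls it down, so a sweep like this gives a rough sense of how far the bandwidth alone can move an estimated EIG.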

minor comments (1)

[Introduction] The abstract and introduction would benefit from explicit comparison of the proposed QoI EIG estimator against existing nested MC or importance-sampling alternatives for EIG computation.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive review. We address the single major comment below.


Referee: Description of the nested Monte Carlo estimator (abstract and methods): the central claim that this estimator accurately recovers the QoI EIG for nonlinear models rests on KDE for the posterior-predictive density and its KL divergence, yet no error bounds, convergence rates, bandwidth sensitivity analysis, or numerical validation of the EIG approximation error are provided. This is load-bearing because, as noted in the skeptic's concern, multimodality or growing QoI dimension can systematically bias the estimated EIG and therefore distort the argmax over designs.

Authors: We agree that the manuscript lacks explicit error analysis for the KDE step in the nested estimator. The current demonstrations rely on the test problems and convection-diffusion application to show that the resulting designs are sensible, but this does not directly quantify estimator bias or sensitivity. In the revision we will add a new subsection (in Methods or Results) that reports: (i) bandwidth sensitivity sweeps on the low-dimensional test problems, (ii) numerical checks of EIG approximation error against a high-sample reference where feasible, and (iii) additional experiments with deliberately multimodal QoI distributions. Theoretical convergence rates for arbitrary nonlinear QoIs remain difficult to obtain and will not be claimed; the added numerical evidence will instead support practical reliability for the regimes considered in the paper.

Revision: yes

Circularity Check

0 steps flagged

No circularity in GO-OED nested Monte Carlo estimator


The paper's central contribution is a computational approximation (nested MC + MCMC posterior sampling + KDE for KL divergence between prior- and posterior-predictive densities) for the QoI EIG; this is an estimator built from standard, externally validated components (MCMC, KDE, Bayesian optimization) rather than any quantity defined in terms of itself or a fitted input renamed as a prediction. No self-citation chains, uniqueness theorems, or ansatzes are invoked to force the result, and the method does not reduce the EIG computation to its own inputs by construction. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Based on abstract only; no explicit free parameters, invented entities, or ad-hoc axioms detailed. Standard assumptions for MCMC convergence and density estimation are implicit.

axioms (2)

Domain assumption: MCMC chains converge to the target posterior distribution.
Invoked for posterior sampling in the nested MC estimator (abstract description of MCMC use).
Domain assumption: Kernel density estimation provides a sufficiently accurate estimate of the predictive densities for KL divergence computation.
Used to evaluate the posterior-predictive density and its divergence from prior-predictive.
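
The first assumption in this ledger is usually checked empirically. One common diagnostic, shown here as a hedged illustration and not as something the paper is stated to use, is the Gelman-Rubin potential scale reduction factor computed from several chains started at dispersed points. The standard-normal target, the starting points, and the two step sizes below are assumptions chosen to contrast a chain that mixes with one that does not.

```python
import math
import random
import statistics


def metropolis_std_normal(start, n_steps, step, rng):
    """Random-walk Metropolis chain targeting a standard normal density."""
    theta, chain = start, []
    for _ in range(n_steps):
        prop = theta + rng.gauss(0.0, step)
        # Log acceptance ratio for the N(0, 1) target
        if math.log(1.0 - rng.random()) < 0.5 * (theta ** 2 - prop ** 2):
            theta = prop
        chain.append(theta)
    return chain


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length scalar chains."""
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


def run_chains(step, n_steps=2000, seed=0):
    """Four chains from dispersed starts, with the first half discarded as burn-in."""
    rng = random.Random(seed)
    starts = (-3.0, -1.0, 1.0, 3.0)
    return [
        metropolis_std_normal(s, n_steps, step, rng)[n_steps // 2:] for s in starts
    ]


if __name__ == "__main__":
    print("well-tuned step :", gelman_rubin(run_chains(step=1.0)))
    print("tiny step (stuck):", gelman_rubin(run_chains(step=0.001)))
```

Values close to 1 are consistent with the chains sampling the same distribution, while values well above 1 indicate that they have not mixed. A value near 1 does not prove convergence, which is why the ledger lists it as an assumption.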



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: nested Monte Carlo estimator for the QoI EIG, featuring Markov chain Monte Carlo for posterior sampling and kernel density estimation for evaluating the posterior-predictive density and its Kullback-Leibler divergence

IndisputableMonolith/Foundation/DimensionForcing.lean · alexander_duality_circle_linking · unclear
Paper passage: GO-OED design is then found by maximizing the EIG over the design space using Bayesian optimization

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Variational Sequential Optimal Experimental Design using Reinforcement Learning
stat.ML · 2023-06 · unverdicted · novelty 7.0
vsOED uses a variational one-point reward and RL policy optimization to provide a lower bound on expected information gain for sequential experimental design, supporting nuisance parameters, implicit likelihoods, and ...

Intelligent data collection for network discrimination in material flow analysis using Bayesian optimal experimental design
stat.AP · 2025-04 · unverdicted · novelty 6.0
A bias-reduced Bayesian optimal experimental design procedure using Kullback-Leibler divergence is shown to select high-value steel mass-flow observations that reduce network-structure uncertainty in a U.S. steel MFA,...



work page 1996

[17] [17]

Duong, T

D.-L. Duong, T. Helin, and J. R. Rojo-Garcia. Stability estimates for the expected utility in Bayesian optimal experimental design. Inverse Problems, 39(12):125008, 2023

work page 2023

[18] [18]

D. J. Earl and M. W. Deem. Parallel tempering: Theory, applications, and new perspectives. Physical Chemistry Chemical Physics , 7(23):3910–3916, 2005

work page 2005

[19] [19]

Englezou, T

Y. Englezou, T. W. Waite, and D. C. Woods. Approximate Laplace importance sampling for the estimation of expected Shannon information gain in high-dimensional Bayesian design for nonlinear models. Statistics and Computing , 32(5):82, 2022

work page 2022

[20] [20]

V. V. Fedorov. Theory of Optimal Experiments . Academic Press, 1972

work page 1972

[21] [21]

A layered multiple importance sampling scheme for focused optimal Bayesian experimental design

C. Feng and Y. M. Marzouk. A layered multiple importance sampling scheme for focused optimal Bayesian experimental design. arXiv preprint, arXiv:1903.11187, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1903

[22] [22]

Foreman-Mackey, D

D. Foreman-Mackey, D. W. Hogg, D. Lang, and J. Goodman. emcee: The MCMC hammer. Publications of the Astronomical Society of the Pacific , 125(925):306–312, 2013

work page 2013

[23] [23]

Foster, M

A. Foster, M. Jankowiak, E. Bingham, P. Horsfall, Y. W. Teh, T. Rainforth, and N. Goodman. Variational Bayesian optimal experimental design. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d 'Alch´ e-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 14036–14047. Curran Associates, 2019

work page 2019

[24] [24]

P. I. Frazier. Bayesian optimization. INFORMS TutORials in Operations Research, 2018:255–278, 2018

work page 2018

[25] [25]

Goodman and J

J. Goodman and J. Weare. Ensemble samplers with affine invariance. Communications in Applied Mathematics and Computational Science , 5(1):65–80, 2010

work page 2010

[26] [26]

R. B. Gramacy. Surrogates. Chapman and Hall/CRC, 2020

work page 2020

[27] [27]

X. Huan, J. Jagalur, and Y. Marzouk. Optimal experimental design: Formulations and compu- tations. Acta Numerica, 33:715–840, 2024

work page 2024

[28] [28]

Huan and Y

X. Huan and Y. M. Marzouk. Simulation-based optimal Bayesian experimental design for non- linear systems. Journal of Computational Physics , 232(1):288–317, 2013

work page 2013

[29] [29]

Huan and Y

X. Huan and Y. M. Marzouk. Gradient-based stochastic optimization methods in Bayesian ex- perimental design. International Journal for Uncertainty Quantification , 4(6):479–510, 2014. 26

work page 2014

[30] [30]

D. R. Jones, M. Schonlau, and W. J. Welch. Efficient global optimization of expensive black-box functions. Journal of Global Optimization , 13:455–492, 1998

work page 1998

[31] [31]

M. C. Jones, J. S. Marron, and S. J. Sheather. A brief survey of bandwidth selection for density estimation. Journal of the American Statistical Association , 91(433):401–407, 1996

work page 1996

[32] [32]

Kleinegesse and M

S. Kleinegesse and M. U. Gutmann. Bayesian experimental design for implicit models by mutual information neural estimation. In H. Daum´ e and A. Singh, editors,Proceedings of the 37th Inter- national Conference on Machine Learning (ICML 2020) , volume 119 of Proceedings of Machine Learning Research, pages 5316–5326. PMLR, 2020

work page 2020

[33] [33]

J. Latz, J. P. Madrigal-Cianci, F. Nobile, and R. Tempone. Generalized parallel tempering on Bayesian inverse problems. Statistics and Computing , 31(5):67, 2021

work page 2021

[34] [34]

B. Leonard. A stable and accurate convective modelling procedure based on quadratic upstream interpolation. Computer Methods in Applied Mechanics and Engineering , 19(1):59–98, 1979

work page 1979

[35] [35]

D. V. Lindley. On a measure of the information provided by an experiment. The Annals of Mathematical Statistics, 27(4):986–1005, 1956

work page 1956

[36] [36]

Q. Long, M. Scavino, R. Tempone, and S. Wang. Fast estimation of expected information gains for Bayesian experimental designs based on Laplace approximations. Computer Methods in Applied Mechanics and Engineering, 259:24–39, 2013

work page 2013

[37] [37]

Moˇ ckus

J. Moˇ ckus. On Bayesian methods for seeking the extremum. In Optimization Techniques IFIP Technical Conference, pages 400–404, 1975

work page 1975

[38] [38]

Nocedal and S

J. Nocedal and S. J. Wright. Numerical Optimization. Springer New York, 2006

work page 2006

[39] [39]

Nogueira

F. Nogueira. Bayesian Optimization: Open source constrained global optimization tool for Python, 2014

work page 2014

[40] [40]

A. M. Overstall, J. M. McGree, and C. C. Drovandi. An approach for finding fully Bayesian optimal designs using normal-based approximations to loss functions. Statistics and Computing , 28:343–358, 2018

work page 2018

[41] [41]

B. U. Park and J. S. Marron. Comparison of data-driven bandwidth selectors. Journal of the American Statistical Association, 85(409):66–72, 1990

work page 1990

[42] [42]

Paulin, A

D. Paulin, A. Jasra, and A. Thiery. Error bounds for sequential Monte Carlo samplers for multimodal distributions. Bernoulli, 25(1):310–340, 2019

work page 2019

[43] [43]

Pedregosa, G

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Pret- tenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Re- search, 12:2825–2830, 2011

work page 2011

[44] [44]

Pelikan, D

M. Pelikan, D. E. Goldberg, and E. Cant´ u-Paz. BOA: The Bayesian optimization algorithm. In Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computation (GECCO 1999), page 525–532, 1999

work page 1999

[45] [45]

A Framework for Adaptive MCMC Targeting Multimodal Distributions

E. Pompe, C. Holmes, and K. Latuszy´ nski. A framework for adaptive MCMC targeting multi- modal distributions. arXiv preprint, arXiv:1812.02609, 2018. 27

work page internal anchor Pith review Pith/arXiv arXiv 2018

[46] [46]

Poole, S

B. Poole, S. Ozair, A. Van Den Oord, A. Alemi, and G. Tucker. On variational bounds of mutual information. In Proceedings of the 36th International Conference on Machine Learning (ICML 2019), volume 97 of Proceedings of Machine Learning Research, pages 5171–5180. PMLR, 2019

work page 2019

[47] [47]

Rainforth, A

T. Rainforth, A. Foster, D. R. Ivanova, and F. B. Smith. Modern Bayesian experimental design. Statistical Science, 39(1):100–114, 2024

work page 2024

[48] [48]

C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning . The MIT Press, 2006

work page 2006

[49] [49]

C. P. Robert and G. Casella. Monte Carlo Statistical Methods . Springer New York, 2004

work page 2004

[50] [50]

V. Roy. Convergence diagnostics for Markov chain Monte Carlo. Annual Review of Statistics and Its Application, 7:387–412, 2020

work page 2020

[51] [51]

C. M. Ryan, C. C. Drovandi, and A. N. Pettitt. Optimal Bayesian experimental design for models with intractable likelihoods using indirect inference applied to biological process models. Bayesian Analysis, 11(3):857–883, 2016

work page 2016

[52] [52]

E. G. Ryan, C. C. Drovandi, J. M. Mcgree, and A. N. Pettitt. A review of modern computational algorithms for Bayesian optimal design. International Statistical Review, 84(1):128–154, 2016

work page 2016

[53] [53]

K. J. Ryan. Estimating expected information gains for experimental designs with application to the random fatigue-limit model. Journal of Computational and Graphical Statistics , 12(3):585– 603, 2003

work page 2003

[54] [54]

Shahriari, K

B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2016

work page 2016

[55] [55]

S. A. Sisson, Y. Fan, and M. Beaumont. Handbook of Approximate Bayesian Computation. Chapman and Hall/CRC , 2018

work page 2018

[56] [56]

X. Wang, Y. Jin, S. Schmitt, and M. Olhofer. Recent advances in Bayesian optimization. ACM Computing Surveys, 55(13s):1–36, 2023

work page 2023

[57] [57]

K. Wu, P. Chen, and O. Ghattas. An efficient method for goal-oriented linear Bayesian optimal experimental design: Application to optimal sensor placement. arXiv preprint, arXiv:2102.06627, 2021

work page arXiv 2021

[58] [58]

Zhan and H

D. Zhan and H. Xing. Expected improvement for expensive optimization: A review. Journal of Global Optimization, 78(3):507–544, 2020. 28

work page 2020