Goal-Oriented Bayesian Optimal Experimental Design for Nonlinear Models using Markov Chain Monte Carlo
Pith reviewed 2026-05-24 02:47 UTC · model grok-4.3
The pith
A nested Monte Carlo estimator with MCMC and kernel density estimation computes expected information gain directly on predictive quantities of interest for nonlinear optimal experimental design.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors present a predictive goal-oriented OED framework that seeks the experimental design providing the greatest expected information gain on the quantities of interest. They propose a nested Monte Carlo estimator for the QoI EIG featuring Markov chain Monte Carlo for posterior sampling and kernel density estimation for evaluating the posterior-predictive density and its Kullback-Leibler divergence from the prior-predictive. The GO-OED design is then found by maximizing the EIG over the design space using Bayesian optimization.
What carries the argument
Nested Monte Carlo estimator for QoI expected information gain, which uses MCMC posterior sampling and kernel density estimation to compute the KL divergence between prior-predictive and posterior-predictive densities.
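One way to write down the quantity described above is sketched below. The notation is an assumption introduced here for illustration and is not taken from the paper: \theta for parameters, d for the design, y for the observation, z for the QoI, and \hat{p} for a kernel density estimate. It also assumes the prior-predictive density of z does not depend on d.

```latex
% QoI expected information gain: expected KL divergence of the
% posterior-predictive from the prior-predictive (notation assumed).
U(d) = \mathbb{E}_{y \mid d}\!\left[ D_{\mathrm{KL}}\!\left( p(z \mid y, d) \,\|\, p(z) \right) \right]
     = \int\!\!\int p(z, y \mid d)\, \ln \frac{p(z \mid y, d)}{p(z)} \, dz \, dy

% Nested Monte Carlo form: outer draws y^{(i)} \sim p(y \mid d); inner draws
% z^{(i,j)} obtained by pushing MCMC samples \theta^{(i,j)} \sim p(\theta \mid y^{(i)}, d)
% through the prediction model; both densities replaced by KDEs.
\hat{U}(d) = \frac{1}{N_{\mathrm{out}}} \sum_{i=1}^{N_{\mathrm{out}}}
             \frac{1}{N_{\mathrm{in}}} \sum_{j=1}^{N_{\mathrm{in}}}
             \ln \frac{\hat{p}\!\left(z^{(i,j)} \mid y^{(i)}, d\right)}{\hat{p}\!\left(z^{(i,j)}\right)}
```

The second expression makes the two approximation layers visible: Monte Carlo error from the finite sums, and density-estimation error from replacing p with \hat{p} inside the logarithm.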
If this is right
- GO-OED designs differ from conventional parameter-focused OED by allocating information specifically to the quantities of interest.
- The method applies to nonlinear observation and prediction models where direct analytic expressions for EIG are unavailable.
- Bayesian optimization over the design space locates the design that maximizes the approximated QoI EIG.
- The framework is demonstrated on sensor placement for source inversion in a convection-diffusion field.
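The pipeline in the bullets above can be sketched on a toy scalar problem. Everything specific here is an assumption made for illustration: the observation model observe, the prediction model predict, the noise level SIGMA, the standard normal prior, the random-walk Metropolis sampler, the rule-of-thumb KDE bandwidth, and all sample sizes. A plain grid search stands in for Bayesian optimization, so this is not the authors' implementation.

```python
import math
import random

SIGMA = 0.1  # assumed Gaussian observation-noise standard deviation

def observe(theta, d):
    """Hypothetical nonlinear observation model G(theta, d)."""
    return d * (theta + 0.5 * theta ** 3)

def predict(theta):
    """Hypothetical nonlinear prediction model H(theta) giving the QoI."""
    return math.tanh(theta)

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth for a 1-D Gaussian KDE."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * max(std, 1e-12) * n ** (-0.2)  # floor guards a stuck chain

def kde_logpdf(samples, x, h):
    """Log density at x of a Gaussian KDE with bandwidth h built on samples."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    dens = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm
    return math.log(max(dens, 1e-300))  # floor avoids log(0)

def metropolis(y, d, theta0, rng, n_steps=1500, burn=300, thin=8, step=0.25):
    """Random-walk Metropolis on p(theta | y, d) with a standard normal prior."""
    def log_post(t):
        return -0.5 * t * t - 0.5 * ((y - observe(t, d)) / SIGMA) ** 2
    theta, lp, chain = theta0, log_post(theta0), []
    for k in range(n_steps):
        prop = theta + step * rng.gauss(0.0, 1.0)
        lp_prop = log_post(prop)
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        if k >= burn and (k - burn) % thin == 0:
            chain.append(theta)
    return chain

def qoi_eig(d, n_outer=8, n_prior=300, seed=0):
    """Nested Monte Carlo estimate of the QoI EIG at design d."""
    rng = random.Random(seed)
    # Prior-predictive QoI samples, shared by every outer iteration.
    z_prior = [predict(rng.gauss(0.0, 1.0)) for _ in range(n_prior)]
    h_prior = silverman_bandwidth(z_prior)
    total = 0.0
    for _ in range(n_outer):
        theta_true = rng.gauss(0.0, 1.0)  # outer draw from the prior
        y = observe(theta_true, d) + SIGMA * rng.gauss(0.0, 1.0)
        # Posterior samples pushed through the prediction model.
        z_post = [predict(t) for t in metropolis(y, d, theta_true, rng)]
        h_post = silverman_bandwidth(z_post)
        # KL estimate: mean log-ratio of the two KDEs at the posterior-predictive samples.
        total += sum(kde_logpdf(z_post, z, h_post) - kde_logpdf(z_prior, z, h_prior)
                     for z in z_post) / len(z_post)
    return total / n_outer

def best_design(candidates):
    """Grid search over candidate designs (a stand-in for Bayesian optimization)."""
    return max(candidates, key=qoi_eig)
```

Calling best_design([0.0, 0.25, 0.5, 0.75, 1.0]) evaluates the estimator at each candidate and returns the one with the largest estimate. In this toy model d = 0 makes the observation independent of the parameter, so its estimate should sit near zero up to sampling and KDE error.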
Where Pith is reading between the lines
- Replacing kernel density estimation with other density estimators could extend the method to higher-dimensional quantities of interest.
- The same nested-sampling structure could be paired with sequential design strategies that update after each experiment.
- Applications in domains such as medical imaging or environmental monitoring may gain from prioritizing predictive accuracy over parameter recovery.
Load-bearing premise
The nested Monte Carlo estimator with MCMC sampling and KDE accurately approximates the true QoI EIG for nonlinear models, and Bayesian optimization reliably identifies the global maximizing design.
What would settle it
In a low-dimensional nonlinear test model where the QoI EIG can be computed to high accuracy by direct integration or very large reference samples, compare the nested Monte Carlo estimate against that reference value and check whether the discrepancy shrinks with increasing sample size.
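The shape of such a check can be sketched on the density-estimation step alone. As a simplifying assumption, the example below uses two Gaussians, for which the KL divergence has a closed form, rather than a nonlinear model with a reference computed by integration. The function names, sample sizes, and bandwidth rule are all hypothetical choices for illustration.

```python
import math
import random

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth for a 1-D Gaussian KDE."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)

def kde_logpdf(samples, x, h):
    """Log density at x of a Gaussian KDE with bandwidth h built on samples."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    dens = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm
    return math.log(max(dens, 1e-300))  # floor avoids log(0) in empty tails

def kde_kl_estimate(n, mu, s, rng):
    """KDE-based estimate of KL( N(mu, s^2) || N(0, 1) ) from n samples each."""
    p_train = [rng.gauss(mu, s) for _ in range(n)]
    q_train = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Fresh evaluation points avoid scoring a KDE at its own training samples.
    p_eval = [rng.gauss(mu, s) for _ in range(n)]
    hp, hq = silverman_bandwidth(p_train), silverman_bandwidth(q_train)
    return sum(kde_logpdf(p_train, x, hp) - kde_logpdf(q_train, x, hq)
               for x in p_eval) / n

def reference_kl(mu, s):
    """Closed-form KL( N(mu, s^2) || N(0, 1) ), used as the reference value."""
    return math.log(1.0 / s) + 0.5 * (s ** 2 + mu ** 2) - 0.5

def convergence_table(sizes=(50, 200, 800), mu=1.0, s=0.5, seed=0):
    """Absolute error of the KDE-based estimate at each sample size."""
    rng = random.Random(seed)
    ref = reference_kl(mu, s)
    return [(n, abs(kde_kl_estimate(n, mu, s, rng) - ref)) for n in sizes]
```

Printing convergence_table() gives one (sample size, absolute error) pair per row. A single run is noisy, so a trend is easier to judge after averaging over several seeds.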
Original abstract
Optimal experimental design (OED) provides a systematic approach to quantify and maximize the value of experimental data. Under a Bayesian approach, conventional OED maximizes the expected information gain (EIG) on model parameters. However, we are often interested in not the parameters themselves, but predictive quantities of interest (QoIs) that depend on the parameters in a nonlinear manner. We present a computational framework of predictive goal-oriented OED (GO-OED) suitable for nonlinear observation and prediction models, which seeks the experimental design providing the greatest EIG on the QoIs. In particular, we propose a nested Monte Carlo estimator for the QoI EIG, featuring Markov chain Monte Carlo for posterior sampling and kernel density estimation for evaluating the posterior-predictive density and its Kullback-Leibler divergence from the prior-predictive. The GO-OED design is then found by maximizing the EIG over the design space using Bayesian optimization. We demonstrate the effectiveness of the overall nonlinear GO-OED method, and illustrate its differences versus conventional non-GO-OED, through various test problems and an application of sensor placement for source inversion in a convection-diffusion field.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a computational framework for goal-oriented Bayesian optimal experimental design (GO-OED) applicable to nonlinear observation and prediction models. It seeks designs that maximize the expected information gain (EIG) on quantities of interest (QoIs) rather than model parameters. The approach uses a nested Monte Carlo estimator combining Markov chain Monte Carlo for posterior sampling with kernel density estimation to compute the KL divergence between prior- and posterior-predictive densities on the QoIs; the resulting EIG is then maximized over the design space via Bayesian optimization. Effectiveness is illustrated on test problems and a convection-diffusion source inversion sensor placement application.
Significance. If the nested estimator is shown to control approximation error for nonlinear QoIs, the framework would enable predictive goal-oriented design in settings where conventional parameter-focused EIG is insufficient, with direct relevance to applications such as sensor placement. The combination of established MCMC, KDE, and Bayesian optimization tools into a GO-OED pipeline is a practical contribution, though its reliability hinges on the unanalyzed density estimation step.
major comments (1)
- [nested Monte Carlo estimator description] Description of the nested Monte Carlo estimator (abstract and methods): the central claim that this estimator accurately recovers the QoI EIG for nonlinear models rests on KDE for the posterior-predictive density and its KL divergence, yet no error bounds, convergence rates, bandwidth sensitivity analysis, or numerical validation of the EIG approximation error are provided. This is load-bearing because, as noted in the skeptic's concern, multimodality or growing QoI dimension can systematically bias the estimated EIG and therefore distort the argmax over designs.
minor comments (1)
- [Introduction] The abstract and introduction would benefit from an explicit comparison of the proposed QoI EIG estimator against existing nested MC or importance-sampling alternatives for EIG computation.
Simulated Author's Rebuttal
We thank the referee for the constructive review. We address the single major comment below.
Referee: Description of the nested Monte Carlo estimator (abstract and methods): the central claim that this estimator accurately recovers the QoI EIG for nonlinear models rests on KDE for the posterior-predictive density and its KL divergence, yet no error bounds, convergence rates, bandwidth sensitivity analysis, or numerical validation of the EIG approximation error are provided. This is load-bearing because, as noted in the skeptic's concern, multimodality or growing QoI dimension can systematically bias the estimated EIG and therefore distort the argmax over designs.
Authors: We agree that the manuscript lacks explicit error analysis for the KDE step in the nested estimator. The current demonstrations rely on the test problems and convection-diffusion application to show that the resulting designs are sensible, but this does not directly quantify estimator bias or sensitivity. In the revision we will add a new subsection (in Methods or Results) that reports: (i) bandwidth sensitivity sweeps on the low-dimensional test problems, (ii) numerical checks of EIG approximation error against a high-sample reference where feasible, and (iii) additional experiments with deliberately multimodal QoI distributions. Theoretical convergence rates for arbitrary nonlinear QoIs remain difficult to obtain and will not be claimed; the added numerical evidence will instead support practical reliability for the regimes considered in the paper. revision: yes
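A bandwidth sensitivity sweep of the kind promised in item (i) might look like the sketch below. The bimodal stand-in for a posterior-predictive density, the Gaussian stand-in for the prior-predictive, the scale factors, and the sample size are all assumptions for illustration. None of it comes from the paper or the planned revision.

```python
import math
import random

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth for a 1-D Gaussian KDE."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)

def kde_logpdf(samples, x, h):
    """Log density at x of a Gaussian KDE with bandwidth h built on samples."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    dens = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm
    return math.log(max(dens, 1e-300))  # floor avoids log(0)

def sample_bimodal(rng):
    """Draw from 0.5 N(-2, 0.3^2) + 0.5 N(2, 0.3^2), a stand-in multimodal QoI density."""
    center = -2.0 if rng.random() < 0.5 else 2.0
    return rng.gauss(center, 0.3)

def bandwidth_sweep(scales=(0.25, 1.0, 4.0), n=400, seed=0):
    """KDE-based KL estimate for each multiple of the rule-of-thumb bandwidth."""
    rng = random.Random(seed)
    post_train = [sample_bimodal(rng) for _ in range(n)]
    post_eval = [sample_bimodal(rng) for _ in range(n)]  # held-out evaluation points
    prior_train = [rng.gauss(0.0, 2.0) for _ in range(n)]
    h_prior = silverman_bandwidth(prior_train)
    h_rule = silverman_bandwidth(post_train)
    # The prior-predictive term is held fixed so only the swept bandwidth changes.
    log_prior = [kde_logpdf(prior_train, z, h_prior) for z in post_eval]
    results = {}
    for c in scales:
        log_post = [kde_logpdf(post_train, z, c * h_rule) for z in post_eval]
        results[c] = sum(lp - lq for lp, lq in zip(log_post, log_prior)) / n
    return results
```

In this setup the rule-of-thumb bandwidth is driven by the overall spread of the two modes rather than the width of each one, so it smooths the modes together and lowers the KL estimate relative to a narrower bandwidth. That is one concrete way multimodality could bias an EIG estimate.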
Circularity Check
No circularity in GO-OED nested Monte Carlo estimator
The paper's central contribution is a computational approximation (nested MC + MCMC posterior sampling + KDE for KL divergence between prior- and posterior-predictive densities) for the QoI EIG; this is an estimator built from standard, externally validated components (MCMC, KDE, Bayesian optimization) rather than any quantity defined in terms of itself or a fitted input renamed as a prediction. No self-citation chains, uniqueness theorems, or ansatzes are invoked to force the result, and the method does not reduce the EIG computation to its own inputs by construction. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: MCMC chains converge to the target posterior distribution
- domain assumption: Kernel density estimation provides a sufficiently accurate estimate of the predictive densities for KL divergence computation
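The first assumption can be probed empirically with a multi-chain diagnostic. The sketch below computes the Gelman-Rubin potential scale reduction factor (R-hat) for random-walk Metropolis chains started from overdispersed points. The standard normal target, the starting points, and the step size are hypothetical, and nothing here implies the paper uses this particular diagnostic.

```python
import math
import random

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length scalar chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # w: mean within-chain variance; b_over_n: variance of the chain means.
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    b_over_n = sum((mu - grand) ** 2 for mu in means) / (m - 1)
    var_plus = (n - 1) / n * w + b_over_n
    return math.sqrt(var_plus / w)

def metropolis_chain(log_target, start, n_steps, step, rng):
    """Random-walk Metropolis chain of length n_steps started at start."""
    x, lp, out = start, log_target(start), []
    for _ in range(n_steps):
        prop = x + step * rng.gauss(0.0, 1.0)
        lp_prop = log_target(prop)
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
        out.append(x)
    return out

def rhat_for_toy_target(n_steps, seed=0):
    """R-hat over the second half of four chains on a standard normal target."""
    rng = random.Random(seed)

    def log_target(t):  # hypothetical target density, up to a constant
        return -0.5 * t * t

    starts = [-10.0, -3.0, 3.0, 10.0]  # deliberately overdispersed
    chains = [metropolis_chain(log_target, s, n_steps, 1.0, rng) for s in starts]
    half = n_steps // 2
    return gelman_rubin([c[half:] for c in chains])
```

Values close to 1 are consistent with the chains sampling the same distribution. Values well above 1 indicate that the chains still disagree and that a KDE built on their samples would inherit that disagreement.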
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "nested Monte Carlo estimator for the QoI EIG, featuring Markov chain Monte Carlo for posterior sampling and kernel density estimation for evaluating the posterior-predictive density and its Kullback-Leibler divergence"
- IndisputableMonolith/Foundation/DimensionForcing.lean, alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "GO-OED design is then found by maximizing the EIG over the design space using Bayesian optimization"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- Variational Sequential Optimal Experimental Design using Reinforcement Learning
  vsOED uses a variational one-point reward and RL policy optimization to provide a lower bound on expected information gain for sequential experimental design, supporting nuisance parameters, implicit likelihoods, and ...
- Intelligent data collection for network discrimination in material flow analysis using Bayesian optimal experimental design
  A bias-reduced Bayesian optimal experimental design procedure using Kullback-Leibler divergence is shown to select high-value steel mass-flow observations that reduce network-structure uncertainty in a U.S. steel MFA,...