Bayesian Surrogate Training on Multiple Data Sources: A Hybrid Modeling Strategy
Pith reviewed 2026-05-23 06:58 UTC · model grok-4.3
The pith
Two hybrid Bayesian methods integrate simulation and real-world data during surrogate model training via a novel weighting strategy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper proposes two probabilistic hybrid modeling strategies for training surrogates on multiple data sources. The first trains a separate surrogate on each data source and combines their predictive distributions; the second trains a single surrogate on both sources. A novel weighting strategy combines the heterogeneous data independently of the surrogate family. Synthetic and real-world case studies are reported to show improved predictive accuracy, better predictive coverage, and the ability to diagnose problems in the simulation model.
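To make the first strategy concrete, here is a minimal sketch of one way to combine the predictive distributions of two separately trained surrogates: a two-component mixture over posterior predictive draws. The function name `combine_predictive_draws` and the parameter `weight_real` are hypothetical, and the fixed mixture weight is an assumption chosen for illustration; it is not the paper's weighting strategy, whose equations do not appear on this page.

```python
import numpy as np

def combine_predictive_draws(draws_sim, draws_real, weight_real, rng):
    """Sample from the mixture weight_real * p_real + (1 - weight_real) * p_sim.

    draws_sim, draws_real: arrays of shape (n_draws, n_points) holding
    predictive draws from the simulation-trained and measurement-trained
    surrogates. weight_real is a hypothetical mixture weight in [0, 1].
    """
    if not 0.0 <= weight_real <= 1.0:
        raise ValueError("weight_real must lie in [0, 1]")
    draws_sim = np.asarray(draws_sim, dtype=float)
    draws_real = np.asarray(draws_real, dtype=float)
    if draws_sim.shape != draws_real.shape:
        raise ValueError("both draw arrays must have the same shape")
    # For each draw and prediction point, take the measurement-trained
    # surrogate with probability weight_real, otherwise the simulation-trained one.
    pick_real = rng.random(draws_sim.shape) < weight_real
    return np.where(pick_real, draws_real, draws_sim)

# Toy predictive draws (assumed, not from the paper).
rng = np.random.default_rng(0)
draws_sim = rng.normal(loc=0.0, scale=1.0, size=(4000, 3))
draws_real = rng.normal(loc=2.0, scale=0.5, size=(4000, 3))
mixed = combine_predictive_draws(draws_sim, draws_real, weight_real=0.25, rng=rng)
```

With `weight_real=0` the result is the simulation-only predictive distribution, so the baseline is recovered as a special case.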
What carries the argument
A novel weighting strategy for combining heterogeneous simulation and measurement data during Bayesian surrogate training. It is applied either when combining the predictive distributions of separate surrogates or when training a single surrogate on both data sources.
If this is right
- Hybrid approaches improve predictive accuracy compared to simulation-only training.
- They enhance predictive coverage.
- They allow diagnosis of issues in the simulation model.
- The weighting works across different surrogate families.
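The claim that a weighting can be independent of the surrogate family can be illustrated with a generic power-scaled (tempered) likelihood, where the weight acts only on the log-likelihood contribution of each data source and never on the surrogate itself. This is a sketch under stated assumptions: the Gaussian error model, the names `weighted_log_likelihood`, `weight_real`, `sigma_sim`, and `sigma_real`, and the convex combination of the two terms are all illustrative choices, not the paper's method.

```python
import numpy as np

def gaussian_log_likelihood(y, mean, sigma):
    """Log-density of y under independent Normal(mean, sigma) errors."""
    y = np.asarray(y, dtype=float)
    resid = (y - mean) / sigma
    return float(-0.5 * np.sum(resid**2)
                 - y.size * np.log(sigma * np.sqrt(2.0 * np.pi)))

def weighted_log_likelihood(surrogate, params, sim_data, real_data,
                            weight_real, sigma_sim=0.1, sigma_real=0.1):
    """Power-scaled joint log-likelihood for any callable surrogate(x, params)."""
    x_sim, y_sim = sim_data
    x_real, y_real = real_data
    ll_sim = gaussian_log_likelihood(y_sim, surrogate(x_sim, params), sigma_sim)
    ll_real = gaussian_log_likelihood(y_real, surrogate(x_real, params), sigma_real)
    # The weight scales each source's log-likelihood; the surrogate is a black box.
    return (1.0 - weight_real) * ll_sim + weight_real * ll_real

# Two hypothetical surrogate families sharing the same weighting code.
def linear_surrogate(x, params):
    return params[0] + params[1] * x

def quadratic_surrogate(x, params):
    return params[0] + params[1] * x + params[2] * x**2

x = np.linspace(0.0, 1.0, 5)
sim_data = (x, 1.0 + 2.0 * x)    # toy "simulator" outputs
real_data = (x, 1.2 + 2.0 * x)   # toy "measurements" with a constant offset

ll_linear = weighted_log_likelihood(
    linear_surrogate, np.array([1.0, 2.0]), sim_data, real_data, weight_real=0.5)
ll_quadratic = weighted_log_likelihood(
    quadratic_surrogate, np.array([1.0, 2.0, 0.0]), sim_data, real_data, weight_real=0.5)
```

The returned value can be added to a log-prior and passed to any sampler or optimizer; swapping the surrogate family changes only the callable passed in.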
Where Pith is reading between the lines
- The methods could guide refinement of simulation models using measurement discrepancies.
- Applications may extend to fields with abundant measurements but imperfect simulations, such as environmental modeling.
- Further work might test the weighting strategy with non-Bayesian surrogates.
Load-bearing premise
Real-world measurement data contain usable hints about misspecifications or missing processes in the simulation model, and these hints can be leveraged during surrogate training without introducing new biases that outweigh the benefits.
What would settle it
Observing no improvement in predictive accuracy or coverage when using the hybrid methods versus standard simulation-only surrogates on held-out real-world data would falsify the performance claims.
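The test described above can be sketched as two held-out metrics computed from predictive draws: root-mean-square error of the predictive mean, and empirical coverage of a central predictive interval. The function names, the 90% level, and the toy arrays are assumptions for illustration; the paper's own metrics are not listed on this page.

```python
import numpy as np

def rmse(y_true, pred_draws):
    """RMSE between held-out observations and the predictive mean.

    pred_draws has shape (n_draws, n_points); y_true has shape (n_points,).
    """
    pred_mean = np.mean(pred_draws, axis=0)
    return float(np.sqrt(np.mean((pred_mean - np.asarray(y_true)) ** 2)))

def interval_coverage(y_true, pred_draws, level=0.9):
    """Fraction of held-out points inside the central predictive interval."""
    alpha = 1.0 - level
    lower = np.quantile(pred_draws, alpha / 2.0, axis=0)
    upper = np.quantile(pred_draws, 1.0 - alpha / 2.0, axis=0)
    y_true = np.asarray(y_true)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))

# Toy check: identical predictive draws at four held-out points.
pred_draws = np.tile(np.linspace(-1.0, 1.0, 201)[:, None], (1, 4))
y_heldout = np.array([0.0, 0.5, 2.0, -3.0])

toy_rmse = rmse(y_heldout, pred_draws)
toy_coverage = interval_coverage(y_heldout, pred_draws, level=0.9)
```

Computing both metrics for the hybrid surrogates and for a simulation-only surrogate on the same held-out measurements gives the comparison the falsification test calls for. A well-calibrated 90% interval should cover close to 90% of held-out points.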
Original abstract
Surrogate models are often used as computationally efficient approximations to complex simulation models, enabling tasks such as solving inverse problems, sensitivity analysis, and probabilistic forward predictions, which would otherwise be computationally infeasible. During training, surrogate parameters are fitted such that the surrogate reproduces the simulation model's outputs as closely as possible. However, the simulation model itself is merely a simplification of the real-world system, often missing relevant processes or suffering from misspecifications, e.g., in inputs or boundary conditions. Hints about these might be captured in real-world measurement data, and yet, we typically ignore those hints during surrogate building. In this paper, we propose two novel probabilistic approaches to integrate simulation data and real-world measurement data during surrogate training. The first method trains separate surrogate models for each data source and combines their predictive distributions, while the second incorporates both data sources by training a single surrogate. Both hybrid modeling approaches employ a novel weighting strategy for combining heterogeneous data sources during surrogate training, which operates independently of the chosen surrogate family. We show the conceptual differences and benefits of the two approaches through both synthetic and real-world case studies. The results demonstrate the potential of these methods to improve predictive accuracy, predictive coverage, and to diagnose problems in the underlying simulation model. These insights can improve system understanding and future model development.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes two hybrid Bayesian surrogate modeling strategies for integrating simulation outputs with real-world measurement data during training. The first trains separate surrogates on each source and combines their predictive distributions; the second trains a single surrogate on both sources. Both rely on a claimed-novel weighting scheme asserted to be independent of the surrogate family. Benefits for accuracy, coverage, and simulation-model diagnosis are illustrated on synthetic and real-world case studies.
Significance. If the weighting scheme proves reproducible and the empirical gains hold under quantitative scrutiny, the work could meaningfully advance surrogate construction in domains where simulators are known to be misspecified, by providing a practical route to incorporate real data without architecture-specific retraining.
major comments (2)
- [§3] §3 (Hybrid Modeling Approaches): the novel weighting strategy is described conceptually but without explicit equations, likelihood terms, or algorithmic pseudocode that would define how the weights are computed from the two data sources or demonstrate independence from the surrogate family. This is load-bearing for the central novelty claim.
- [§4] §4 (Case Studies): the abstract and introduction assert improvements in predictive accuracy and coverage from the case studies, yet the provided description contains no quantitative metrics, baseline comparisons, error bars, or statistical tests. Without these, the empirical support for the claimed benefits cannot be evaluated.
minor comments (1)
- [Abstract] The abstract would be strengthened by including one or two key quantitative results (e.g., RMSE or coverage percentages) from the case studies.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments. We address each of the major comments below.
- Referee: [§3] §3 (Hybrid Modeling Approaches): the novel weighting strategy is described conceptually but without explicit equations, likelihood terms, or algorithmic pseudocode that would define how the weights are computed from the two data sources or demonstrate independence from the surrogate family. This is load-bearing for the central novelty claim.
  Authors: We agree with the referee that providing explicit mathematical details is essential to substantiate the novelty claim. In the revised manuscript, we will expand §3 to include the full equations defining the weighting strategy, the likelihood formulations for integrating the two data sources, and pseudocode for the algorithm. These additions will explicitly demonstrate the independence from the surrogate family.
  Revision: yes
- Referee: [§4] §4 (Case Studies): the abstract and introduction assert improvements in predictive accuracy and coverage from the case studies, yet the provided description contains no quantitative metrics, baseline comparisons, error bars, or statistical tests. Without these, the empirical support for the claimed benefits cannot be evaluated.
  Authors: The referee is correct that the current case study section would benefit from more rigorous quantitative presentation. We will revise §4 to include quantitative metrics such as predictive accuracy measures, coverage probabilities, comparisons to baseline methods, error bars where applicable, and statistical tests to support the asserted improvements.
  Revision: yes
Circularity Check
No significant circularity in derivation chain
The paper proposes two novel hybrid Bayesian surrogate approaches that integrate simulation and real-world measurement data via a weighting strategy claimed to be independent of the surrogate family. No equations, derivations, or self-citations are shown that reduce the claimed improvements in accuracy, coverage, or model diagnosis to fitted quantities defined by the same data or to prior self-referential results. The central claims rest on synthetic and real-world case studies as external validation, making the derivation chain self-contained against external benchmarks with no load-bearing reductions to inputs by construction.