
arxiv: 1906.10737 · v1 · pith:FHKSBT62new · submitted 2019-06-25 · 📊 stat.ME

Prediction Using a Bayesian Heteroscedastic Composite Gaussian Process

Pith reviewed 2026-05-25 16:14 UTC · model grok-4.3

classification 📊 stat.ME
keywords Gaussian process · composite model · heteroscedasticity · Bayesian prediction · non-stationary · MCMC · prediction intervals · variance modeling

The pith

A Bayesian extension to the composite Gaussian process adds an input-dependent variance term to predict both stationary and non-stationary responses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds a Bayesian model that replaces the usual regression term in a Gaussian process setup with a global GP for large-scale trends. An independent local GP captures finer deviations, while a separate process lets the variance of the response change with the input values. A prior is chosen so the global trend stays smoother than the local part, and the covariance is modified to let the two components receive different weights. MCMC sampling produces posterior estimates of all parameters and of the variance at both training and new points, which are then used to form predictions and intervals. The approach is shown to apply whether the underlying response is stationary or not.

Core claim

The model Y(x) extends the composite Gaussian process by including a heteroscedastic component whose variance depends on the inputs. Large-scale trends are estimated by one Gaussian process and local trends by an independent second process. A prior is introduced that keeps the fitted global mean smoother than the local deviations, and the covariance structure is extended so the global and local components can be weighted differently. Markov chain Monte Carlo sampling yields the full posterior, from which predictions and prediction intervals are obtained for both stationary and non-stationary responses.

What carries the argument

The Bayesian heteroscedastic composite Gaussian process, which combines a global trend GP, an independent local deviation GP, and an input-dependent variance process under a smoothness-enforcing prior and differentially weighted covariance.
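As a rough illustration of how these three ingredients can combine, the sketch below draws one realization from a zero-mean process with covariance sigma(x_i) sigma(x_j) [w G + (1 - w) L], where G and L are Gaussian correlations. This covariance form, the names `gauss_corr`, `composite_cov` and `illustrative_sigma2`, the fixed weight `w`, the length scales, and the variance function are all assumptions made for illustration. This page does not state the paper's exact covariance or prior, and the hard `ls_global > ls_local` check is only a stand-in for the smoothness-enforcing prior.

```python
import numpy as np

def gauss_corr(x1, x2, length_scale):
    """Gaussian (squared-exponential) correlation between two 1-D input arrays."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-(d / length_scale) ** 2)

def composite_cov(x, w, ls_global, ls_local, sigma2):
    """Assumed composite covariance: sigma(x_i) sigma(x_j) [w G + (1 - w) L]."""
    # Stand-in for the smoothness prior: the global part must vary more slowly.
    if not ls_global > ls_local:
        raise ValueError("global length scale must exceed local length scale")
    G = gauss_corr(x, x, ls_global)   # large-scale (global) correlation
    L = gauss_corr(x, x, ls_local)    # small-scale (local) correlation
    s = np.sqrt(sigma2(x))            # input-dependent standard deviation
    return np.outer(s, s) * (w * G + (1.0 - w) * L)

def illustrative_sigma2(x):
    """Hypothetical variance function: larger variance for small x."""
    return 0.1 + 2.0 * (1.0 - x) ** 2

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
K = composite_cov(x, w=0.7, ls_global=0.5, ls_local=0.05,
                  sigma2=illustrative_sigma2)
# Small jitter on the diagonal for numerical stability before sampling.
y = rng.multivariate_normal(np.zeros(x.size), K + 1e-6 * np.eye(x.size))
```

Because both correlation matrices have a unit diagonal and the weights sum to one, the diagonal of `K` equals the variance function, so in this sketch `w` changes only how the variance is split between the global and local parts.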

If this is right

  • The model produces predictions and uncertainty intervals for both stationary and non-stationary responses.
  • Posterior samples give estimates of the heteroscedastic variance at every training and test location.
  • Differential weighting of the global and local components is available through the extended covariance.
  • Markov chain Monte Carlo supplies the full posterior over all model parameters.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same structure could be tested on spatial data sets where measurement error visibly changes across the domain.
  • Allowing the local process to carry its own length-scale parameters might further increase flexibility without changing the overall three-process layout.
  • Direct comparison of out-of-sample interval coverage on simulated data with known input-dependent variance would quantify the benefit of the third process.
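The coverage comparison in the last bullet could be set up along the following lines. This is a minimal sketch, not the paper's evaluation: `interval_coverage` is a hypothetical helper, and the two sets of "posterior predictive draws" are synthetic stand-ins (one with the correct input-dependent spread, one with a single pooled spread) rather than output from any fitted model.

```python
import numpy as np

def interval_coverage(y_true, draws, level=0.95):
    """Empirical coverage and mean width of equal-tailed predictive intervals.

    draws has shape (n_draws, n_test): predictive samples at each test input.
    """
    alpha = 1.0 - level
    lower = np.quantile(draws, alpha / 2.0, axis=0)
    upper = np.quantile(draws, 1.0 - alpha / 2.0, axis=0)
    inside = (y_true >= lower) & (y_true <= upper)
    return inside.mean(), (upper - lower).mean()

rng = np.random.default_rng(1)
x_test = np.linspace(0.0, 1.0, 200)
sd_true = 0.1 + 1.0 * x_test            # known input-dependent noise level
y_test = rng.normal(0.0, sd_true)       # simulated held-out responses

# Synthetic stand-ins for predictive draws from two competing models.
draws_het = rng.normal(0.0, sd_true, size=(4000, x_test.size))
pooled_sd = np.sqrt(np.mean(sd_true ** 2))
draws_hom = rng.normal(0.0, pooled_sd, size=(4000, x_test.size))

cov_het, width_het = interval_coverage(y_test, draws_het)
cov_hom, width_hom = interval_coverage(y_test, draws_hom)
```

Overall coverage alone can hide the effect, since a pooled variance may over-cover in quiet regions and under-cover in noisy ones; applying the same helper to subsets of `x_test` would show that pattern.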

Load-bearing premise

The prior that forces the global mean to be smoother than the local deviations is appropriate for the data at hand.

What would settle it

On data generated from a process whose global trend is rougher than its local deviations, the model would either fail to separate the components or produce worse predictions than a standard stationary Gaussian process.

Figures

Figures reproduced from arXiv: 1906.10737 by Casey B. Davis, Christopher M. Hans, Thomas J. Santner.

Figure 1: Kriging predictors (red lines) for the BJX function (black lines) given in equation (2) based on the training data shown as black points together with 95% prediction intervals. Left Panel: constant mean; Right Panel: cubic mean.
Figure 2: Predictions (in red) of the BJX test function y(x) in (2) and associated 95% uncertainty intervals (as a gray shadow) based on the CGP model. The dashed blue line is the estimate of the global component YG(x) under the CGP model.
Figure 3: Prediction and 95% uncertainty bounds for the BJX function, y(x), in Example 4.1 (solid black): the BCGP predictor of y(x) (solid blue); 95% UQ limits of y(x) (dashed blue); estimated posterior mean of the YG(x) process (solid green); estimated posterior mean of the YL(x) process (solid magenta).
Figure 4: Marginal plots of state rate of heat transfer versus x1, x2, x3, x4 for Example 4.2.
Figure 5: Boxplots of the posterior draws of all BCGP model parameters for Example 4.2.
Figure 6: Predicted versus simulated values for the 24 steady state heat exchange inputs of Example 4.2.
Figure 7: Predicted wing weight versus calculated wing weight for 150 test inputs based on 50 training inputs from a maximin LHD.
Figure 8: Boxplots of the predicted global trend function for the wing weight function, ybG(x), based on grouped x8 and x4 values for 150 test inputs.
Original abstract

This research proposes a flexible Bayesian extension of the composite Gaussian process (CGP) model of Ba and Joseph (2012) for predicting (stationary or) non-stationary $y(\mathbf{x})$. The CGP generalizes the regression plus stationary Gaussian process (GP) model by replacing the regression term with a GP. The new model, $Y(\mathbf{x})$, can accommodate large-scale trends estimated by a global GP, local trends estimated by an independent local GP, and a third process to describe heteroscedastic data in which $Var(Y(\mathbf{x}))$ can depend on the inputs. This paper proposes a prior which ensures that the fitted global mean is smoother than the local deviations, and extends the covariance structure of the CGP to allow for differentially-weighted global and local components. A Markov chain Monte Carlo algorithm is proposed to provide posterior estimates of the parameters, including the values of the heteroscedastic variance at the training and test data locations. The posterior distribution is used to make predictions and to quantify the uncertainty of the predictions using prediction intervals. The method is illustrated using both stationary and non-stationary $y(\mathbf{x})$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript proposes a Bayesian extension of the composite Gaussian process (CGP) model from Ba and Joseph (2012) for predicting stationary or non-stationary responses y(x). The model Y(x) combines a global GP for large-scale trends, an independent local GP for local deviations, and a third process for input-dependent heteroscedastic variance. It introduces a prior ensuring the fitted global mean is smoother than local deviations, extends the covariance structure to allow differentially-weighted global and local components, and uses an MCMC algorithm to obtain posterior estimates of parameters (including heteroscedastic variance at training and test points) for prediction and interval construction. The approach is illustrated on both stationary and non-stationary functions.

Significance. If the prior construction and MCMC procedure achieve the claimed separation of scales without identifiability problems, the model offers a practical Bayesian framework for non-stationary heteroscedastic prediction with uncertainty quantification. The explicit prior for smoothness ordering and the covariance extension for differential weighting are strengths, as is the provision of an MCMC algorithm for full posterior inference rather than point estimates alone.

minor comments (3)
  1. [Abstract] The claim that the prior 'ensures' the global mean is smoother than local deviations should be cross-referenced to the specific prior definition (likely in the model section) so readers can verify the mechanism.
  2. The manuscript should include a brief discussion of MCMC convergence diagnostics or mixing behavior for the heteroscedastic variance process parameters, as these are central to the prediction intervals.
  3. Notation for the three processes and their covariance kernels should be introduced with a single consistent table or diagram early in the methods to aid readability.
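For minor comment 2, one common check is the Gelman-Rubin potential scale reduction factor computed from several independent chains. The sketch below is a generic version of that statistic, not a diagnostic taken from the manuscript; `gelman_rubin` is a hypothetical helper and the chains are synthetic draws standing in for MCMC output of a single scalar parameter, such as one variance-process value.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic potential scale reduction factor for one scalar parameter.

    chains has shape (m, n): m independent chains, n post-burn-in draws each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)            # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(2)
# Synthetic chains that agree with each other.
mixed = rng.normal(0.0, 1.0, size=(4, 2000))
# Synthetic chains stuck around different values (offsets 0, 1, 2, 3).
stuck = mixed + np.arange(4.0)[:, None]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values close to 1 are consistent with the chains sampling the same distribution, while values well above 1 suggest that at least one chain has not converged. In practice the statistic would be computed for every variance-process parameter that feeds the prediction intervals.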

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their detailed summary of our work on the Bayesian heteroscedastic composite Gaussian process and for the positive assessment of its significance. The recommendation of minor revision is appreciated. However, the report lists no specific major comments under the MAJOR COMMENTS section.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper extends the external CGP model of Ba and Joseph (2012) by proposing a new prior for smoothness ordering between global and local GPs, extending the covariance to allow differential weighting, and adding a third heteroscedastic process. Posterior inference uses a standard MCMC algorithm whose outputs (parameter estimates and prediction intervals) are generated from the joint posterior rather than being algebraically identical to any fitted input. No derivation step reduces a claimed prediction to a fitted quantity by construction, no uniqueness theorem is imported from self-citation, and the model construction is presented as an independent modeling choice whose validity can be assessed against external data. The derivation chain is therefore self-contained.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The model rests on a custom prior for smoothness ordering and an extended covariance kernel; both are introduced without external benchmarks or machine-checked proofs. MCMC fitting implies multiple hyperparameters whose values are not fixed a priori.

free parameters (2)
  • global and local GP length-scale and variance hyperparameters
    Standard GP parameters estimated via MCMC; their specific values are not stated in the abstract.
  • heteroscedastic variance process parameters
    Input-dependent variance parameters introduced as part of the third process and sampled by MCMC.
axioms (1)
  • [domain assumption] A prior exists that enforces the global mean to be smoother than local deviations
    Explicitly proposed in the abstract as a modeling choice.

pith-pipeline@v0.9.0 · 5735 in / 1164 out tokens · 33790 ms · 2026-05-25T16:14:16.769774+00:00 · methodology


Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

  1. [1] Ba, S. and Joseph, V. R. (2012). Composite Gaussian process models for emulating expensive functions. Annals of Applied Statistics, 6(4), 1838-1860.
  2. [2] Ba, S. and Joseph, V. R. (2018). CGP: Composite Gaussian Process Models. R package version 2.1-1.
  3. [3] Banerjee, S., Carlin, B. P., and Gelfand, A. E. (2004). Hierarchical Modeling and Analysis for Spatial Data. Chapman and Hall, New York.
  4. [4] Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1984). Classification and Regression Trees. Chapman & Hall, New York.
  5. [5] Chipman, H. A., George, E. I., and McCulloch, R. E. (1998). Bayesian CART model search. Journal of the American Statistical Association, 93(443), 935-960.
  6. [6] Cressie, N. A. (1993). Statistics for Spatial Data. J. Wiley, New York, first edition.
  7. [7] Davis, C. B. (2015). A Bayesian approach to prediction and variable selection using nonstationary Gaussian processes. Ph.D. thesis, The Ohio State University.
  8. [8] Forrester, A., Sobester, A., and Keane, A. (2008). Engineering design via surrogate modelling: A practical guide. Wiley, Chichester, UK.
  9. [9] Gattiker, J. R. (2008). Gaussian Process models for simulation analysis (GPM/SA) command, function, and data structure reference. Technical Report LA-UR-08-08057, Los Alamos National Laboratory.
  10. [10] Gelman, A., Roberts, G., and Gilks, W. (1996). Efficient Metropolis jumping rules. In J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, editors, Bayesian Statistics 5: Proceedings of the Fifth Valencia International Meeting, pages 599-608. Oxford University Press, Oxford.
  11. [11] Gramacy, R. B. and Lee, H. K. H. (2008). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association, 103(483), 1119-1130.
  12. [12] Gramacy, R. B. and Lee, H. K. H. (2012). Cases for the nugget in modeling computer experiments. Statistics and Computing, 22(3), 713-722.
  13. [13] Gu, M., Wang, X., and Berger, J. O. (2018). Robust Gaussian stochastic process emulation. Annals of Statistics, 46, 3038-306
  14. [14] Higdon, D., Kennedy, M., Cavendish, J., Cafeo, J., and Ryne, R. (2004). Combining field data and computer simulations for calibration and prediction. SIAM Journal of Scientific Computing, 26, 448-466.
  15. [15] Higdon, D., Gattiker, J., Williams, B., and Rightley, M. (2008). Computer model calibration using high dimensional output. Journal of the American Statistical Association, 103, 570-583.
  16. [16] Kennedy, M. and O'Hagan, A. (2001). Bayesian calibration of computer models (with discussion). Journal of the Royal Statistical Society Series B, 63, 425-464.
  17. [17] Lempert, R., Schlensinger, M. E., Bankes, S., and Andronova, N. (2000). The impacts of climate variability on near-term policy choices and the value of information. Climate Change, 45, 129-161.
  18. [18] Neal, R. (1998). Regression and classification using Gaussian process priors (with discussion). In J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, editors, Bayesian Statistics 6: Proceedings of the Sixth Valencia International Meeting, pages 475-501. Oxford University Press, Oxford.
  19. [19] Oakley, J. (2002). Eliciting Gaussian process priors for complex computer codes. Journal of the Royal Statistical Society, Series D, 51(1), 81-97.
  20. [20] Oakley, J. and O'Hagan, A. (2004). Probabilistic sensitivity analysis of complex models: A Bayesian approach. Journal of the Royal Statistical Society Series B, 66, 751-769.
  21. [21] O'Hagan, A. (1978). Curve fitting and optimal design for prediction (with discussion). Journal of the Royal Statistical Society B, 40, 1-42.
  22. [22] Ong, K., Santner, T., and Bartel, D. (2008). Robust design for acetabular cup stability accounting for patient and surgical variability. Journal of Biomechanical Engineering, 130, 1-11.
  23. [23] Qian, P. Z., Seepersad, C. C., Joseph, V. R., Allen, J. K., and Wu, C. F. J. (2006). Building surrogate models with details and approximate simulations. ASME Journal of Mechanical Design, 128, 668-677.
  24. [24] Roberts, G. O. and Rosenthal, J. S. (2001). Optimal scaling for various Metropolis-Hastings algorithms. Statistical Science, 16(4), 351-367.
  25. [25] Roberts, G. O., Gelman, A., and Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. The Annals of Applied Probability, 7(1), 110-120.
  26. [26] Sacks, J., Welch, W., Mitchell, T., and Wynn, H. (1989). Design and analysis of computer experiments. Statistical Science, 4(4), 409-435.
  27. [27] Santner, T. J., Williams, B. J., and Notz, W. I. (2018). The Design and Analysis of Computer Experiments, Second Edition. Springer Verlag, New York.
  28. [28] Villarreal-Marroquín, M. G., Chen, P.-H., Mulyana, R., Santner, T. J., Dean, A. M., and Castro, J. M. (2017). Multiobjective optimization of injection molding using a calibrated predictor based on physical and simulated data. Polymer Engineering & Science, 57(3), 248-257.
  29. [29] Xiong, Y., Chen, W., Apley, D. W., and Ding, X. (2007). A non-stationary covariance-based kriging method for metamodelling in engineering design. International Journal for Numerical Methods in Engineering, 71, 733-756.