Adaptive Partitioning Design and Analysis for Emulation of a Complex Computer Code
Pith reviewed 2026-05-25 11:16 UTC · model grok-4.3
The pith
An adaptive partitioning emulator places more points in high-variability regions of computer model input spaces to achieve accurate predictions with smaller overall designs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By taking a data-adaptive approach to the development of a design, and choosing to partition the space in the regions of highest variability, we obtain a higher density of points in these regions and hence accurate prediction.
What carries the argument
The adaptive partitioning emulator (APE), which partitions the input space in regions of highest variability to allocate design points adaptively.
If this is right
- Emulators achieve accurate prediction in complex regions without proportional increase in total design size.
- Gaussian process fitting becomes feasible for larger designs by limiting points in low-variability regions.
- Predictive uncertainty measures remain reliable while computational cost grows more slowly than with full designs.
- The method outperforms non-adaptive approaches in scenarios where model behavior varies by input region.
Where Pith is reading between the lines
- The same partitioning logic could be applied to other surrogate models that suffer from cubic scaling.
- Sequential versions of the method might integrate with active learning to refine partitions iteratively.
- High-dimensional input spaces with localized features would likely show the largest gains over uniform designs.
Load-bearing premise
Most computer models are only complex in particular regions of the input space.
What would settle it
A computer model with uniform complexity and variability across the full input space would show no accuracy gain for the adaptive method over a standard space-filling design of equal size.
Figures
read the original abstract
Computer models are used as replacements for physical experiments in a large variety of applications. Nevertheless, direct use of the computer model for the ultimate scientific objective is often limited by the complexity and cost of the model. Historically, Gaussian process regression has proven to be the almost ubiquitous choice for a fast statistical emulator for such a computer model, due to its flexible form and analytical expressions for measures of predictive uncertainty. However, even this statistical emulator can be computationally intractable for large designs, due to computing time increasing with the cube of the design size. Multiple methods have been proposed for addressing this problem. We discuss several of them, and compare their predictive and computational performance in several scenarios. We then propose solving this problem using an adaptive partitioning emulator (APE). The new approach is motivated by the idea that most computer models are only complex in particular regions of the input space. By taking a data-adaptive approach to the development of a design, and choosing to partition the space in the regions of highest variability, we obtain a higher density of points in these regions and hence accurate prediction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an adaptive partitioning emulator (APE) for Gaussian process emulation of complex computer codes. Motivated by the assumption that most computer models are complex only in particular regions of the input space, the approach uses data-adaptive partitioning to place higher point density in regions of highest variability, claiming this yields accurate predictions while addressing the cubic scaling issue of standard GPs. The paper also reviews and compares several existing methods for large designs in terms of predictive and computational performance across scenarios.
Significance. If the central claims hold under empirical validation, APE could provide a practical alternative to existing large-design approximations for GP emulation by exploiting localized complexity, potentially improving efficiency without sacrificing accuracy in critical regions. The review of competing methods adds context, but the significance is limited by the absence of detailed derivations or reproducible results in the provided text.
major comments (2)
- [Abstract] Abstract: the description of APE supplies no equations, partitioning criterion, variability threshold, implementation details, error analysis, or empirical comparisons, preventing verification of the accuracy claim or assessment of whether the adaptive strategy outperforms reviewed alternatives.
- [Abstract] Abstract: the performance advantage is tied to the premise that complexity is concentrated in particular input-space regions allowing higher density there; if this premise fails (uniform complexity or misidentified regions), the method reduces to a non-uniform design with no guaranteed predictive gain, and this load-bearing assumption receives no formal test or counterexample analysis.
minor comments (2)
- [Abstract] The abstract mentions comparing predictive and computational performance 'in several scenarios' but provides no table, figure, or quantitative summary of those comparisons.
- Notation for the emulator and design is not introduced, making it difficult to connect the high-level description to standard GP literature.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. We respond to each major comment below, providing clarifications and indicating where revisions will be made if appropriate.
read point-by-point responses
-
Referee: [Abstract] Abstract: the description of APE supplies no equations, partitioning criterion, variability threshold, implementation details, error analysis, or empirical comparisons, preventing verification of the accuracy claim or assessment of whether the adaptive strategy outperforms reviewed alternatives.
Authors: The abstract provides a concise summary of the APE method and its motivation. The full details, including the partitioning criterion based on variability, the threshold used, implementation, error analysis, and empirical comparisons with other methods, are presented in the main body of the paper, specifically in Sections 2-5. This is standard for abstracts, which are limited in length and serve to outline rather than detail the technical aspects. revision: no
-
Referee: [Abstract] Abstract: the performance advantage is tied to the premise that complexity is concentrated in particular input-space regions allowing higher density there; if this premise fails (uniform complexity or misidentified regions), the method reduces to a non-uniform design with no guaranteed predictive gain, and this load-bearing assumption receives no formal test or counterexample analysis.
Authors: We agree that the method's advantage depends on localized complexity. The manuscript includes empirical comparisons across multiple scenarios, some of which test performance under varying degrees of localized vs. uniform complexity. However, we recognize the value in explicitly addressing potential failures of the assumption. In the revision, we will add a brief discussion and possibly a counterexample analysis in the results section to better evaluate the robustness of APE. revision: partial
Circularity Check
No circularity: adaptive partitioning proposal is independent of its own outputs
full rationale
The paper proposes the adaptive partitioning emulator (APE) as a data-adaptive design that partitions in high-variability regions to achieve higher point density and accurate prediction. This is presented as a methodological choice motivated by the assumption of localized model complexity, without any equations, derivations, or self-citations that reduce the claimed performance gain to a quantity fitted or defined by the method itself. No load-bearing steps match the enumerated circularity patterns; the approach is self-contained against external benchmarks and does not rename known results or import uniqueness via author citations.
Axiom & Free-Parameter Ledger
free parameters (1)
- variability threshold or partitioning criterion
axioms (1)
- domain assumption Most computer models are only complex in particular regions of the input space.
Reference graph
Works this paper leans on
-
[1]
Barthelmann, V., Novak, E., & Ritter, K. (2000). High dimensional polynomial interpolation on sparse grids. Advances in Computational Mathematics , 12(4), 273–288
work page 2000
-
[2]
Chen, H. (2018). Design and analysis of computer experiments: Assessing and advancing the state of the art (Unpublished doctoral dissertation). University of British Columbia
work page 2018
-
[3]
Currin, C., Mitchell, T., Morris, M., & Ylvisaker, D. (1991). Bayesian prediction of deter- ministic functions, with applications to the design and analysis of computer experiments. Journal of the American Statistical Association , 86(416), 953–963
work page 1991
-
[4]
Dancik, G. M., & Dorman, K. S. (2008). mlegp: Statistical analysis for computer models of biological systems using R. Bioinformatics, 24(17), 1966–1967
work page 2008
-
[5]
Franke, R. (1979). A critical comparison of some methods for interpolation of scattered data (Tech. Rep.). Naval Postgraduate School Monterey CA
work page 1979
-
[6]
Gramacy, R. B. (2007). tgp: An R package for Bayesian nonstationary, semiparametric nonlin- ear regression and design by treed Gaussian process models. Journal of Statistical Software , 19(9), 1–46
work page 2007
-
[7]
Gramacy, R. B. (2016). laGP: Large-scale spatial modeling via local approximate Gaussian processes in R. Journal of Statistical Software , 72(1), 1–46
work page 2016
-
[8]
Gramacy, R. B., & Apley, D. W. (2015). Local Gaussian process approximation for large computer experiments. Journal of Computational and Graphical Statistics , 24, 561–578
work page 2015
-
[9]
Gramacy, R. B., & Lee, H. K. H. (2008). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association , 103, 1119–1130
work page 2008
-
[10]
Jones, D. R., Schonlau, M., & Welch, W. J. (1998). Efficient global optimization of expensive black-box functions. Journal of Global Optimization , 13(4), 455–492
work page 1998
-
[11]
Kaufman, C. (2010). SparseEm: Statistical emulation using sparse correlation structure. Re- trieved Aug. 2, 2017, from https://www.stat.berkeley.edu/~cgk/rcode/index.html
work page 2010
-
[12]
Kaufman, C., Bingham, D., Habib, S., Heitmann, K., & Frieman, J. A. (2011). Efficient emulators of computer experiments using compactly supported correlation functions, with an application to cosmology. Annals of Applied Statistics , 5, 2470–2492. MATLAB. (2017). Version r2017b. Natick, Massachusetts: The MathWorks Inc
work page 2011
-
[13]
McKay, M. D., Beckman, R. J., & Conover, W. J. (1979). A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21(2), 239–245. O’Hagan, A. (1992). Some Bayesian numerical analysis. Bayesian Statistics , 4, 345–363
work page 1979
-
[14]
Plumlee, M. (2014b). Sparse Grid Designs: MATLAB package, version 1.5.0.0. Retrieved Oct. 2016, from https://www.mathworks.com/matlabcentral/fileexchange/45668-sparse -grid-designs. R Core Team. (2017). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from https://www.R-project.org/
work page 2016
-
[15]
Sacks, J., Welch, W. J., Mitchell, T. J., & Wynn, H. P. (1989). Design and analysis of computer experiments. Statistical Science, 4, 409–423
work page 1989
-
[16]
Stein, M. L., Chi, Z., & Welty, L. J. (2004). Approximating likelihoods for large spatial data sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology) , 66(2), 275–296
work page 2004
-
[17]
Surjanovic, S., & Bingham, D. (2013). Virtual library of simulation experiments: Test functions and datasets. Retrieved from http://www.sfu.ca/~ssurjano. 20
work page 2013
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.