Frequentist coverage and sup-norm convergence rate in Gaussian process regression
Gaussian process (GP) regression is a powerful interpolation technique due to its flexibility in capturing non-linearity. In this paper, we provide a general framework for understanding the frequentist coverage of point-wise and simultaneous Bayesian credible sets in GP regression. As an intermediate result, we develop a Bernstein-von Mises type result under the supremum norm in random design GP regression. Identifying both the mean and covariance function of the posterior distribution of the Gaussian process as regularized $M$-estimators, we show that the sampling distribution of the posterior mean function and the centered posterior distribution can be respectively approximated by two population-level GPs. By developing a comparison inequality between two GPs, we provide an exact characterization of the frequentist coverage probabilities of Bayesian point-wise credible intervals and simultaneous credible bands of the regression function. Our results show that inference based on GP regression tends to be conservative; when the prior is under-smoothed, the resulting credible intervals and bands have minimax-optimal sizes, with their frequentist coverage converging to a non-degenerate value between their nominal level and one. As a byproduct of our theory, we show that GP regression also yields a minimax-optimal posterior contraction rate relative to the supremum norm, which provides positive evidence on the long-standing problem of the optimal supremum norm contraction rate in GP regression.
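The notion of frequentist coverage of a point-wise credible interval can be made concrete with a small simulation. The sketch below is not the paper's method or experiment: it assumes a zero-mean GP prior with a squared-exponential kernel, a fixed length scale, a known noise level, a uniform random design on [0, 1], and a hypothetical true regression function, all chosen only for illustration. It computes the posterior mean and variance of the regression function at a point, forms the nominal 95% credible interval, and estimates how often that interval contains the true function value across repeated data sets.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel on 1-D inputs (an illustrative assumption)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.01, length_scale=0.2):
    """Posterior mean and point-wise variance of f under a zero-mean GP prior."""
    n = len(x_train)
    K = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(n)
    K_s = rbf_kernel(x_train, x_test, length_scale)
    L = np.linalg.cholesky(K)
    # alpha = (K + noise_var * I)^{-1} y via two triangular-factor solves
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    # Prior variance k(x, x) equals 1 for this kernel
    var = 1.0 - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def pointwise_coverage(f_true, x0, n=50, n_rep=200, noise_sd=0.1, z=1.96, seed=0):
    """Monte Carlo estimate of the frequentist coverage of the nominal
    95% credible interval for f(x0) under a uniform random design."""
    rng = np.random.default_rng(seed)
    x_test = np.array([x0])
    target = f_true(x0)
    hits = 0
    for _ in range(n_rep):
        x = rng.uniform(0.0, 1.0, size=n)
        y = f_true(x) + noise_sd * rng.normal(size=n)
        mean, var = gp_posterior(x, y, x_test, noise_var=noise_sd ** 2)
        half_width = z * np.sqrt(var[0])
        hits += int(abs(mean[0] - target) <= half_width)
    return hits / n_rep

if __name__ == "__main__":
    # Hypothetical true regression function, chosen only for the demo
    f = lambda x: np.sin(2.0 * np.pi * x)
    print("estimated coverage at x0 = 0.5:", pointwise_coverage(f, 0.5))
```

Comparing the estimated coverage against the nominal 0.95 for different length scales gives a rough, informal feel for how prior smoothness affects coverage; it does not reproduce the paper's theoretical characterization.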
Forward citations
Cited by 2 Pith papers
- Optimal Confidence Band for Kernel Gradient Flow Estimator
  Kernel gradient flows attain minimax-optimal sup-norm generalization rates and admit simultaneous confidence bands with near-optimal widths under standard capacity-source conditions.
- A data-driven prediction for the primordial deuterium abundance
  Gaussian process regression on nuclear data predicts 10^5 D/H = 2.442 ± 0.040, 1.70 sigma below observation and consistent with first-principles calculations.