Posterior Variance Analysis of Gaussian Processes with Application to Average Learning Curves
The posterior variance of Gaussian processes is a valuable measure of the learning error, which is exploited in various applications such as safe reinforcement learning and control design. However, a suitable analysis of the posterior variance that captures its behavior for both finite and infinite numbers of training samples is missing. This paper derives a novel bound for the posterior variance function that requires only local information, because it depends only on the number of training samples in the proximity of a considered test point. Furthermore, we prove sufficient conditions which ensure the convergence of the posterior variance to zero. Finally, we demonstrate that the extension of our bound to an average learning bound outperforms existing approaches.
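To make the local dependence described in the abstract concrete, the following is a minimal sketch that computes the standard Gaussian process posterior variance at one test point as the number of nearby training samples grows. It does not implement the bound derived in the paper. The squared-exponential kernel, the hyperparameters (`lengthscale`, `signal_var`, `noise_var`), the sampling interval, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel matrix between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / lengthscale) ** 2)

def posterior_variance(x_train, x_test, noise_var=0.1):
    """Standard GP posterior variance at x_test given noisy observations at x_train."""
    K = se_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_star = se_kernel(x_train, x_test)  # shape (n_train, n_test)
    prior = np.diag(se_kernel(x_test, x_test))
    # sigma^2(x) = k(x, x) - k_*^T (K + noise_var * I)^{-1} k_*
    return prior - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)

# Hypothetical setup: training inputs drawn close to the test point x = 0.
x_test = np.array([0.0])
rng = np.random.default_rng(0)
variances = []
for n in (1, 10, 100):
    x_near = rng.uniform(-0.1, 0.1, size=n)
    variances.append(posterior_variance(x_near, x_test)[0])
    print(f"n = {n:3d} nearby samples, posterior variance = {variances[-1]:.5f}")
```

In this setup the variance at the test point shrinks as more samples fall in its neighborhood, which is the qualitative behavior the abstract attributes to the posterior variance.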
Forward citations
Cited by 2 Pith papers
- PolicyGuard: Towards Test-time and Step-level Adversary (Backdoor) Defense for Reinforcement Learning Agent
  PolicyGuard provides a test-time step-level defense against backdoor attacks in RL using GP posterior variance, showing high detection AUROC on seven games.
- Estimating Mixture Distributions via Stochastic Mirror Descent
  Proposes stochastic mirror descent estimators for mixture models that scale to many components, avoid strict support bounds for discrete cases, and achieve near-optimal KL and l2 rates under mild conditions.