Understanding Variational Inference in Function-Space
Recent work has attempted to directly approximate the 'function-space' or predictive posterior distribution of Bayesian models, without approximating the posterior distribution over the parameters. This is appealing in, e.g., Bayesian neural networks, where we only need the former, and the latter is hard to represent. In this work, we highlight some advantages and limitations of employing the Kullback-Leibler divergence in this setting. For example, we show that minimizing the KL divergence between a wide class of parametric distributions and the posterior induced by a (non-degenerate) Gaussian process prior leads to an ill-defined objective function. Then, we propose (featurized) Bayesian linear regression as a benchmark for 'function-space' inference methods that directly measures approximation quality. We apply this methodology to assess aspects of the objective function and inference scheme considered in Sun, Zhang, Shi, and Grosse (2018), emphasizing the quality of approximation to Bayesian inference as opposed to predictive performance.
Forward citations
Cited by 5 Pith papers
-
Gaussian Mean Field Variational Inference can Overestimate Predictive Variance
In conjugate BLR, MFVI overestimates expected predictive variance on in-distribution points relative to the exact posterior, with overestimation aligned to training data directions.
-
Function-Space Priors for Bayesian Neural ODEs with Application to Vessel Trajectory Prediction
A kernel-based regularizer derived from a GP prior on the Neural ODE vector field at finite points is added to the variational objective, paired with multiple shooting to handle long irregular trajectories.
-
Flow-Transformed Implicit Processes for Function-Space Variational Inference
FTIP replaces Gaussian variational distributions over sampled function combination weights with normalizing flows to induce richer, more flexible posterior distributions over functions.
-
Benchmarking and Improving Monitors for Out-Of-Distribution Alignment Failure in LLMs
MOOD benchmark shows guard models fail to generalize to OOD alignment failures in LLMs, but combining them with Mahalanobis and perplexity OOD detectors improves recall from 39% to 45% with better scaling than larger ...
-
Benchmarking and Improving Monitors for Out-Of-Distribution Alignment Failure in LLMs
Introduces MOOD benchmark for OOD LLM alignment failures and shows guard models plus Mahalanobis and perplexity OOD detectors improve recall from 39% to 45% with positive scaling.