
arxiv: 1206.6873 · v1 · pith:WKLVI4E7new · submitted 2012-06-27 · 💻 cs.LG · stat.ML

Variable noise and dimensionality reduction for sparse Gaussian processes

classification 💻 cs.LG stat.ML
keywords data, space, noise, SPGP, approximation, dimensional, dimensionality, Gaussian

The sparse pseudo-input Gaussian process (SPGP) is a new approximation method for speeding up GP regression in the case of a large number of data points N. The approximation is controlled by the gradient optimization of a small set of M 'pseudo-inputs', thereby reducing complexity from O(N^3) to O(NM^2). One limitation of the SPGP is that this optimization space becomes impractically big for high dimensional data sets. This paper addresses this limitation by performing automatic dimensionality reduction. A projection of the input space to a low dimensional space is learned in a supervised manner, alongside the pseudo-inputs, which now live in this reduced space. The paper also investigates the suitability of the SPGP for modeling data with input-dependent noise. A further extension of the model is made to make it even more powerful in this regard: we learn an uncertainty parameter for each pseudo-input. The combination of sparsity, reduced dimension, and input-dependent noise makes it possible to apply GPs to much larger and more complex data sets than was previously practical. We demonstrate the benefits of these methods on several synthetic and real-world problems.
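
The abstract's two central ideas, a small set of M pseudo-inputs and a learned linear projection into a low-dimensional space, can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names (`sq_exp_kernel`, `spgp_predict`), the unit-amplitude squared-exponential kernel, and the fixed `projection` and `pseudo_inputs` arguments are all assumptions made for illustration. In the paper these quantities are learned by gradient optimization, and the per-pseudo-input uncertainty extension is not included here. The sketch only shows the standard sparse predictive equations, in which the largest matrix ever factorized is M x M.

```python
import numpy as np


def sq_exp_kernel(A, B):
    """Unit-amplitude squared-exponential kernel between the rows of A and B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(d2, 0.0))


def spgp_predict(X, y, X_test, pseudo_inputs, projection, noise_var=0.1, jitter=1e-6):
    """Sparse pseudo-input GP prediction with a fixed linear projection.

    X: (N, D) training inputs, y: (N,) targets, X_test: (T, D) test inputs,
    pseudo_inputs: (M, G) points in the reduced space,
    projection: (G, D) matrix mapping inputs to the reduced space.
    All hyperparameters are held fixed (hypothetical setup, no optimization).
    """
    Z = X @ projection.T            # training inputs in the reduced space, (N, G)
    Z_test = X_test @ projection.T  # test inputs in the reduced space, (T, G)
    M = pseudo_inputs.shape[0]

    K_M = sq_exp_kernel(pseudo_inputs, pseudo_inputs) + jitter * np.eye(M)
    K_NM = sq_exp_kernel(Z, pseudo_inputs)       # (N, M)
    K_TM = sq_exp_kernel(Z_test, pseudo_inputs)  # (T, M)

    # Diagonal of the low-rank approximation K_NM K_M^{-1} K_MN.
    q_diag = np.sum(K_NM * np.linalg.solve(K_M, K_NM.T).T, axis=1)
    # Per-point effective noise: diagonal correction plus observation noise.
    lam = 1.0 - q_diag + noise_var               # (N,)

    scaled = K_NM / lam[:, None]                 # Lambda^{-1} K_NM
    Q_M = K_M + K_NM.T @ scaled                  # (M, M); this product costs O(N M^2)

    mean = K_TM @ np.linalg.solve(Q_M, scaled.T @ y)
    var = (
        1.0
        - np.sum(K_TM * np.linalg.solve(K_M, K_TM.T).T, axis=1)
        + np.sum(K_TM * np.linalg.solve(Q_M, K_TM.T).T, axis=1)
        + noise_var
    )
    return mean, var
```

Because only M x M systems are solved and the dominant product is `K_NM.T @ scaled`, the cost scales as O(NM^2) rather than O(N^3). Optimizing `projection` and `pseudo_inputs` jointly against the marginal likelihood, as the abstract describes, would replace the fixed arguments used in this sketch.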


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. On the Uncertainty Quantification Ability of Tabular Foundation Models

    stat.ML 2026-05 unverdicted novelty 3.0

    Empirical study finds GPs often superior to TabPFN for UQ and accuracy in data-scarce tabular regression, with TabPFN competitive in complex high-dimensional high-data regimes.