Deep Convolutional Networks as shallow Gaussian Processes
read the original abstract
We show that the output of a (residual) convolutional neural network (CNN) with an appropriate prior over the weights and biases is a Gaussian process (GP) in the limit of infinitely many convolutional filters, extending similar results for dense networks. For a CNN, the equivalent kernel can be computed exactly and, unlike "deep kernels", has very few parameters: only the hyperparameters of the original CNN. Further, we show that this kernel has two properties that allow it to be computed efficiently; the cost of evaluating the kernel for a pair of images is similar to a single forward pass through the original CNN with only one filter per layer. The kernel equivalent to a 32-layer ResNet obtains 0.84% classification error on MNIST, a new record for GPs with a comparable number of parameters.
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
-
Neural Operator: Graph Kernel Network for Partial Differential Equations
Graph Kernel Networks learn PDE solution operators that generalize across discretization methods and grid resolutions using graph-based kernel integration.
-
Viability of perturbative expansion for quantum field theories on neurons
The work tests perturbative viability of single-layer neural networks for local QFTs at finite neuron number N in phi^4 theory, finding UV-cutoff-sensitive O(1/N) corrections with weak convergence and proposing a modi...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.