Bayesian Hypernetworks

David Krueger , Chin-Wei Huang , Riashat Islam , Ryan Turner , Alexandre Lacoste , Aaron Courville

Authors on Pith no claims yet

classification 📊 stat.ML cs.AIcs.LG

keywords bayesiannetworkneuralapproximatedistributionepsilonhypernetshypernetworks

read the original abstract

We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork $\h$ is a neural network which learns to transform a simple noise distribution, $p(\vec\epsilon) = \N(\vec 0,\mat I)$, to a distribution $q(\pp) := q(h(\vec\epsilon))$ over the parameters $\pp$ of another neural network (the "primary network")\@. We train $q$ with variational inference, using an invertible $\h$ to enable efficient estimation of the variational lower bound on the posterior $p(\pp | \D)$ via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap iid sampling of~$q(\pp)$. In practice, Bayesian hypernets can provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks which evaluate model uncertainty, including regularization, active learning, and anomaly detection.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Instance-Adaptive Parametrization for Amortized Variational Inference
cs.LG 2026-04 unverdicted novelty 7.0

IA-VAE augments amortized variational inference with hypernetwork-generated instance-adaptive modulations, strictly containing the standard variational family and improving held-out ELBO on synthetic and image data.
Possibilistic Predictive Uncertainty for Deep Learning
cs.LG 2026-05 unverdicted novelty 6.0

DAPPr introduces a possibilistic framework that projects parameter posteriors to predictions via supremum and approximates them with Dirichlet possibility functions to yield efficient, closed-form epistemic uncertaint...
HyperFitS -- Hypernetwork Fitting Spectra for metabolic quantification of ${}^1$H MR spectroscopic imaging
cs.LG 2026-04 unverdicted novelty 6.0

HyperFitS is a hypernetwork for configurable spectral fitting in 1H MRSI that matches conventional LCModel results while processing whole-brain data in seconds instead of hours and adapting to varied protocols without...
Lost in the Tower of Babel: The Adverse Effects of Incidental Multilingualism in LLMs
cs.CL 2026-05 unverdicted novelty 5.0

Incidental multilingualism from uneven web training makes LLMs unequal, brittle, and opaque across languages.