Bayesian Hypernetworks

Aaron Courville; Alexandre Lacoste; Chin-Wei Huang; David Krueger; Riashat Islam; Ryan Turner

arxiv: 1710.04759 · v2 · pith:B7WWZXMYnew · submitted 2017-10-13 · stat.ML · cs.AI · cs.LG

David Krueger, Chin-Wei Huang, Riashat Islam, Ryan Turner, Alexandre Lacoste, Aaron Courville

keywords bayesian · network · neural · approximate · distribution · epsilon · hypernets · hypernetworks

We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork $h$ is a neural network which learns to transform a simple noise distribution, $p(\boldsymbol{\epsilon}) = \mathcal{N}(\mathbf{0}, \mathbf{I})$, to a distribution $q(\theta) := q(h(\boldsymbol{\epsilon}))$ over the parameters $\theta$ of another neural network (the "primary network"). We train $q$ with variational inference, using an invertible $h$ to enable efficient estimation of the variational lower bound on the posterior $p(\theta \mid \mathcal{D})$ via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap i.i.d. sampling of $q(\theta)$. In practice, Bayesian hypernets can provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks which evaluate model uncertainty, including regularization, active learning, and anomaly detection.
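The sampling-based estimate described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the hypernetwork here is a single elementwise affine map (the simplest invertible transform, swapped in for a deeper invertible network), the primary network is a toy linear model, and every name (`hypernet`, `log_q`, `log_joint`, `elbo_estimate`, `noise_std`) is hypothetical. Because $h$ is invertible, the density of a sample follows from the change of variables $\log q(\theta) = \log p(\boldsymbol{\epsilon}) - \log\left|\det \partial h / \partial \boldsymbol{\epsilon}\right|$, which is what makes the lower bound cheap to estimate by Monte Carlo.

```python
import numpy as np

def hypernet(eps, log_scale, shift):
    """Invertible map h: eps -> theta. An elementwise affine transform is
    assumed here as a stand-in for a deeper invertible network."""
    return eps * np.exp(log_scale) + shift

def log_q(eps, log_scale):
    """log q(theta) for theta = h(eps), via change of variables:
    log q(theta) = log p(eps) - log|det dh/deps|."""
    d = eps.shape[-1]
    log_p_eps = -0.5 * np.sum(eps ** 2, axis=-1) - 0.5 * d * np.log(2 * np.pi)
    log_det = np.sum(log_scale)  # Jacobian of the affine map is diag(exp(log_scale))
    return log_p_eps - log_det

def log_joint(theta, x, y, noise_std=0.1):
    """log p(y | x, theta) + log p(theta) for a toy linear primary network
    y = x @ theta with Gaussian noise and a standard normal prior (assumed)."""
    n, d = x.shape
    resid = y - theta @ x.T  # shape (num_samples, n)
    log_lik = (-0.5 * np.sum(resid ** 2, axis=-1) / noise_std ** 2
               - n * np.log(noise_std) - 0.5 * n * np.log(2 * np.pi))
    log_prior = -0.5 * np.sum(theta ** 2, axis=-1) - 0.5 * d * np.log(2 * np.pi)
    return log_lik + log_prior

def elbo_estimate(log_scale, shift, x, y, num_samples, rng):
    """Monte Carlo estimate of E_q[log p(D, theta) - log q(theta)]."""
    eps = rng.standard_normal((num_samples, shift.shape[0]))
    theta = hypernet(eps, log_scale, shift)  # i.i.d. samples from q
    return np.mean(log_joint(theta, x, y) - log_q(eps, log_scale))
```

With a diagonal affine map, $q$ is just a factorized Gaussian, so this sketch shows only the mechanics of the estimator; the multimodal, correlated posteriors mentioned in the abstract would need a more expressive invertible $h$.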

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Instance-Adaptive Parametrization for Amortized Variational Inference
cs.LG · 2026-04 · unverdicted · novelty 7.0

IA-VAE augments amortized variational inference with hypernetwork-generated instance-adaptive modulations, strictly containing the standard variational family and improving held-out ELBO on synthetic and image data.

Possibilistic Predictive Uncertainty for Deep Learning
cs.LG · 2026-05 · unverdicted · novelty 6.0

DAPPr introduces a possibilistic framework that projects parameter posteriors to predictions via supremum and approximates them with Dirichlet possibility functions to yield efficient, closed-form epistemic uncertaint...

HyperFitS -- Hypernetwork Fitting Spectra for metabolic quantification of ${}^1$H MR spectroscopic imaging
cs.LG · 2026-04 · unverdicted · novelty 6.0

HyperFitS is a hypernetwork for configurable spectral fitting in 1H MRSI that matches conventional LCModel results while processing whole-brain data in seconds instead of hours and adapting to varied protocols without...

U-FaceBP: Uncertainty-aware Bayesian Ensemble Deep Learning for Face Video-based Blood Pressure Estimation
cs.CV · 2024-12 · unverdicted · novelty 6.0

U-FaceBP combines multiple Bayesian neural networks in an ensemble to estimate blood pressure from face video modalities while quantifying uncertainty, showing improved performance on datasets with 1197 diverse subjects.

Lost in the Tower of Babel: The Adverse Effects of Incidental Multilingualism in LLMs
cs.CL · 2026-05 · unverdicted · novelty 5.0

Incidental multilingualism from uneven web training makes LLMs unequal, brittle, and opaque across languages.