arXiv: 1701.05369 · v3 · submitted 2017-01-19 · stat.ML · cs.LG

Variational Dropout Sparsifies Deep Neural Networks

keywords: dropout, variational, effect, networks, number, rates, reduce, times

We explore the recently proposed Variational Dropout technique, which provides an elegant Bayesian interpretation of Gaussian Dropout. We extend Variational Dropout to the case where dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator, and report the first experimental results with individual dropout rates per weight. Interestingly, this leads to extremely sparse solutions in both fully-connected and convolutional layers. The effect is similar to the automatic relevance determination effect in empirical Bayes but has a number of advantages. We reduce the number of parameters by up to 280 times on LeNet architectures and by up to 68 times on VGG-like networks with a negligible decrease in accuracy.
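
To make the idea of individual dropout rates per weight concrete, here is a minimal sketch, not the authors' implementation. It assumes the usual Gaussian Dropout parameterization, in which each weight is multiplied by noise with mean 1 and variance alpha, and it assumes that weights whose log(alpha) exceeds a cutoff can be removed. The cutoff of 3.0, the function names, and the toy values are all hypothetical choices made for illustration; the abstract specifies none of them.

```python
import math
import random

def sample_noisy_weight(theta, log_alpha, rng):
    """Draw w = theta * (1 + sqrt(alpha) * eps), eps ~ N(0, 1).

    This is multiplicative Gaussian noise with mean 1 and variance alpha,
    applied to a single weight with its own dropout rate alpha.
    """
    alpha = math.exp(log_alpha)
    return theta * (1.0 + math.sqrt(alpha) * rng.gauss(0.0, 1.0))

def keep_mask(log_alphas, threshold=3.0):
    """Mark weights to keep. The threshold is an assumed cutoff, not from the abstract."""
    return [la < threshold for la in log_alphas]

def compression_ratio(mask):
    """Total number of weights divided by the number of weights kept."""
    kept = sum(mask)
    return len(mask) / kept if kept else float("inf")

rng = random.Random(0)

# Toy layer of 10 weights: two have low noise, eight have very high noise.
thetas = [0.8, -1.2, 0.05, 0.01, -0.02, 0.03, 0.0, 0.04, -0.01, 0.02]
log_alphas = [-2.0, -1.0] + [5.0] * 8

# Training-time view: every weight is perturbed by its own noise level.
noisy = [sample_noisy_weight(t, la, rng) for t, la in zip(thetas, log_alphas)]

# Test-time view: weights with high dropout rates are set to zero.
mask = keep_mask(log_alphas)
pruned = [t if keep else 0.0 for t, keep in zip(thetas, mask)]
ratio = compression_ratio(mask)
```

In this toy setup two of the ten weights survive, so the ratio is total weights over kept weights. The reduction factors quoted in the abstract are ratios of this kind, though how the paper counts parameters is not stated here.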


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

    stat.ML 2026-04 unverdicted novelty 6.0

    Ensemble-based method of moments on softmax outputs produces stable Dirichlet predictive distributions that improve uncertainty-guided tasks like selective classification over evidential deep learning.