https://arxiv.org/abs/1506.02557

Kingma, D · 2015 · stat.ML · arXiv 1506.02557

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

We investigate a local reparameterization technique for greatly reducing the variance of stochastic gradients for variational Bayesian inference (SGVB) of a posterior over model parameters, while retaining parallelizability. This local reparameterization translates uncertainty about global parameters into local noise that is independent across datapoints in the minibatch. Such parameterizations can be trivially parallelized and have variance that is inversely proportional to the minibatch size, generally leading to much faster convergence. Additionally, we explore a connection with dropout: Gaussian dropout objectives correspond to SGVB with local reparameterization, a scale-invariant prior and proportionally fixed posterior variance. Our method allows inference of more flexibly parameterized posteriors; specifically, we propose variational dropout, a generalization of Gaussian dropout where the dropout rates are learned, often leading to better models. The method is demonstrated through several experiments.
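
The idea of translating global parameter uncertainty into local, per-datapoint noise can be illustrated with a minimal sketch. It assumes a single fully connected layer with a factorized Gaussian posterior over its weights; the function name, argument names, and the log-variance parameterization are illustrative choices, not taken from this page. Instead of sampling one weight matrix shared by the whole minibatch, the sketch samples the layer's pre-activations directly, with independent noise for every datapoint and output unit.

```python
import numpy as np


def local_reparam_layer(a, w_mean, w_logvar, rng):
    """Sample pre-activations of a linear layer whose weights follow a
    factorized Gaussian N(w_mean, exp(w_logvar)), without sampling weights.

    a        : (batch, n_in) input activations
    w_mean   : (n_in, n_out) posterior means of the weights
    w_logvar : (n_in, n_out) posterior log-variances of the weights
    rng      : numpy.random.Generator
    """
    # Mean of each pre-activation: a linear map of the weight means.
    act_mean = a @ w_mean
    # Variance of each pre-activation: squared inputs times weight variances.
    act_var = (a ** 2) @ np.exp(w_logvar)
    # Noise is drawn per datapoint and per output unit, so it is
    # independent across the rows of the minibatch.
    eps = rng.standard_normal(act_mean.shape)
    return act_mean + np.sqrt(act_var) * eps


rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))          # minibatch of 4 datapoints
w_mean = rng.standard_normal((3, 2))
w_logvar = np.full((3, 2), -2.0)
b = local_reparam_layer(a, w_mean, w_logvar, rng)
print(b.shape)
```

Because a sum of independent Gaussians is Gaussian, each sampled pre-activation has the same marginal distribution it would have if a separate weight matrix were drawn for that datapoint, but the noise needed is of size batch × outputs rather than batch × inputs × outputs.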

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

cs.LG · 2025-02-08 · unverdicted · novelty 6.0

TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datasets over 10K samples.

Low Rank Based Subspace Inference for the Laplace Approximation of Bayesian Neural Networks

cs.LG · 2025-02-04 · unverdicted · novelty 6.0

Derives the optimal low-rank subspace for the Laplace approximation of Bayesian neural networks, provides a scalable, better-performing variant, and introduces a new comparison metric.

Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

stat.ML · 2026-04-07 · unverdicted · novelty 6.0

Ensemble-based method of moments on softmax outputs produces stable Dirichlet predictive distributions that improve uncertainty-guided tasks like selective classification over evidential deep learning.

citing papers explorer

Showing 3 of 3 citing papers.

TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

cs.LG · 2025-02-08 · unverdicted · none · ref 270 · internal anchor

Low Rank Based Subspace Inference for the Laplace Approximation of Bayesian Neural Networks

cs.LG · 2025-02-04 · unverdicted · none · ref 7 · internal anchor

Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

stat.ML · 2026-04-07 · unverdicted · none · ref 4
