Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

Balaji Lakshminarayanan; Alexander Pritzel; Charles Blundell

arXiv: 1612.01474 · v3 · pith:JZW7YMUAnew · submitted 2016-12-05 · stat.ML · cs.LG

keywords: uncertainty, predictive, Bayesian, estimates, method, deep, demonstrate, examples

Deep neural networks (NNs) are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks. Quantifying predictive uncertainty in NNs is a challenging and yet unsolved problem. Bayesian NNs, which learn a distribution over weights, are currently the state-of-the-art for estimating predictive uncertainty; however, these require significant modifications to the training procedure and are computationally expensive compared to standard (non-Bayesian) NNs. We propose an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates. Through a series of experiments on classification and regression benchmarks, we demonstrate that our method produces well-calibrated uncertainty estimates which are as good as or better than those of approximate Bayesian NNs. To assess robustness to dataset shift, we evaluate the predictive uncertainty on test examples from known and unknown distributions, and show that our method is able to express higher uncertainty on out-of-distribution examples. We demonstrate the scalability of our method by evaluating predictive uncertainty estimates on ImageNet.
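
The abstract does not spell out how the ensemble's predictions are combined. The following is a minimal sketch, assuming each ensemble member outputs class probabilities that are averaged with uniform weights, and assuming predictive entropy as the uncertainty score. It only illustrates how disagreement among members can surface as higher uncertainty. The function names and the probability values are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def ensemble_mean(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average class probabilities across members (uniform weights, an assumption)."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [
        sum(p[k] for p in member_probs) / n_members
        for k in range(n_classes)
    ]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical softmax outputs of three independently trained networks.
# `agree` mimics an input on which all members predict the same class;
# `disagree` mimics an input on which each member prefers a different class.
agree = [[0.90, 0.05, 0.05], [0.88, 0.07, 0.05], [0.92, 0.04, 0.04]]
disagree = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]

h_agree = entropy(ensemble_mean(agree))
h_disagree = entropy(ensemble_mean(disagree))
print(f"entropy when members agree:    {h_agree:.3f}")
print(f"entropy when members disagree: {h_disagree:.3f}")
```

In this toy setup each member is individually confident on the second input, yet the averaged distribution is uniform, so its entropy reaches the maximum for three classes. A single network's confidence score would not reveal that disagreement.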

Forward citations

Cited by 20 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

VISTA: Variance-Gated Inter-Sequence Test-Time Adaptation for Multi-Sequence MRI Segmentation
cs.CV 2026-05 conditional novelty 7.0

VISTA is a source-free TTA framework for multi-sequence MRI segmentation that uses inter-sequence spectral/patch interventions and cross-view variance gating to handle modality-interaction shifts, reporting Dice gains...
Inducing Artificial Uncertainty in Language Models
cs.CL 2026-05 unverdicted novelty 7.0

Inducing artificial uncertainty on trivial tasks allows training probes that achieve higher calibration on hard data than standard approaches while retaining performance on easy data.
DualTCN: A Physics-Constrained Temporal Convolutional Network for Time-Domain Marine CSEM Inversion
cs.LG 2026-05 unverdicted novelty 7.0

DualTCN is the first deep-learning model for time-domain marine CSEM inversion that regresses four earth parameters, achieves high accuracy on simulated data, and runs up to 21,000 times faster than classical optimizers.
SegWithU: Uncertainty as Perturbation Energy for Single-Forward-Pass Risk-Aware Medical Image Segmentation
cs.CV 2026-04 unverdicted novelty 7.0

SegWithU treats uncertainty as perturbation energy via rank-1 probes in a post-hoc head for efficient single-pass risk-aware medical image segmentation, outperforming other single-forward-pass methods on ACDC, BraTS20...
Extraction of the color dipole amplitude with physics-informed neural networks
hep-ph 2026-01 unverdicted novelty 7.0

Physics-informed neural networks extract a model-independent color dipole amplitude from inclusive HERA data that predicts exclusive J/ψ photoproduction cross-sections without parameter retuning.
Empirical Bayes Conformal Prediction for Vision and Language Models
cs.LG 2026-05 unverdicted novelty 6.0

Empirical Bayes conformal prediction converts score variability into r-value nonconformity scores that preserve target coverage while reducing inclusion of high-variance false candidates in image classification, CLIP ...
Multi-Quantile Regression for Extreme Precipitation Downscaling
cs.LG 2026-05 unverdicted novelty 6.0

Q-SRDRN multi-quantile network with pinball loss and per-quantile heads detects extreme precipitation events up to 18 times more effectively than deterministic baselines while preserving augmentation benefits for the median.
COMPASS: A Unified Decision-Intelligence System for Navigating Performance Trade-off in HPC
cs.PF 2026-04 conditional novelty 6.0

COMPASS formalizes HPC configuration questions as ML tasks on traces, quantifies recommendation trustworthiness, and delivers 65.93% lower average job turnaround time plus 80.93% lower node usage versus prior methods ...
AI-assisted modeling and Bayesian inference of unpolarized quark transverse momentum distributions from Drell-Yan data
hep-ph 2026-04 unverdicted novelty 6.0

An AI-assisted Bayesian framework extracts TMD PDFs from global Drell-Yan data using surrogate models for scalable MCMC sampling.
Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification
stat.ML 2026-04 unverdicted novelty 6.0

Ensemble-based method of moments on softmax outputs produces stable Dirichlet predictive distributions that improve uncertainty-guided tasks like selective classification over evidential deep learning.
Cell-induced densification and tether formation in fibrous extracellular matrices with biomimetic physics-informed neural networks
cs.LG 2026-03 unverdicted novelty 6.0

Bio-PINNs with a near-to-far curriculum and deformation-uncertainty proxy recover cell-induced densified phases and tether morphologies more reliably than standard adaptive PINN baselines in single-cell and multicellu...
Ensemble-Based Uncertainty Estimation for Code Correctness Estimation
cs.SE 2026-03 unverdicted novelty 6.0

Ensemble Semantic Entropy improves correlation with code correctness over single-model methods and powers a cascading scaling system that cuts FLOPs by 64.9% while preserving performance on LiveCodeBench.
Bayesian E(3)-Equivariant Interatomic Potential with Iterative Restratification of Many-body Message Passing
cs.LG 2025-10 unverdicted novelty 6.0

Bayesian E(3)-equivariant MLPs with joint energy-force NLL loss achieve competitive accuracy while enabling uncertainty-guided active learning, OOD detection, and calibration.
Language Models (Mostly) Know What They Know
cs.CL 2022-07 unverdicted novelty 6.0

Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.
Testing the Assumptions of Active Learning for Translation Tasks with Few Samples
cs.CL 2026-04 unverdicted novelty 5.0

Informativeness and diversity of samples selected by active learning show no correlation with test performance on translation tasks using few samples; ordering and pre-training effects dominate instead.
Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning
cs.CL 2026-04 unverdicted novelty 5.0

Supervised fine-tuning degrades the correlation between confidence scores and output quality in language models, driven by factors like training distribution similarity rather than true quality.
Expectation and Acoustic Neural Network Representations Enhance Music Identification from Brain Activity
cs.AI 2026-03 unverdicted novelty 5.0

Separating acoustic and expectation ANN representations as teacher targets improves EEG music identification beyond baselines and seed ensembles.
Uncertainty-Calibrated Explainable Artificial Intelligence for Fetal Ultrasound Plane Classification: A Systematic Review
eess.IV 2026-01 unverdicted novelty 5.0

PRISMA 2020 systematic review of 78 studies on fetal ultrasound plane classification paired with explainability or uncertainty, introducing the CALIB-XFUS reporting framework across six domains.
Calibrated Model-Based Deep Reinforcement Learning
cs.LG 2019-06 unverdicted novelty 5.0

Augmenting model-based RL agents with calibrated predictive uncertainties improves planning, sample efficiency, and exploration on continuous control tasks.
Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers
math.OC 2026-04 unverdicted novelty 2.0

A tutorial framing deep learning as a complement to optimization for sequential decision-making under uncertainty, with applications in supply chains, healthcare, and energy.