Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Original abstract:
Deep neural networks (NNs) are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks. Quantifying predictive uncertainty in NNs is a challenging and yet unsolved problem. Bayesian NNs, which learn a distribution over weights, are currently the state-of-the-art for estimating predictive uncertainty; however, these require significant modifications to the training procedure and are computationally expensive compared to standard (non-Bayesian) NNs. We propose an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates. Through a series of experiments on classification and regression benchmarks, we demonstrate that our method produces well-calibrated uncertainty estimates which are as good or better than approximate Bayesian NNs. To assess robustness to dataset shift, we evaluate the predictive uncertainty on test examples from known and unknown distributions, and show that our method is able to express higher uncertainty on out-of-distribution examples. We demonstrate the scalability of our method by evaluating predictive uncertainty estimates on ImageNet.
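The core recipe the abstract describes — train several networks independently, then average their predictive distributions and read uncertainty off the averaged prediction — can be sketched in a few lines. This is a minimal illustration of the aggregation step only (the full method in the paper also uses proper scoring rules and adversarial training); the member probabilities below are hypothetical model outputs, not results from the paper.

```python
import numpy as np

def ensemble_predict(member_probs):
    """Average per-member class probabilities (deep-ensemble aggregation)."""
    return np.mean(member_probs, axis=0)

def predictive_entropy(probs):
    """Entropy of the averaged predictive distribution, used as an uncertainty score."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=-1)

# Hypothetical softmax outputs of M=3 independently trained binary classifiers.
# On the first input the members agree; on the second they disagree,
# as they tend to do on out-of-distribution examples.
agree = np.array([[0.90, 0.10], [0.85, 0.15], [0.92, 0.08]])
disagree = np.array([[0.90, 0.10], [0.20, 0.80], [0.50, 0.50]])

h_agree = predictive_entropy(ensemble_predict(agree))
h_disagree = predictive_entropy(ensemble_predict(disagree))
# Disagreement between members raises the entropy of the averaged prediction,
# so the ensemble reports higher uncertainty on the second input.
```

Because each member is trained independently (different random initialization and data shuffling), the M training runs parallelize trivially, which is what the abstract means by "readily parallelizable".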
Forward citations
Cited by 13 Pith papers
- Inducing Artificial Uncertainty in Language Models
  Inducing artificial uncertainty on trivial tasks allows training probes that achieve higher calibration on hard data than standard approaches while retaining performance on easy data.
- DualTCN: A Physics-Constrained Temporal Convolutional Network for Time-Domain Marine CSEM Inversion
  DualTCN is the first deep-learning model for time-domain marine CSEM inversion that regresses four earth parameters, achieves high accuracy on simulated data, and runs up to 21,000 times faster than classical optimizers.
- SegWithU: Uncertainty as Perturbation Energy for Single-Forward-Pass Risk-Aware Medical Image Segmentation
  SegWithU treats uncertainty as perturbation energy via rank-1 probes in a post-hoc head for efficient single-pass risk-aware medical image segmentation, outperforming other single-forward-pass methods on ACDC, BraTS20...
- Multi-Quantile Regression for Extreme Precipitation Downscaling
  Q-SRDRN multi-quantile network with pinball loss and per-quantile heads detects extreme precipitation events up to 18 times more effectively than deterministic baselines while preserving augmentation benefits for the median.
- COMPASS: A Unified Decision-Intelligence System for Navigating Performance Trade-off in HPC
  COMPASS formalizes HPC configuration questions as ML tasks on traces, quantifies recommendation trustworthiness, and delivers 65.93% lower average job turnaround time plus 80.93% lower node usage versus prior methods ...
- AI-assisted modeling and Bayesian inference of unpolarized quark transverse momentum distributions from Drell-Yan data
  An AI-assisted Bayesian framework extracts TMD PDFs from global Drell-Yan data using surrogate models for scalable MCMC sampling.
- Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification
  Ensemble-based method of moments on softmax outputs produces stable Dirichlet predictive distributions that improve uncertainty-guided tasks like selective classification over evidential deep learning.
- Cell-induced densification and tether formation in fibrous extracellular matrices with biomimetic physics-informed neural networks
  Bio-PINNs with a near-to-far curriculum and deformation-uncertainty proxy recover cell-induced densified phases and tether morphologies more reliably than standard adaptive PINN baselines in single-cell and multicellu...
- Ensemble-Based Uncertainty Estimation for Code Correctness Estimation
  Ensemble Semantic Entropy improves correlation with code correctness over single-model methods and powers a cascading scaling system that cuts FLOPs by 64.9% while preserving performance on LiveCodeBench.
- Language Models (Mostly) Know What They Know
  Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.
- Testing the Assumptions of Active Learning for Translation Tasks with Few Samples
  Informativeness and diversity of samples selected by active learning show no correlation with test performance on translation tasks using few samples; ordering and pre-training effects dominate instead.
- Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning
  Supervised fine-tuning degrades the correlation between confidence scores and output quality in language models, driven by factors like training distribution similarity rather than true quality.
- Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers
  A tutorial framing deep learning as a complement to optimization for sequential decision-making under uncertainty, with applications in supply chains, healthcare, and energy.