On Calibration and Out-of-domain Generalization

Amir Feder; Daniel Greenfeld; Uri Shalit; Yoav Wald

arxiv: 2102.10395 · v4 · pith:HNWJRFLHnew · submitted 2021-02-20 · 💻 cs.LG

Yoav Wald, Amir Feder, Daniel Greenfeld, Uri Shalit

classification 💻 cs.LG

keywords calibration, domains, generalization, models, performance, multi-domain, across, better

Out-of-domain (OOD) generalization is a significant challenge for machine learning models. Many techniques have been proposed to overcome this challenge, often focused on learning models with certain invariance properties. In this work, we draw a link between OOD performance and model calibration, arguing that calibration across multiple domains can be viewed as a special case of an invariant representation leading to better OOD generalization. Specifically, we show that under certain conditions, models which achieve multi-domain calibration are provably free of spurious correlations. This leads us to propose multi-domain calibration as a measurable and trainable surrogate for the OOD performance of a classifier. We therefore introduce methods that are easy to apply and allow practitioners to improve multi-domain calibration by training or modifying an existing model, leading to better performance on unseen domains. Using four datasets from the recently proposed WILDS OOD benchmark, as well as the Colored MNIST dataset, we demonstrate that training or tuning models so they are calibrated across multiple domains leads to significantly improved performance on unseen test domains. We believe this intriguing connection between calibration and OOD generalization is promising from both a practical and theoretical point of view.
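
To make the idea of a "measurable" multi-domain calibration score concrete, the following is a minimal sketch that computes a calibration error separately for each training domain. It assumes the standard binned expected calibration error (ECE) as the measure; the abstract does not say which estimator the paper uses, and the function names `expected_calibration_error` and `per_domain_ece` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| over bins.

    Assumption: equal-width bins on [0, 1]; this is one common estimator,
    not necessarily the one used in the paper.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each confidence to a bin index in 0..n_bins-1.
    bin_ids = np.digitize(confidences, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return float(ece)


def per_domain_ece(confidences, correct, domains, n_bins=10):
    """Return {domain: ECE} so calibration can be compared across domains."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    domains = np.asarray(domains)
    return {
        d.item(): expected_calibration_error(
            confidences[domains == d], correct[domains == d], n_bins
        )
        for d in np.unique(domains)
    }


# Synthetic illustration only: the same confidence in both domains, but
# different accuracy, so the model cannot be calibrated in both at once.
conf = [0.75] * 8
hit = [1, 1, 1, 0, 0, 0, 0, 0]
dom = ["a"] * 4 + ["b"] * 4
print(per_domain_ece(conf, hit, dom))
```

A model that is calibrated on the pooled data can still show a large gap in individual domains, which is why the sketch reports one value per domain instead of a single aggregate.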

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Experiment-free disruption prediction for new devices enabled by synthetic diagnostic data augmentation
physics.plasm-ph 2026-06 unverdicted novelty 5.0

Augmenting EAST tokamak experimental data with synthetic J-TEXT diagnostic signals from NIMROD MHD simulations and applying Fourier Domain Adaptation improves zero-shot disruption prediction early warning rate from 50...