
arxiv: 1510.08437 · v1 · pith:YKYOZ6WBnew · submitted 2015-10-28 · 📊 stat.AP · stat.ME

Second Order Calibration: A Simple Way to Get Approximate Posteriors

keywords: theta, estimates, distribution, calibration, estimate, method, point, posterior

Many large-scale machine learning problems involve estimating an unknown parameter $\theta_{i}$ for each of many items. For example, a key problem in sponsored search is to estimate the click-through rate (CTR) of each of billions of query-ad pairs. Most common methods, though, give only a point estimate of each $\theta_{i}$. A posterior distribution for each $\theta_{i}$ is usually more useful but harder to obtain. We present a simple post-processing technique that takes point estimates or scores $t_{i}$ (from any method) and estimates an approximate posterior for each $\theta_{i}$. We build on the idea of calibration, a common post-processing technique that estimates $\mathrm{E}(\theta_{i} \mid t_{i})$. Our method, second order calibration, uses empirical Bayes methods to estimate the distribution of $\theta_{i} \mid t_{i}$ and uses the estimated distribution as an approximation to the posterior distribution of $\theta_{i}$. We show that this can yield improved point estimates and useful accuracy estimates. The method scales to large problems: our motivating example is a CTR estimation problem involving tens of billions of query-ad pairs.
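To make the idea concrete, here is a minimal sketch of one way such a post-processing step could look. It is not the paper's estimator: the abstract does not specify the empirical Bayes model, so the sketch assumes click counts that are binomial given $\theta_{i}$, groups items into equal-count bins by score $t_{i}$, and fits a Beta prior per bin by the method of moments. The function names `fit_beta_by_moments` and `second_order_calibrate`, the bin count, and the simulated data are all hypothetical.

```python
import random
from statistics import mean


def fit_beta_by_moments(clicks, impressions):
    """Fit a Beta(a, b) to the latent rates of one score bin (method of moments)."""
    rates = [c / n for c, n in zip(clicks, impressions)]
    m = min(max(mean(rates), 1e-6), 1 - 1e-6)
    total_var = mean((r - m) ** 2 for r in rates)
    h = min(mean(1 / n for n in impressions), 0.999)
    # Var(c/n) = (m(1-m) - V) * E[1/n] + V; solve for the latent variance V.
    v = (total_var - m * (1 - m) * h) / (1 - h)
    # Clamp so the implied prior strength a + b stays positive.
    v = min(max(v, 1e-9), 0.99 * m * (1 - m))
    strength = m * (1 - m) / v - 1  # equals a + b
    return m * strength, (1 - m) * strength


def second_order_calibrate(scores, clicks, impressions, n_bins=10):
    """Return one (a, b) Beta posterior per item, using a prior fit per score bin."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    posteriors = [None] * len(scores)
    for k in range(n_bins):
        # Equal-count bins over the items sorted by score (an assumed binning rule).
        idx = order[k * len(order) // n_bins:(k + 1) * len(order) // n_bins]
        if not idx:
            continue
        a, b = fit_beta_by_moments([clicks[i] for i in idx],
                                   [impressions[i] for i in idx])
        for i in idx:
            # Conjugate update: Beta prior plus binomial counts.
            posteriors[i] = (a + clicks[i], b + impressions[i] - clicks[i])
    return posteriors


# Simulated data, purely for illustration.
random.seed(0)
theta = [random.betavariate(2, 50) for _ in range(2000)]
scores = [t + random.gauss(0, 0.01) for t in theta]  # noisy point estimates t_i
impressions = [random.randint(20, 200) for _ in theta]
clicks = [sum(random.random() < t for _ in range(n))
          for t, n in zip(theta, impressions)]
posteriors = second_order_calibrate(scores, clicks, impressions)
```

Under these assumptions, the mean `a / (a + b)` of each returned pair is a shrunken point estimate of $\theta_{i}$, and the spread of the Beta distribution gives an accuracy estimate. The prior mean within a bin plays the role of ordinary (first order) calibration, while the fitted prior variance is the second order part.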
