Unbiased Learning to Rank with Unbiased Propensity Estimation

Cheng Luo; Jiafeng Guo; Keping Bi; Qingyao Ai; W. Bruce Croft

arxiv: 1804.05938 · v2 · pith:5DPD62RJnew · submitted 2018-04-16 · 💻 cs.IR

Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft

classification 💻 cs.IR

keywords click, unbiased, learning, models, data, propensity, rank, ranking

Learning to rank with biased click data is a well-known challenge. A variety of methods have been explored to debias click data for learning to rank, such as click models, result interleaving, and, more recently, the unbiased learning-to-rank framework based on inverse propensity weighting. Despite their differences, most existing studies separate the estimation of click bias (namely, the propensity model) from the learning of ranking algorithms. To estimate click propensities, they either conduct online result randomization, which can negatively affect the user experience, or offline parameter estimation, which has special requirements for click data and is optimized for objectives (e.g., click likelihood) that are not directly related to the ranking performance of the system. In this work, we address those problems by unifying the learning of propensity models and ranking models. We find that the problem of estimating a propensity model from click data is a dual problem of unbiased learning to rank. Based on this observation, we propose a Dual Learning Algorithm (DLA) that jointly learns an unbiased ranker and an unbiased propensity model. DLA is an automatic unbiased learning-to-rank framework as it directly learns unbiased ranking models from biased click data without any preprocessing. It can adapt to the change of bias distributions and is applicable to online learning. Our empirical experiments with synthetic and real-world data show that the models trained with DLA significantly outperformed the unbiased learning-to-rank algorithms based on result randomization and the models trained with relevance signals extracted by click models.
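
The duality described in the abstract can be illustrated with a small sketch. The abstract does not give the loss functions, so everything below is an assumption made for illustration: the softmax-based listwise losses, the normalisation of weights against the first position, and the hypothetical names `softmax` and `dual_losses`. It is not the authors' implementation. The idea shown is that the ranker's loss on clicked results is weighted by inverse propensity, while the propensity model's loss is weighted by inverse relevance, so the two losses are mirror images of each other.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dual_losses(rank_scores, prop_scores, clicks):
    """Return (ranker_loss, propensity_loss) for one ranked list.

    rank_scores: ranker outputs, one per displayed result (assumed input).
    prop_scores: propensity-model outputs, one per position (assumed input).
    clicks:      0/1 click indicators, one per position.
    """
    rel = softmax(rank_scores)    # relevance distribution from the ranker
    prop = softmax(prop_scores)   # examination distribution from the propensity model

    # Ranker loss: clicked items weighted by inverse propensity,
    # normalised so that the first position has weight 1 (an assumption).
    ranker_loss = -sum(
        c * (prop[0] / p) * math.log(r) for c, p, r in zip(clicks, prop, rel)
    )
    # Propensity loss: the mirror image, weighted by inverse relevance.
    propensity_loss = -sum(
        c * (rel[0] / r) * math.log(p) for c, p, r in zip(clicks, prop, rel)
    )
    return ranker_loss, propensity_loss


if __name__ == "__main__":
    # With uniform scores and one click at the top, both losses equal log(3).
    print(dual_losses([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [1, 0, 0]))
```

In a joint training loop under these assumptions, each loss would be minimised with respect to its own model's parameters while the other model's outputs are treated as fixed weights, alternating between the two on every batch of click logs.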

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Conceptual Framework for Evaluating Fairness in Search
cs.IR 2019-07 unverdicted novelty 6.0

Introduces distributional fairness notion, axioms for ideal fairness evaluation in search, repurposes TREC collections, measures data bias, and proposes interpolation of fairness with relevance metrics.

Unbiased Learning to Rank: Counterfactual and Online Approaches
cs.IR 2019-07 unverdicted novelty 2.0

Tutorial overview contrasting counterfactual and online approaches to unbiased learning to rank.