pith. machine review for the scientific record.

arxiv: 1707.01875 · v1 · submitted 2017-07-06 · 💻 cs.LG


Calibrated Fairness in Bandits

classification 💻 cs.LG
keywords fairness, similar, probability, quality, algorithm, arms, bandit, calibrated
original abstract

We study fairness within the stochastic multi-armed bandit (MAB) decision-making framework. We adapt the fairness framework of "treating similar individuals similarly" to this setting. Here, an 'individual' corresponds to an arm, and two arms are 'similar' if they have a similar quality distribution. First, we adopt a smoothness constraint: if two arms have a similar quality distribution, then the probability of selecting each arm should be similar. In addition, we define the fairness regret, which corresponds to the degree to which an algorithm is not calibrated, where perfect calibration requires that the probability of selecting an arm equals the probability with which the arm has the best quality realization. We show that a variation on Thompson sampling satisfies smooth fairness for total variation distance, and give an $\tilde{O}((kT)^{2/3})$ bound on fairness regret. This complements prior work, which protects an on-average better arm from being less favored. We also explain how to extend our algorithm to the dueling bandit setting.
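The calibration property the abstract builds on is already exhibited by standard Thompson sampling: when each arm is chosen by drawing from its posterior and taking the argmax, the probability of selecting an arm equals the posterior probability that it is the best arm. The sketch below illustrates this for Bernoulli arms with Beta posteriors; it is an illustrative baseline, not the paper's modified algorithm, and the function names and simulation setup are my own.

```python
import random

def thompson_step(successes, failures, rng=random):
    """One round of Bernoulli Thompson sampling: draw one sample from each
    arm's Beta(successes+1, failures+1) posterior and select the arm with
    the largest draw. The selection probability of an arm then matches the
    posterior probability that it is the best arm (the calibration notion
    the paper's fairness regret measures deviation from)."""
    samples = [rng.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

def simulate(true_means, horizon, seed=0):
    """Run Thompson sampling on Bernoulli arms with the given true means
    and return per-arm (successes, failures) counts after `horizon` rounds."""
    rng = random.Random(seed)  # seeded for reproducibility
    k = len(true_means)
    succ, fail = [0] * k, [0] * k
    for _ in range(horizon):
        arm = thompson_step(succ, fail, rng)
        if rng.random() < true_means[arm]:
            succ[arm] += 1
        else:
            fail[arm] += 1
    return succ, fail
```

Over time the posterior concentrates, so the better arm is selected with probability approaching its probability of having the best realization; the paper's variant additionally enforces the smooth-fairness constraint between similar arms.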

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Price of Fairness in Short-Term and Long-Term Algorithmic Selections

    cs.AI · 2026-05 · unverdicted · novelty 6.0

    Short-term group fairness in repeated selections can incur a high price of fairness even with nearly identical group distributions, but long-term disparities can vanish under simple investment policies with low PoF.