pith. sign in

arxiv: 1511.08681 · v1 · pith:KOVSAX4Rnew · submitted 2015-11-27 · 📊 stat.ML · cs.CR· cs.LG

Algorithms for Differentially Private Multi-Armed Bandits

classification 📊 stat.ML cs.CRcs.LG
keywords algorithmsprivatedifferentiallyepsilonboundsmechanismmulti-armedprevious
0
0 comments X
read the original abstract

We present differentially private algorithms for the stochastic Multi-Armed Bandit (MAB) problem. This is a problem for applications such as adaptive clinical trials, experiment design, and user-targeted advertising where private information is connected to individual rewards. Our major contribution is to show that there exist $(\epsilon, \delta)$ differentially private variants of Upper Confidence Bound algorithms which have optimal regret, $O(\epsilon^{-1} + \log T)$. This is a significant improvement over previous results, which only achieve poly-log regret $O(\epsilon^{-2} \log^{2} T)$, because of our use of a novel interval-based mechanism. We also substantially improve the bounds of previous family of algorithms which use a continual release mechanism. Experiments clearly validate our theoretical bounds.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolving Data Marketplaces

    cs.DB 2026-05 unverdicted novelty 5.0

    CHRONOS is a three-layer system for evolving data marketplaces that applies neural-ODE temporal decay, changepoint-aware Shapley valuation, and EXP3-IX private coordination to achieve 0.937 recall, 2.74 qps, 161 ms la...