Data Poisoning Attacks in Contextual Bandits

Kwang-Sung Jun; Lihong Li; Xiaojin Zhu; Yuzhe Ma

arXiv: 1808.05760 · v2 · pith:2U6KPEBJnew · submitted 2018-08-17 · cs.LG · cs.CR · stat.ML

Yuzhe Ma, Kwang-Sung Jun, Lihong Li, Xiaojin Zhu

classification: cs.LG · cs.CR · stat.ML

keywords: contextual, data, target, attacker, attacks, algorithm, attack, bandit

Abstract

We study offline data poisoning attacks in contextual bandits, a class of reinforcement learning problems with important applications in online recommendation and adaptive medical treatment, among others. We provide a general attack framework based on convex optimization and show that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for a target contextual vector. The target arm and target contextual vector are both chosen by the attacker. That is, the attacker can hijack the behavior of a contextual bandit. We also investigate the feasibility and the side effects of such attacks, and identify future directions for defense. Experiments on both synthetic and real-world data demonstrate the efficiency of the attack algorithm.
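The reward-manipulation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation. It assumes a simplified learner that fits one ridge regression per arm and greedily pulls the arm with the highest predicted reward (no exploration bonus), and an attacker who may only alter the logged rewards of the target arm. Under those assumptions, the smallest L2 perturbation is the solution of a convex problem with a single linear constraint, which has a closed form. The names `ridge_estimate`, `poison_target_arm`, `margin`, and `lam` are hypothetical and chosen for illustration only.

```python
import numpy as np


def ridge_estimate(X, y, lam=1.0):
    """Ridge regression coefficients for one arm (assumed learner model)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def poison_target_arm(X_by_arm, y_by_arm, x_star, target, margin=0.1, lam=1.0):
    """Return the minimum-L2-norm change to the target arm's logged rewards
    so that its ridge prediction at x_star beats every other arm by `margin`.

    Assumption: only the target arm's rewards are modified, and the learner
    is greedy on ridge predictions.
    """
    # Clean predicted reward of every arm at the target context.
    preds = [x_star @ ridge_estimate(X, y, lam)
             for X, y in zip(X_by_arm, y_by_arm)]
    best_other = max(p for a, p in enumerate(preds) if a != target)

    # The target arm's prediction is linear in its rewards: pred = w @ y.
    X_t = X_by_arm[target]
    d = X_t.shape[1]
    w = X_t @ np.linalg.solve(X_t.T @ X_t + lam * np.eye(d), x_star)

    # Amount by which the prediction must increase; no attack needed if <= 0.
    gap = best_other + margin - preds[target]
    if gap <= 0:
        return np.zeros(X_t.shape[0])

    norm_sq = w @ w
    if norm_sq == 0:
        raise ValueError("target arm's prediction at x_star cannot be changed")

    # Minimizer of ||delta||^2 subject to w @ delta >= gap
    # (projection of the origin onto a halfspace).
    return (gap / norm_sq) * w
```

Adding the returned vector to the target arm's rewards and refitting raises that arm's prediction at `x_star` to exactly `margin` above the best competing arm. Predictions at other contexts also shift, which is a toy version of the side effects the abstract mentions.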
