Data poisoning attacks in contextual bandits

Yuzhe Ma, Kwang Sung Jun, Lihong Li, Xiaojin Zhu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

45 Scopus citations

Abstract

We study offline data poisoning attacks in contextual bandits, a class of reinforcement learning problems with important applications in online recommendation and adaptive medical treatment, among others. We provide a general attack framework based on convex optimization and show that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for a target contextual vector. The target arm and target contextual vector are both chosen by the attacker. That is, the attacker can hijack the behavior of a contextual bandit. We also investigate the feasibility and the side effects of such attacks, and identify future directions for defense. Experiments on both synthetic and real-world data demonstrate the efficiency of the attack algorithm.
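The idea of reward poisoning via convex optimization can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a simplified learner that fits one ridge regression per arm and acts greedily (no confidence bonus), and an attacker who alters only the target arm's logged rewards. Under those assumptions the smallest L2 perturbation is the projection onto a single half-space, which has a closed form. The names `ridge_weights`, `poison_target_rewards`, `margin`, and `lam` are hypothetical and chosen for illustration.

```python
import numpy as np


def ridge_weights(X, x_star, lam=1.0):
    """Return w such that the ridge prediction at x_star equals w @ y.

    With A = X^T X + lam * I (symmetric), the ridge estimate is
    theta = A^{-1} X^T y, so x_star @ theta = (X A^{-1} x_star) @ y.
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return X @ np.linalg.solve(A, x_star)


def poison_target_rewards(Xs, ys, target, x_star, margin=0.1, lam=1.0):
    """Smallest L2 change to the target arm's rewards (simplified setting).

    Xs, ys: per-arm context matrices and reward vectors (logged data).
    Returns delta such that, after replacing ys[target] with
    ys[target] + delta, the target arm's ridge prediction at x_star
    beats every other arm's prediction by at least `margin`.
    """
    ws = [ridge_weights(X, x_star, lam) for X in Xs]
    preds = np.array([w @ y for w, y in zip(ws, ys)])
    best_other = np.max(np.delete(preds, target))

    # Linear constraint on delta: w @ delta >= gap.
    gap = best_other + margin - preds[target]
    w = ws[target]
    if gap <= 0:
        # The target arm already wins by the margin; no change needed.
        return np.zeros_like(ys[target], dtype=float)

    norm_sq = w @ w
    if norm_sq == 0:
        # The target arm's data carries no information about x_star,
        # so no reward change can move its prediction.
        raise ValueError("attack infeasible in this simplified setting")

    # Minimum-norm point of the half-space {delta : w @ delta >= gap}.
    return gap * w / norm_sq
```

Calling `poison_target_rewards(Xs, ys, target, x_star)` and adding the result to `ys[target]` makes the greedy ridge learner prefer `target` at `x_star`. A fuller formulation would let the attacker change rewards for all arms and would account for the learner's exploration bonus, which turns the problem into a quadratic program with one linear constraint per non-target arm.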

Original language: English (US)
Title of host publication: Decision and Game Theory for Security - 9th International Conference, GameSec 2018, Proceedings
Editors: Linda Bushnell, Radha Poovendran, Tamer Basar
Publisher: Springer-Verlag
Pages: 186-204
Number of pages: 19
ISBN (Print): 9783030015534
State: Published - 2018
Externally published: Yes
Event: 9th International Conference on Decision and Game Theory for Security, GameSec 2018 - Seattle, United States
Duration: Oct 29 2018 to Oct 31 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11199 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Conference on Decision and Game Theory for Security, GameSec 2018
Country/Territory: United States
City: Seattle
Period: 10/29/18 to 10/31/18

Keywords

  • Adversarial attack
  • Contextual bandit
  • Data poisoning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
