Multi-Armed Bandits for Human-Machine Decision Making

Paul Reverdy, Vaibhav Srivastava

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

Building an integrated human-machine decision-making system requires developing effective interfaces between the human and the machine. We develop such an interface by studying the multi-armed bandit problem, a simple sequential decision-making paradigm that can model a variety of tasks. We construct Bayesian algorithms for the multi-armed bandit problem, prove conditions under which these algorithms achieve good performance, and empirically show that, with appropriate priors, these algorithms effectively model human choice behavior; the priors then form a principled interface from human to machine. We take a signal processing perspective on the prior estimation problem and develop methods to estimate the priors given human choice data.
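To make the role of the priors concrete, here is a minimal sketch of one Bayesian bandit rule. The abstract does not specify the algorithms, so everything here is an assumption chosen for illustration: Gaussian rewards with known noise variance, an independent Gaussian prior on each arm's mean, an upper-credible-limit choice rule, and the credible-level schedule 1/(n_arms * t + 1). The function name `run_bayesian_bandit` and all of its parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def run_bayesian_bandit(true_means, prior_means, prior_var, noise_var,
                        horizon, seed=0):
    """Illustrative Gaussian-prior bandit with an upper credible limit rule.

    Assumed setup (not taken from the paper): Gaussian rewards with known
    variance `noise_var` and an independent Gaussian prior on each arm's mean.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    # The posterior over each arm's mean stays Gaussian (conjugate update),
    # so it is tracked by a mean and a precision (1 / variance).
    post_mean = list(prior_means)
    post_prec = [1.0 / prior_var] * n_arms
    pulls = [0] * n_arms

    for t in range(1, horizon + 1):
        # The credible level tightens over time; this schedule is an assumption.
        z = NormalDist().inv_cdf(1.0 - 1.0 / (n_arms * t + 1.0))
        # Score = posterior mean + z * posterior standard deviation.
        scores = [m + z / math.sqrt(p) for m, p in zip(post_mean, post_prec)]
        arm = max(range(n_arms), key=scores.__getitem__)

        # Simulated environment: draw a noisy reward from the chosen arm.
        reward = rng.gauss(true_means[arm], math.sqrt(noise_var))

        # Conjugate Gaussian update for the chosen arm only.
        new_prec = post_prec[arm] + 1.0 / noise_var
        post_mean[arm] = (post_prec[arm] * post_mean[arm]
                          + reward / noise_var) / new_prec
        post_prec[arm] = new_prec
        pulls[arm] += 1

    return pulls, post_mean


if __name__ == "__main__":
    pulls, post_mean = run_bayesian_bandit(
        true_means=[0.0, 1.0, 0.2],
        prior_means=[0.0, 0.0, 0.0],
        prior_var=10.0,
        noise_var=1.0,
        horizon=500,
    )
    print("pulls per arm:", pulls)
    print("posterior means:", [round(m, 2) for m in post_mean])
```

In this sketch, `prior_means` and `prior_var` are the only inputs that encode beliefs held before any reward is seen. Changing them changes which arms are explored early, which is the sense in which priors could act as an interface from human to machine.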

Original language: English (US)
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6986-6990
Number of pages: 5
ISBN (Print): 9781538646588
State: Published - Sep 10 2018
Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration: Apr 15 2018 to Apr 20 2018

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2018-April
ISSN (Print): 1520-6149

Conference

Conference: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country/Territory: Canada
City: Calgary
Period: 4/15/18 to 4/20/18

Keywords

  • Active inference
  • Bayesian inference
  • Human decision making
  • Multi-armed bandit

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
