Satisficing in Gaussian bandit problems

Paul Reverdy, Naomi E. Leonard

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations

Abstract

We propose a satisficing objective for the multi-armed bandit problem, i.e., one in which the objective is to achieve performance above a given threshold. We show that this new problem is equivalent to a standard multi-armed bandit problem with a maximizing objective, and we use this equivalence to find bounds on performance in terms of the satisficing objective. For the special case of Gaussian rewards, we show that the satisficing problem is equivalent to a related standard multi-armed bandit problem, again with Gaussian rewards. We apply the Upper Credible Limit (UCL) algorithm to this standard problem and show how it achieves optimal performance in terms of the satisficing objective.
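As a rough illustration of the threshold idea in the abstract, the sketch below computes, for each Gaussian arm, the probability that a single reward exceeds a threshold, and then the gap between each arm and the best arm under that criterion. This is only one plausible reading of "performance above a given threshold", not the paper's formal definition; the function names `exceed_probability` and `satisficing_gaps` and the example numbers are hypothetical.

```python
import math


def exceed_probability(mean, std, threshold):
    """P(reward > threshold) for a reward drawn from N(mean, std**2)."""
    z = (mean - threshold) / std
    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def satisficing_gaps(means, stds, threshold):
    """Per-arm exceedance probabilities and their gaps to the best arm.

    Assumption: each arm is scored by its probability of exceeding the
    threshold, so the threshold problem becomes a maximizing one.
    """
    probs = [exceed_probability(m, s, threshold) for m, s in zip(means, stds)]
    best = max(probs)
    return probs, [best - p for p in probs]


# Hypothetical three-armed example with unit-variance rewards.
probs, gaps = satisficing_gaps([0.0, 1.0, 2.0], [1.0, 1.0, 1.0], threshold=1.0)
```

Under this reading, an arm whose mean equals the threshold clears it half the time, and the gaps play the role that mean-reward gaps play in a standard maximizing bandit problem.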

Original language: English (US)
Title of host publication: 53rd IEEE Conference on Decision and Control, CDC 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5718-5723
Number of pages: 6
Edition: February
ISBN (Electronic): 9781479977468
State: Published - 2014
Externally published: Yes
Event: 2014 53rd IEEE Annual Conference on Decision and Control, CDC 2014 - Los Angeles, United States
Duration: Dec 15 2014 to Dec 17 2014

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
Number: February
Volume: 2015-February
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370

Conference

Conference: 2014 53rd IEEE Annual Conference on Decision and Control, CDC 2014
Country/Territory: United States
City: Los Angeles
Period: 12/15/14 to 12/17/14

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Modeling and Simulation
  • Control and Optimization
