On optimal foraging and multi-armed bandits

Vaibhav Srivastava, Paul Reverdy, Naomi E. Leonard

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

25 Scopus citations

Abstract

We consider two variants of the standard multi-armed bandit problem, namely, the multi-armed bandit problem with transition costs and the multi-armed bandit problem on graphs. We develop block allocation algorithms for these problems that achieve an expected cumulative regret that is uniformly dominated by a logarithmic function of time, and an expected cumulative number of transitions from one arm to another that is uniformly dominated by a double-logarithmic function of time. We observe that the multi-armed bandit problem with transition costs and the associated block allocation algorithm capture the key features of popular animal foraging models in the literature.
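The abstract does not spell out the block allocation algorithms, so the following is only a rough illustration of the general idea of playing an arm in blocks to limit switching. It is not the authors' method. The function name `block_ucb`, the Bernoulli reward model, the UCB1-style index, and the per-arm doubling of block lengths are all assumptions made for this sketch. In particular, a doubling schedule keeps the number of transitions logarithmic in the horizon, which is weaker than the double-logarithmic bound reported above.

```python
import math
import random


def block_ucb(means, horizon, seed=0):
    """Hypothetical block-allocation sketch on a Bernoulli bandit.

    Returns (total_reward, transitions, pulls). The index and block
    schedule are illustrative assumptions, not the paper's algorithm.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    pulls = [0] * n_arms    # number of plays of each arm
    sums = [0.0] * n_arms   # cumulative reward of each arm
    blocks = [0] * n_arms   # number of blocks already allocated to each arm
    t = 0                   # total plays so far
    transitions = 0         # number of switches between different arms
    total = 0.0
    current = None

    while t < horizon:
        # Play each arm once first; afterwards pick the largest UCB1-style index.
        untried = [a for a in range(n_arms) if pulls[a] == 0]
        if untried:
            arm = untried[0]
        else:
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )

        if current is not None and arm != current:
            transitions += 1
        current = arm

        # Commit to the chosen arm for a whole block; the block length
        # doubles each time the same arm is selected again (assumed schedule).
        length = min(2 ** blocks[arm], horizon - t)
        for _ in range(length):
            reward = 1.0 if rng.random() < means[arm] else 0.0
            sums[arm] += reward
            pulls[arm] += 1
            total += reward
            t += 1
        blocks[arm] += 1

    return total, transitions, pulls


if __name__ == "__main__":
    total, transitions, pulls = block_ucb([0.2, 0.5, 0.8], horizon=10000, seed=1)
    print("reward:", total, "transitions:", transitions, "pulls:", pulls)
```

Because an arm can only be switched to at a block boundary, the number of transitions is bounded by the number of blocks, which is the quantity a transition cost would penalize.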

Original language: English (US)
Title of host publication: 2013 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013
Publisher: IEEE Computer Society
Pages: 494-499
Number of pages: 6
ISBN (Print): 9781479934096
State: Published - 2013
Externally published: Yes
Event: 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013 - Monticello, IL, United States
Duration: Oct 2 2013 to Oct 4 2013

Publication series

Name: 2013 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013

Conference

Conference: 51st Annual Allerton Conference on Communication, Control, and Computing, Allerton 2013
Country/Territory: United States
City: Monticello, IL
Period: 10/2/13 to 10/4/13

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Control and Systems Engineering
