TY - GEN
T1 - Satisficing in Gaussian bandit problems
AU - Reverdy, Paul
AU - Leonard, Naomi E.
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014
Y1 - 2014
AB - We propose a satisficing objective for the multi-armed bandit problem, i.e., where the objective is to achieve performance above a given threshold. We show that this new problem is equivalent to a standard multi-armed bandit problem with a maximizing objective and use this equivalence to find bounds on performance in terms of the satisficing objective. For the special case of Gaussian rewards we show that the satisficing problem is equivalent to a related standard multi-armed bandit problem again with Gaussian rewards. We apply the Upper Credible Limit (UCL) algorithm to this standard problem and show how it achieves optimal performance in terms of the satisficing objective.
UR - http://www.scopus.com/inward/record.url?scp=84988269471&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84988269471&partnerID=8YFLogxK
U2 - 10.1109/CDC.2014.7040284
DO - 10.1109/CDC.2014.7040284
M3 - Conference contribution
AN - SCOPUS:84988269471
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 5718
EP - 5723
BT - 53rd IEEE Conference on Decision and Control, CDC 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 53rd IEEE Annual Conference on Decision and Control, CDC 2014
Y2 - 15 December 2014 through 17 December 2014
ER -