TY - GEN
T1 - Surveillance in an abruptly changing world via multiarmed bandits
AU - Srivastava, Vaibhav
AU - Reverdy, Paul
AU - Leonard, Naomi E.
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014
Y1 - 2014
N2 - We study a path planning problem in an environment that is abruptly changing due to the arrival of unknown spatial events. The objective of the path planning problem is to collect the data that is most evidential about the events. We formulate this problem as a multiarmed bandit (MAB) problem with Gaussian rewards and change points, and address the fundamental tradeoff between learning the true event (exploration), and collecting the data that is most evidential about the true event (exploitation). We extend the switching-window UCB algorithm for MAB problems with bounded rewards and change points to the context of correlated Gaussian rewards and develop the switching-window UCL (SW-UCL) algorithm. We extend the SW-UCL algorithm to an adaptive SW-UCL algorithm that utilizes statistical change detection to adapt the SW-UCL algorithm. We also develop a block SW-UCL algorithm that reduces the number of transitions among arms in the SW-UCL algorithm, and is more amenable to robotic applications.
AB - We study a path planning problem in an environment that is abruptly changing due to the arrival of unknown spatial events. The objective of the path planning problem is to collect the data that is most evidential about the events. We formulate this problem as a multiarmed bandit (MAB) problem with Gaussian rewards and change points, and address the fundamental tradeoff between learning the true event (exploration), and collecting the data that is most evidential about the true event (exploitation). We extend the switching-window UCB algorithm for MAB problems with bounded rewards and change points to the context of correlated Gaussian rewards and develop the switching-window UCL (SW-UCL) algorithm. We extend the SW-UCL algorithm to an adaptive SW-UCL algorithm that utilizes statistical change detection to adapt the SW-UCL algorithm. We also develop a block SW-UCL algorithm that reduces the number of transitions among arms in the SW-UCL algorithm, and is more amenable to robotic applications.
UR - http://www.scopus.com/inward/record.url?scp=84988286023&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84988286023&partnerID=8YFLogxK
U2 - 10.1109/CDC.2014.7039462
DO - 10.1109/CDC.2014.7039462
M3 - Conference contribution
AN - SCOPUS:84988286023
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 692
EP - 697
BT - 53rd IEEE Conference on Decision and Control,CDC 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 53rd IEEE Annual Conference on Decision and Control, CDC 2014
Y2 - 15 December 2014 through 17 December 2014
ER -