Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization

Kwang Sung Jun, Jungtaek Kim

Research output: Contribution to journalConference articlepeer-review

Abstract

Adapting to a priori unknown noise level is a very important but challenging problem in sequential decision-making as efficient exploration typically requires knowledge of the noise level, which is often loosely specified. We report significant progress in addressing this issue for linear bandits in two respects. First, we propose a novel confidence set that is 'semi-adaptive' to the unknown sub-Gaussian parameter σ2 in the sense that the (normalized) confidence width scales with √dσ2 + σ02 where d is the dimension and σ02 is the specified sub-Gaussian parameter (known) that can be much larger than σ2. This is a significant improvement over √dσ02 of the standard confidence set of Abbasi-Yadkori et al. (2011), especially when d is large. We show that this leads to an improved regret bound in linear bandits. Second, for bounded rewards, we propose a novel variance-adaptive confidence set that has much improved numerical performance upon prior art. We then apply this confidence set to develop, as we claim, the first practical variance-adaptive linear bandit algorithm via an optimistic approach, which is enabled by our novel regret analysis technique. Both of our confidence sets rely critically on 'regret equality' from online learning. Our empirical evaluation in diverse Bayesian optimization tasks shows that our proposed algorithms demonstrate better or comparable performance compared to existing methods.

Original languageEnglish (US)
Pages (from-to)22643-22671
Number of pages29
JournalProceedings of Machine Learning Research
Volume235
StatePublished - 2024
Event41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria
Duration: Jul 21 2024Jul 27 2024

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Fingerprint

Dive into the research topics of 'Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization'. Together they form a unique fingerprint.

Cite this