Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits

Kwang Sung Jun, Lalit Jain, Blake Mason, Houssam Nassif

Research output: Conference contribution (Chapter in Book/Report/Conference proceeding)

11 Scopus citations

Abstract

We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound by Li et al. (2017) via recent developments of the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on 1/κ, where κ is the minimal variance over all arms' reward distributions. In general, 1/κ scales exponentially with the norm of the unknown linear parameter θ. Instead of relying on this worst case quantity, our confidence bound for the reward of any given arm depends directly on the variance of that arm's reward distribution. We present two applications of our novel bounds to pure exploration and regret minimization logistic bandits, improving upon state-of-the-art performance guarantees. For pure exploration we also provide a lower bound highlighting a dependence on 1/κ for a family of instances.
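To make the quantity κ concrete: in the linear logistic model, an arm x has Bernoulli reward with mean σ(x·θ), so its reward variance is σ(x·θ)(1 − σ(x·θ)), and κ is the minimum of this variance over all arms. The sketch below (a hypothetical one-dimensional arm set, not from the paper) illustrates how 1/κ blows up roughly exponentially as the norm of θ grows, which is the worst-case dependence the proposed bounds avoid.

```python
import math

def sigmoid(z):
    """Logistic link mu(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def kappa(arms, theta):
    """kappa = minimal reward variance mu'(x . theta) over all arms,
    where mu'(z) = sigmoid(z) * (1 - sigmoid(z)) for the logistic model."""
    def dot(x, t):
        return sum(xi * ti for xi, ti in zip(x, t))
    return min(
        sigmoid(dot(x, theta)) * (1.0 - sigmoid(dot(x, theta)))
        for x in arms
    )

# Hypothetical arm set; scaling ||theta|| shows 1/kappa growing ~ exp(||theta||).
arms = [(1.0,), (-1.0,)]
for s in (1.0, 3.0, 5.0):
    print(s, 1.0 / kappa(arms, (s,)))
```

Because σ(z)(1 − σ(z)) ≈ e^(−|z|) for large |z|, the printed 1/κ values grow roughly as e^s, matching the abstract's remark that 1/κ scales exponentially with the norm of the unknown parameter θ.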

Original language: English (US)
Title of host publication: Proceedings of the 38th International Conference on Machine Learning, ICML 2021
Publisher: ML Research Press
Pages: 5148-5157
Number of pages: 10
ISBN (Electronic): 9781713845065
State: Published - 2021
Event: 38th International Conference on Machine Learning, ICML 2021 - Virtual, Online
Duration: Jul 18, 2021 - Jul 24, 2021

Publication series

Name: Proceedings of Machine Learning Research
Volume: 139
ISSN (Electronic): 2640-3498


ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

