TY - GEN
T1 - Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits
AU - Jun, Kwang Sung
AU - Jain, Lalit
AU - Mason, Blake
AU - Nassif, Houssam
N1 - Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021
Y1 - 2021
N2 - We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound by Li et al. (2017) via recent developments of the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on 1/κ, where κ is the minimal variance over all arms' reward distributions. In general, 1/κ scales exponentially with the norm of the unknown linear parameter θ∗. Instead of relying on this worst case quantity, our confidence bound for the reward of any given arm depends directly on the variance of that arm's reward distribution. We present two applications of our novel bounds to pure exploration and regret minimization logistic bandits, improving upon state-of-the-art performance guarantees. For pure exploration we also provide a lower bound highlighting a dependence on 1/κ for a family of instances.
AB - We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bound by Li et al. (2017) via recent developments of the self-concordant analysis of the logistic loss (Faury et al., 2020). Specifically, our confidence bound avoids a direct dependence on 1/κ, where κ is the minimal variance over all arms' reward distributions. In general, 1/κ scales exponentially with the norm of the unknown linear parameter θ∗. Instead of relying on this worst case quantity, our confidence bound for the reward of any given arm depends directly on the variance of that arm's reward distribution. We present two applications of our novel bounds to pure exploration and regret minimization logistic bandits, improving upon state-of-the-art performance guarantees. For pure exploration we also provide a lower bound highlighting a dependence on 1/κ for a family of instances.
UR - http://www.scopus.com/inward/record.url?scp=85161353177&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161353177&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85161353177
T3 - Proceedings of Machine Learning Research
SP - 5148
EP - 5157
BT - Proceedings of the 38th International Conference on Machine Learning, ICML 2021
PB - ML Research Press
T2 - 38th International Conference on Machine Learning, ICML 2021
Y2 - 18 July 2021 through 24 July 2021
ER -