The Randomized Elliptical Potential Lemma with an Application to Linear Thompson Sampling

Nima Hamidi, Mohsen Bayati

2021

In this note, we introduce a randomized version of the well-known elliptical potential lemma, which is widely used in the analysis of algorithms for sequential learning and decision-making problems such as stochastic linear bandits. Our randomized elliptical potential lemma relaxes the Gaussian assumption on the observation noise and on the prior distribution of the problem parameters. We then use this generalization to prove an improved Bayesian regret bound for Thompson sampling in linear stochastic bandits with changing action sets, where the prior and noise distributions are general. This bound is minimax optimal up to constants.
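
For orientation, the classical (deterministic) elliptical potential lemma that the note generalizes is commonly stated as follows: for actions x_1, ..., x_T in R^d with norm at most L, and V_t = lam * I + sum of x_s x_s^T for s <= t, the sum over t of min(1, x_t^T V_{t-1}^{-1} x_t) is at most 2 log(det V_T / det V_0), which in turn is at most 2 d log(1 + T L^2 / (d lam)). The sketch below checks this deterministic statement numerically on random unit vectors. It is not the randomized lemma of the note, and the function names, the regularizer `lam`, and the choice of random actions are assumptions made purely for illustration.

```python
import math
import random


def elliptical_potential(xs, lam=1.0):
    """Return (potential, log_det_ratio) for the action sequence xs.

    potential     = sum_t min(1, x_t^T V_{t-1}^{-1} x_t)
    log_det_ratio = log det(V_T) - log det(V_0), with V_0 = lam * I.
    Hypothetical helper written for this illustration.
    """
    d = len(xs[0])
    # Inverse of V_0 = lam * I.
    v_inv = [[(1.0 / lam if i == j else 0.0) for j in range(d)] for i in range(d)]
    potential = 0.0
    log_det_ratio = 0.0
    for x in xs:
        # u = V_{t-1}^{-1} x and q = x^T V_{t-1}^{-1} x.
        u = [sum(v_inv[i][j] * x[j] for j in range(d)) for i in range(d)]
        q = sum(x[i] * u[i] for i in range(d))
        potential += min(1.0, q)
        # Matrix determinant lemma: det(V + x x^T) = det(V) * (1 + q).
        log_det_ratio += math.log1p(q)
        # Sherman-Morrison rank-one update of the (symmetric) inverse.
        for i in range(d):
            for j in range(d):
                v_inv[i][j] -= u[i] * u[j] / (1.0 + q)
    return potential, log_det_ratio


def random_unit_vector(d, rng):
    """Draw a direction uniformly on the unit sphere (so L = 1)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


rng = random.Random(0)
d, T, lam = 5, 500, 1.0
xs = [random_unit_vector(d, rng) for _ in range(T)]
potential, log_det_ratio = elliptical_potential(xs, lam)
# Worst-case bound with ||x_t|| <= L = 1.
worst_case = 2.0 * d * math.log(1.0 + T / (d * lam))
print(potential, 2.0 * log_det_ratio, worst_case)
```

The three printed numbers should appear in non-decreasing order. The logarithmic growth in T of the last quantity is what the standard regret analyses of linear bandit algorithms rely on.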

Keywords:

Thompson sampling
Mathematics
Analysis of algorithms
Minimax
Bayesian probability
Gaussian
Prior probability
Regret
Applied mathematics
Lemma (mathematics)
