Batched Bandit Problems

Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg

2015

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

Keywords: Minimax, Mathematical optimization, Regret, Small number, Computer science, Sample size determination
