Double-Linear Thompson Sampling for Context-Attentive Bandits

Djallel Bouneffouf, Raphaël Féraud, Sohini Upadhyay, Yasaman Khazaeni, Irina Rish

2021

In this paper, we analyze and extend an online learning framework known as Context-Attentive Bandit, motivated by various practical applications, from medical diagnosis to dialog systems, where, due to observation costs, only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent is free to choose which variables to observe. We derive a novel algorithm, called Context-Attentive Thompson Sampling (CATS), which builds upon the Linear Thompson Sampling approach, adapting it to the Context-Attentive Bandit setting. We provide a theoretical regret analysis and an extensive empirical evaluation demonstrating advantages of the proposed approach over several baseline methods on a variety of real-life datasets.
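For orientation, the following is a minimal sketch of standard per-arm Linear Thompson Sampling, the building block the abstract says CATS extends. It is not the CATS algorithm itself: the step in which the agent chooses which context variables to observe is not shown, since the abstract does not specify it. The class name, the exploration scale `v`, the identity prior, and the method names are all assumptions made for illustration.

```python
import numpy as np


class LinearThompsonSampling:
    """Illustrative per-arm linear Thompson Sampling with a Gaussian posterior.

    Hypothetical sketch: names and hyperparameters are assumptions,
    not taken from the paper.
    """

    def __init__(self, n_arms, dim, v=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.v = v  # exploration scale (assumed hyperparameter)
        # One precision matrix per arm, initialised to the identity (ridge prior).
        self.B = np.stack([np.eye(dim) for _ in range(n_arms)])
        # Running sum of reward-weighted contexts per arm.
        self.f = np.zeros((n_arms, dim))

    def posterior_mean(self, arm):
        # Ridge-regression estimate: B^{-1} f.
        return np.linalg.solve(self.B[arm], self.f[arm])

    def select(self, context):
        scores = []
        for arm in range(len(self.B)):
            cov = np.linalg.inv(self.B[arm])
            cov = (cov + cov.T) / 2.0  # symmetrise against round-off
            mu = cov @ self.f[arm]
            # Sample a plausible parameter vector from the posterior.
            theta = self.rng.multivariate_normal(mu, self.v ** 2 * cov)
            scores.append(float(context @ theta))
        # Play the arm whose sampled parameters predict the highest reward.
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        self.B[arm] += np.outer(context, context)
        self.f[arm] += reward * context
```

In a context-attentive setting, one plausible (assumed) way to feed such a learner is to pass a context vector in which unobserved variables are set to zero, but how CATS actually selects and uses the observed subset should be taken from the paper itself.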

Keywords:

Dialog box
Thompson sampling
Medical diagnosis
Baseline (configuration management)
Variety (cybernetics)
Machine learning
Context
Regret
Artificial intelligence
Online learning
Computer science
