Low-rank Tensor Bandits

Botao Hao,Jie Zhou,Zheng Wen,Will Wei Sun

Low-rank Tensor Bandits

2020

Botao Hao
Jie Zhou
Zheng Wen
Will Wei Sun

In recent years, multi-dimensional online decision making has been playing a crucial role in many practical applications such as online recommendation and digital marketing. To solve it, we introduce stochastic low-rank tensor bandits, a class of bandits whose mean rewards can be represented as a low-rank tensor. We propose two learning algorithms, tensor epoch-greedy and tensor elimination, and develop finite-time regret bounds for them. We observe that tensor elimination has an optimal dependency on the time horizon, while tensor epoch-greedy has a sharper dependency on tensor dimensions. Numerical experiments further back up these theoretical findings and show that our algorithms outperform various state-of-the-art approaches that ignore the tensor low-rank structure.

Keywords:

Mathematics
Pure mathematics
Tensor

Correction
Source
Cite
Save
Machine Reading By IdeaReader

References

Citations