Online Learning Based Reconfigurable Antenna Mode Selection Exploiting Channel Correlation

2021 
Reconfigurable antennas (RAs) have emerged as a promising technology that can deal with channel variations and enhance the capacity and reliability of the wireless channel. To fully exploit the advantages of RAs, the optimal antenna mode needs to be selected in an online manner, even though the channel statistics are unknown a priori. Multi-armed bandit-based online learning algorithms have been proposed to address this challenge, but the main drawback of existing approaches is that their regret scales linearly with the number of antenna modes, so they converge slowly when the number of modes is large. To improve scalability, we first apply an existing algorithm, Thompson sampling via Gaussian process (TS-GP), and then propose two new algorithms for antenna mode selection: upper confidence bound with channel prediction (UCB-CP) and Thompson sampling with channel prediction (TS-CP). TS-GP uses a Gaussian prior to model the reward distribution of each antenna mode, as well as the correlation among the modes. UCB-CP and TS-CP exploit channel modeling to predict the channel conditions of unexplored antenna modes at each time step, by relating the correlation between different channel states to the underlying antenna modes. We prove a finite-time regret bound for UCB-CP and show that it is independent of the number of arms when the expected channel estimation errors are small enough. We also extend the algorithms to the mobile setting. Both simulation results and real-world experiments show that all of our proposed learning algorithms significantly improve the convergence rate and yield much lower regret (and thus higher throughput) than existing schemes.
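For orientation, the kind of baseline the abstract contrasts against can be sketched as plain UCB1, where each antenna mode is an independent arm and every mode must be tried before its estimate is useful. This is a minimal sketch under stated assumptions, not the paper's UCB-CP or TS-CP: the Bernoulli reward (for example, packet success), the function names `ucb1_select` and `run_ucb1`, and the simulated `true_means` are all hypothetical and chosen only for illustration.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Return the index of the antenna mode to try at round t (1-based)."""
    # Try every mode once before relying on confidence bounds.
    for mode, n in enumerate(counts):
        if n == 0:
            return mode
    # Empirical mean plus exploration bonus (the standard UCB1 index).
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_ucb1(true_means, horizon, seed=0):
    """Simulate UCB1 on Bernoulli rewards; return (pull counts, cumulative regret)."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # number of times each mode was selected
    means = [0.0] * k     # running empirical mean reward per mode
    best = max(true_means)
    regret = 0.0
    for t in range(1, horizon + 1):
        mode = ucb1_select(counts, means, t)
        # Assumed reward model: 1 with probability true_means[mode], else 0.
        reward = 1.0 if rng.random() < true_means[mode] else 0.0
        counts[mode] += 1
        means[mode] += (reward - means[mode]) / counts[mode]
        # Expected (pseudo-)regret of the chosen mode against the best mode.
        regret += best - true_means[mode]
    return counts, regret
```

Because this baseline treats the modes as unrelated, it spends at least one round on every mode and learns nothing about a mode from its neighbours, which is the scaling problem that correlation-aware methods such as those named in the abstract aim to avoid.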