Throughput Enhancement via Multi-Armed Bandit in Heterogeneous 5G Networks

2018 
In heterogeneous networks, a user equipment (UE) can communicate directly with the macro base station (BS) or with a small, low-power pico or femto BS. Alternatively, it can communicate indirectly with the macro BS through one or more intermediate devices (UEs) or through a relay station that uses an over-the-air backhaul to the macro BS. Due to the highly dynamic and uncertain nature of wireless communication, it is essential for a UE to choose an optimal communication mode and the neighbor to which it connects, e.g., a macro/small BS in the direct communication mode or a nearby relay/device in the indirect communication mode. In this paper, we apply an effective reinforcement learning method, the multi-armed bandit (MAB), to this problem. Specifically, we apply MAB with Thompson sampling to pick an optimal arm—a neighbor that determines the communication mode and the resulting performance—while effectively handling the exploration-exploitation dilemma inherent in MAB. In a simulation study conducted in MATLAB, we compare the performance of the proposed approach against several baselines representing the current state of the art. Our approach enhances the throughput, normalized to the optimal throughput, by approximately 8-97% compared to these baselines. Further, it improves the throughput by up to 15% compared to the best-performing baseline [1], [2].
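The abstract does not give implementation details, but the core idea of Thompson sampling for neighbor selection can be sketched as follows. This is a minimal illustration, not the authors' method: each candidate neighbor (macro BS, pico BS, relay UE, etc.) is treated as an arm with an unknown Bernoulli link-success probability, modeled by a Beta posterior; the UE picks the arm whose posterior sample is largest. The neighbor names and success probabilities below are hypothetical.

```python
import random

def thompson_select(successes, failures):
    """Sample a Beta(s+1, f+1) draw per arm and pick the largest.

    Beta(1, 1) is a uniform prior; each observed success/failure
    updates the posterior counts for that arm.
    """
    samples = [random.betavariate(s + 1, f + 1)
               for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=lambda i: samples[i])

def run_bandit(success_probs, rounds=5000, seed=0):
    """Simulate neighbor selection with Bernoulli rewards
    (e.g., 1 if a packet is delivered over the chosen link)."""
    random.seed(seed)
    n = len(success_probs)
    successes = [0] * n
    failures = [0] * n
    for _ in range(rounds):
        arm = thompson_select(successes, failures)
        if random.random() < success_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures

if __name__ == "__main__":
    # Hypothetical link-success probabilities for three candidate
    # neighbors: macro BS, pico BS, relay UE.
    succ, fail = run_bandit([0.5, 0.7, 0.6])
    pulls = [s + f for s, f in zip(succ, fail)]
    print(pulls)  # the best arm (index 1) accumulates most pulls
```

Over time the sampling concentrates on the best arm while still occasionally exploring the others, which is how Thompson sampling resolves the exploration-exploitation dilemma mentioned in the abstract.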