Online Maneuver Design for UAV-Enabled NOMA Systems via Reinforcement Learning

2020 
This paper considers an unmanned aerial vehicle (UAV)-enabled uplink non-orthogonal multiple-access (NOMA) system, where multiple users on the ground send independent messages to a UAV via NOMA transmission. We aim to design the UAV’s dynamic maneuver in real time to maximize the sum-rate throughput of all ground users over a finite time horizon. Unlike conventional offline designs that assume static user locations under deterministic or stochastic channel models, we consider a more challenging scenario with mobile users and segmented channel models, where the UAV only causally knows the users’ (moving) locations and channel state information (CSI). Under this setup, we first propose a new approach for UAV dynamic maneuver design based on reinforcement learning (RL) via Q-learning. Next, to further speed up convergence and increase the throughput, we present an enhanced RL-based approach that additionally exploits expert knowledge of well-established wireless channel models to initialize the Q-table values. Numerical results show that our proposed RL-based and enhanced RL-based approaches significantly improve the sum-rate throughput, and that the enhanced RL-based approach considerably speeds up the learning process owing to the proposed Q-table initialization.
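The two approaches described in the abstract can be illustrated with a minimal tabular Q-learning sketch. This is not the paper's implementation; it is a toy under loudly stated assumptions: the grid size, cell size, altitude, reference SNR, the free-space 1/d^2 channel gain, static users, and all function names (`sum_rate`, `step`, `init_q`, `q_learning`, `greedy_throughput`) are hypothetical. The paper itself considers mobile users and segmented channel models, whereas here the same simple model serves as both the environment and the "expert knowledge". The uplink NOMA sum rate is taken as log2(1 + sum of per-user SNRs), which assumes successive interference cancellation at the UAV.

```python
import math
import random

GRID = 5          # hypothetical 5x5 grid of UAV horizontal positions
CELL_M = 100.0    # hypothetical cell size in metres
ALT_M = 100.0     # hypothetical fixed UAV altitude in metres
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # hover, +x, -x, +y, -y


def sum_rate(uav_cell, users_m, snr_ref=1e6):
    """Uplink NOMA sum rate in bit/s/Hz, assuming SIC: log2(1 + sum_k SNR_k).

    Assumes a free-space gain 1/d^2, with snr_ref the SNR at 1 m (hypothetical).
    """
    x, y = uav_cell[0] * CELL_M, uav_cell[1] * CELL_M
    total_snr = 0.0
    for ux, uy in users_m:
        d2 = (x - ux) ** 2 + (y - uy) ** 2 + ALT_M ** 2
        total_snr += snr_ref / d2
    return math.log2(1.0 + total_snr)


def step(cell, action):
    """Move one cell in the chosen direction, clamped to the grid."""
    dx, dy = ACTIONS[action]
    nx = min(max(cell[0] + dx, 0), GRID - 1)
    ny = min(max(cell[1] + dy, 0), GRID - 1)
    return (nx, ny)


def init_q(users_m, expert):
    """Build the Q-table: zeros, or the model-predicted rate of the next cell."""
    q = {}
    for i in range(GRID):
        for j in range(GRID):
            for a in range(len(ACTIONS)):
                nxt = step((i, j), a)
                q[((i, j), a)] = sum_rate(nxt, users_m) if expert else 0.0
    return q


def q_learning(users_m, expert, episodes=200, horizon=20,
               alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning over a finite horizon per episode."""
    rng = random.Random(seed)
    q = init_q(users_m, expert)
    for _ in range(episodes):
        cell = (0, 0)  # hypothetical start position
        for _ in range(horizon):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda b: q[(cell, b)])
            nxt = step(cell, a)
            reward = sum_rate(nxt, users_m)  # toy environment, same model
            best_next = max(q[(nxt, b)] for b in range(len(ACTIONS)))
            q[(cell, a)] += alpha * (reward + gamma * best_next - q[(cell, a)])
            cell = nxt
    return q


def greedy_throughput(q, users_m, horizon=20):
    """Total sum rate collected by following the greedy policy from (0, 0)."""
    cell, total = (0, 0), 0.0
    for _ in range(horizon):
        a = max(range(len(ACTIONS)), key=lambda b: q[(cell, b)])
        cell = step(cell, a)
        total += sum_rate(cell, users_m)
    return total
```

For example, calling `q_learning([(200.0, 200.0)], expert=True)` and `q_learning([(200.0, 200.0)], expert=False)` and comparing `greedy_throughput` after a small number of episodes gives a rough feel for why a model-based initialization can shorten learning: the initialized table already points the UAV toward the users before any update has been made.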