A power allocation algorithm based on cooperative Q-learning for multi-agent D2D communication networks

2021 
Abstract In multi-agent device-to-device (D2D) communication networks, the scene experienced by each agent changes as the agents move. To address the communication interference and excess energy consumption caused by a lack of adaptability to changing scenes, a power allocation algorithm based on scene-adaptive cooperative Q-learning (SACL) is proposed in this paper. Specifically, a scene variable is added to the state space, and the reward function is improved so that a larger system capacity is achieved with less power. Then, to improve the convergence speed of the SACL algorithm, a balance factor based on the location distribution of the agents is introduced, yielding a fast scene-adaptive reinforcement learning (FSACL) algorithm. Simulation experiments verify the adaptability of the SACL and FSACL algorithms when the scene changes. Compared with the traditional cooperative Q-learning (CL) and independent Q-learning (IL) algorithms, SACL and FSACL obtain a larger system capacity with less power, and FSACL converges faster than CL, IL, and SACL.
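The abstract's idea, scene-indexed Q-tables, a reward that trades capacity against transmit power, and periodic cooperative Q-table sharing, can be illustrated with a minimal toy sketch. Everything below is an illustrative assumption rather than the paper's formulation: the discrete power set, the log-rate reward shape, the scene model, the learning parameters, and the uniform sharing weight standing in for the location-based balance factor.

```python
import math
import random

# Toy sketch (assumed, not the paper's exact model): each D2D agent keeps a
# Q-table indexed by (scene, power level). The scene variable is part of the
# state, and the reward trades capacity against transmit-power cost.

POWER_LEVELS = [0.1, 0.5, 1.0]   # candidate transmit powers in W (assumed)
SCENES = [0, 1]                  # two interference environments (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2
LAMBDA = 2.0                     # power-cost weight in the reward (assumed)

def reward(scene, power):
    # Stand-in for "capacity minus power cost": scene 1 suffers more
    # interference, so the same power yields less capacity.
    gain = 1.0 if scene == 0 else 0.6
    capacity = gain * math.log(1.0 + power / 0.1)  # crude log-rate term
    return capacity - LAMBDA * power

class Agent:
    def __init__(self):
        # Q[scene][action index]
        self.q = {s: [0.0] * len(POWER_LEVELS) for s in SCENES}

    def act(self, scene):
        # epsilon-greedy power selection
        if random.random() < EPS:
            return random.randrange(len(POWER_LEVELS))
        row = self.q[scene]
        return row.index(max(row))

    def update(self, scene, a, r, next_scene):
        best_next = max(self.q[next_scene])
        self.q[scene][a] += ALPHA * (r + GAMMA * best_next - self.q[scene][a])

def share(agents, w=0.5):
    # Cooperative step: blend each agent's Q-table toward the group average.
    # The weight w is a stand-in for the paper's location-based balance factor.
    for s in SCENES:
        for a in range(len(POWER_LEVELS)):
            avg = sum(ag.q[s][a] for ag in agents) / len(agents)
            for ag in agents:
                ag.q[s][a] = (1 - w) * ag.q[s][a] + w * avg

def train(n_agents=3, episodes=2000, seed=0):
    random.seed(seed)
    agents = [Agent() for _ in range(n_agents)]
    scene = 0
    for t in range(episodes):
        next_scene = random.choice(SCENES)  # scene changes with mobility
        for ag in agents:
            a = ag.act(scene)
            ag.update(scene, a, reward(scene, POWER_LEVELS[a]), next_scene)
        if t % 10 == 0:
            share(agents)
        scene = next_scene
    return agents

agents = train()
# Greedy power choice per scene after training
policy = {s: POWER_LEVELS[agents[0].q[s].index(max(agents[0].q[s]))]
          for s in SCENES}
print(policy)
```

In this toy setup the learned policy transmits at a lower power in the high-interference scene, which mirrors the abstract's claim that conditioning the state on the scene lets agents keep capacity up while cutting power.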