Experience Replay Q(λ)-learning with Leader-Following Control for Multi-Evader Pursuit Evasion Games

2021 
This paper addresses a pursuit evasion game with multiple evaders in which only some of the pursuers, termed leading pursuers, can access the positions of the evaders during the pursuit. Q(λ)-learning is used to train the pursuers. Because Q(λ)-learning converges slowly, a new method that combines experience replay Q(λ)-learning with dynamic target assignment is introduced. Simulations show that the proposed method achieves better convergence results than Q(λ)-learning in our multi-evader pursuit evasion game.
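
To make the combination of Q(λ)-learning and experience replay concrete, the following is a minimal tabular sketch, not the paper's implementation. It assumes Watkins's variant of Q(λ) (eligibility traces are cut after a non-greedy action) and assumes that whole episodes, rather than single transitions, are stored and replayed; the abstract specifies neither. The names `q_lambda_episode` and `replay`, the integer state and action encoding, and the hyperparameter values are all hypothetical. Dynamic target assignment and the leader-following control are not covered.

```python
import random
from collections import defaultdict, deque


def q_lambda_episode(Q, episode, n_actions, alpha=0.1, gamma=0.9, lam=0.8):
    """Apply Watkins's Q(lambda) updates along one stored episode.

    Q: defaultdict(float) mapping (state, action) -> value.
    episode: list of (state, action, reward, next_state, done) tuples.
    """
    traces = defaultdict(float)  # eligibility traces, reset for each episode
    for t, (s, a, r, s2, done) in enumerate(episode):
        # One-step TD error towards the greedy value of the next state.
        best_next = 0.0 if done else max(Q[(s2, b)] for b in range(n_actions))
        delta = r + gamma * best_next - Q[(s, a)]
        traces[(s, a)] += 1.0  # accumulating trace
        for key, e in traces.items():
            Q[key] += alpha * delta * e
        if done or t + 1 == len(episode):
            break
        # Watkins's cut: keep traces only if the next stored action is greedy.
        next_action = episode[t + 1][1]
        greedy_value = max(Q[(s2, b)] for b in range(n_actions))
        if Q[(s2, next_action)] == greedy_value:
            for key in traces:
                traces[key] *= gamma * lam
        else:
            traces.clear()


def replay(Q, buffer, n_actions, n_samples, rng, **kwargs):
    """Replay up to n_samples randomly chosen stored episodes (assumed scheme)."""
    k = min(n_samples, len(buffer))
    for episode in rng.sample(list(buffer), k):
        q_lambda_episode(Q, episode, n_actions, **kwargs)


if __name__ == "__main__":
    Q = defaultdict(float)
    buffer = deque(maxlen=100)  # oldest episodes are dropped first
    # Hypothetical two-step episode: state 0 -> state 1 -> terminal state 2.
    buffer.append([(0, 0, 0.0, 1, False), (1, 0, 1.0, 2, True)])
    replay(Q, buffer, n_actions=2, n_samples=1, rng=random.Random(0))
    print(dict(Q))
```

In this sketch the reward received at the end of the episode reaches the first state-action pair within a single pass through the eligibility trace, and replaying stored episodes repeats that propagation without new interaction with the environment.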