Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding.

2020 
Multi-agent path finding (MAPF) arises naturally in applications such as the pick-up and drop-off of parcels by automated guided vehicles (AGVs) in a warehouse. Existing algorithms, such as conflict-based search (CBS), windowed hierarchical cooperative A* (WHCA*), and other A* variants, are widely used to find shortest paths in different ways. In real-world environments, however, MAPF instances are generated dynamically and must be solved in real time. In this work, a decentralized multi-agent reinforcement learning (MARL) framework with a multi-step ahead tree search (MATS) strategy is proposed to make efficient decisions. Experiments on a 30×30 grid world and a real-world warehouse case show that the proposed MARL policy is capable of: 1) scaling to a large number of agents in a real-world environment while keeping online response time within acceptable levels; 2) outperforming existing algorithms in both path length and solution time as the number of agents increases.
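To make the idea of looking several steps ahead concrete, the following is a minimal sketch of a depth-limited tree search for a single agent on a square grid. It is not the paper's MATS strategy, whose details the abstract does not give. The function name `lookahead`, the `reserved` table of cells occupied by other agents at each time step, the five-action move set, and the Manhattan-distance leaf estimate are all assumptions made for illustration. Only vertex conflicts are checked.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]
# Assumed action set: wait, up, down, left, right.
ACTIONS: List[Cell] = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def manhattan(a: Cell, b: Cell) -> int:
    """Grid distance used as a leaf estimate of the remaining cost."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def lookahead(pos: Cell, goal: Cell, size: int,
              reserved: Dict[int, Set[Cell]], depth: int,
              t: int = 0) -> Tuple[float, Cell]:
    """Return (estimated cost to goal, best first action).

    Hypothetical helper: expands every action up to `depth` steps ahead,
    skipping cells that another agent occupies at the next time step.
    """
    if pos == goal:
        return 0, (0, 0)
    if depth == 0:
        return manhattan(pos, goal), (0, 0)
    best_cost, best_action = float("inf"), (0, 0)
    for action in ACTIONS:
        nxt = (pos[0] + action[0], pos[1] + action[1])
        if not (0 <= nxt[0] < size and 0 <= nxt[1] < size):
            continue  # off the grid
        if nxt in reserved.get(t + 1, set()):
            continue  # vertex conflict with another agent
        cost, _ = lookahead(nxt, goal, size, reserved, depth - 1, t + 1)
        cost += 1  # one time step per action, including waiting
        if cost < best_cost:
            best_cost, best_action = cost, action
    return best_cost, best_action


# Another agent occupies (0, 1) at time step 1, so waiting is the best first move.
blocked = lookahead((0, 0), (0, 2), 3, {1: {(0, 1)}}, depth=3)
free = lookahead((0, 0), (0, 2), 3, {}, depth=3)
```

In a decentralized setting, each agent could run such a search on its own local view. In this sketch the leaf estimate is a plain distance; a learned value estimate could stand in for it, though whether the paper does so is not stated here.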