A composite learning method for multi-ship collision avoidance based on reinforcement learning and inverse control

Shuo Xie, Xiumin Chu, Mao Zheng, Chenguang Liu

2020

Abstract Model-free reinforcement learning methods have potential for ship collision avoidance in unknown environments. To address the low efficiency of model-free reinforcement learning, a composite learning method is proposed based on an asynchronous advantage actor-critic (A3C) algorithm, a long short-term memory (LSTM) neural network, and Q-learning. The proposed method uses Q-learning to decide adaptively between an LSTM inverse model-based controller and the model-free A3C policy. Multi-ship collision avoidance simulations are conducted to verify the effectiveness of the model-free A3C method, the proposed inverse model-based method, and the composite learning method. The simulation results indicate that the proposed composite learning-based ship collision avoidance method outperforms the A3C learning method and a traditional optimization-based method.
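
The abstract describes Q-learning arbitrating between two controllers. The idea can be illustrated with a minimal tabular sketch. It is not the authors' implementation: the class name `ControllerSelector`, the two-state discretization, the toy reward, and all hyperparameters are assumptions made for illustration, and the LSTM inverse-model controller and the A3C policy are reduced to the action indices 0 and 1.

```python
import random


class ControllerSelector:
    """Tabular Q-learning over two meta-actions (hypothetical sketch).

    Action 0 stands in for the inverse model-based controller,
    action 1 for the model-free policy. Both are placeholders here.
    """

    def __init__(self, n_states, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
        self.q = [[0.0, 0.0] for _ in range(n_states)]
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.rng = random.Random(seed)

    def select(self, state):
        """Epsilon-greedy choice of which controller to use."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(2)
        row = self.q[state]
        return 0 if row[0] >= row[1] else 1

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def toy_reward(state, action):
    # Assumed toy rule: controller 0 works best in state 0,
    # controller 1 works best in state 1.
    return 1.0 if action == state else 0.0


def train(selector, steps=20000):
    """Run the selector on the toy problem with random state transitions."""
    state = 0
    for _ in range(steps):
        action = selector.select(state)
        reward = toy_reward(state, action)
        next_state = selector.rng.randrange(2)
        selector.update(state, action, reward, next_state)
        state = next_state
    return selector


if __name__ == "__main__":
    sel = train(ControllerSelector(n_states=2))
    for s, row in enumerate(sel.q):
        print(f"state {s}: Q(inverse)={row[0]:.2f}, Q(model-free)={row[1]:.2f}")
```

In this toy setting the selector should learn to prefer the controller that earns more reward in each state. In the setting the abstract describes, the state would presumably encode the encounter situation and the reward would reflect collision-avoidance performance, neither of which the abstract specifies.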
