Robot Path Planning via Deep Reinforcement Learning with Improved Reward Function

2022 
Robot path planning based on deep reinforcement learning enables a robot arm to perform intelligent trajectory planning and capture tasks in unknown environments. However, because of redundant degrees of freedom, a continuous workspace, and the non-unique mapping between joint space and Cartesian space, deep reinforcement learning algorithms often suffer from unnecessary exploration, slow learning, low accuracy, and poor robustness. To address these problems, this paper proposes a path planning algorithm based on the twin delayed deep deterministic policy gradient (TD3) algorithm, which trains an end-to-end network between the expected pose and the current joint-space variables. In addition, to measure the tip position and attitude of the robot more reasonably, the reward function is improved so that its physical meaning is clear. Training and testing in a simulation environment showed that the method can accomplish automatic path planning and the capture task. Compared with other reward functions, the improved reward function avoids unnecessary exploration and improves convergence speed and robustness.
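For readers unfamiliar with TD3, the following is a minimal sketch of how the standard algorithm forms its critic target: target policy smoothing followed by the clipped double-Q minimum. It is not the paper's implementation. The function name `td3_target`, the callables `actor_target`, `critic1_target`, and `critic2_target`, and all default hyperparameter values are assumptions made for illustration.

```python
import numpy as np


def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_action=1.0, rng=None):
    """Compute the TD3 critic target for a batch of transitions.

    All argument names and defaults are illustrative assumptions;
    the three callables stand in for the target networks.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: perturb the target action with clipped noise.
    next_action = np.asarray(actor_target(next_state), dtype=float)
    noise = rng.normal(0.0, noise_std, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Clipped double-Q: take the smaller of the two target critic estimates
    # to reduce overestimation bias.
    q1 = critic1_target(next_state, next_action)
    q2 = critic2_target(next_state, next_action)
    q_min = np.minimum(q1, q2)

    # One-step bootstrapped target; terminal transitions do not bootstrap.
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    return reward + gamma * (1.0 - done) * q_min
```

In a full training loop, both critics would be regressed toward this target at every step, while the actor and the target networks would be updated only every few critic updates, which is the "delayed" part of TD3.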