Reinforcement Learning with Converging Goal Space and Binary Reward Function

2020 
Usage of a sparse and binary reward function remains one of the most challenging problems in reinforcement learning. In particular, when the environments in which robotic agents learn are sufficiently vast, tasks become much harder to learn because the probability of reaching the goal is minimal. The Hindsight Experience Replay algorithm was proposed to overcome these difficulties; however, problems persist that slow and delay learning when the agent cannot receive proper rewards early in training. In this paper, we present a simple method called Converging Goal Space and Binary Reward Function, which helps agents learn tasks easily and efficiently in large environments while still providing a binary reward. At an early stage of training, a larger goal-space margin makes the reward easier to obtain, enabling faster policy learning. As the number of successes increases, the goal space is gradually reduced to the size used in testing. We apply this reward function to two different task experiments, sliding and throwing, which require exploration over a range wider than the robotic arm's reach, and compare the learning efficiency to that of experiments that employ only a sparse and binary reward function. We show that the proposed reward function performs better in large environments using physics simulation, and we demonstrate that the function is applicable to real-world robotic arms.
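A minimal sketch of how such a converging goal-space reward might be implemented, assuming a distance-based success test and a linear margin schedule; the class name, constants, and decay schedule below are illustrative assumptions, not details taken from the paper:

```python
import numpy as np


class ConvergingGoalReward:
    """Binary reward whose goal-space margin shrinks as successes accumulate.

    Sketch only: the paper does not specify the exact schedule or constants.
    """

    def __init__(self, initial_margin=0.30, final_margin=0.05,
                 successes_to_converge=500):
        self.initial_margin = initial_margin          # wide tolerance early in training
        self.final_margin = final_margin              # tolerance used at test time
        self.successes_to_converge = successes_to_converge
        self.success_count = 0

    def current_margin(self):
        # Linearly interpolate from the wide training margin down to the test margin
        # as the cumulative number of successes grows.
        frac = min(self.success_count / self.successes_to_converge, 1.0)
        return self.initial_margin + frac * (self.final_margin - self.initial_margin)

    def __call__(self, achieved_goal, desired_goal):
        # Binary reward: 0 on success (within the current margin), -1 otherwise.
        distance = np.linalg.norm(np.asarray(achieved_goal) - np.asarray(desired_goal))
        success = distance <= self.current_margin()
        if success:
            self.success_count += 1
        return 0.0 if success else -1.0
```

In practice such a reward callable could replace the fixed-threshold success check of a goal-conditioned setup (e.g. one trained with Hindsight Experience Replay), with evaluation always performed at the final, test-time margin.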