A Reinforcement Learning Method with Implicit Critics from a Bystander

2017 
In Reinforcement Learning, we train agent many times, so agents can get experience from learning, and then, agent can complete every behavior of different missions. In this paper, we propose architecture to allow agent get experience from environment. We use Adaptive Heuristic Critic (AHC) as a learning architecture and combine an action bias with AHC to solve the problem of continuous action system. On account of the problems of recognition error and state delay, we use Reinforcement Learning which learns from cumulative reward to update the experience of agents.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    23
    References
    1
    Citations
    NaN
    KQI
    []