Output-feedback Quadratic Tracking Control of Continuous-time Systems by Using Off-policy Reinforcement Learning with Neural Networks Observer

2020 
In this paper, an improved off-policy reinforcement learning (RL) algorithm with a neural network (NN) observer is proposed to solve the linear quadratic tracking (LQT) problem for continuous-time (CT) systems without any knowledge of the system dynamics. The offline algorithm finds the optimal solution by solving a Lyapunov equation, which requires complete knowledge of the system dynamics. In previous research, an off-policy RL algorithm was used to solve the state-feedback control problem without any knowledge of the system dynamics by reusing the same input and state information. The proposed output-feedback (OPFB) control algorithm solves the Bellman equation, which demands the system state information, by using an adaptive NN state observer to estimate the system state from the input and output information of the CT system. Simulation results demonstrate the efficiency of the proposed approach.

Key Words: Off-policy, Reinforcement Learning (RL), Linear Quadratic Tracking (LQT), Output-feedback (OPFB)
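The abstract does not give the observer equations, so the following is a minimal sketch only, assuming a common adaptive NN observer of the form x̂̇ = A₀x̂ + Ŵᵀσ(x̂, u) + L(y − Cx̂), simulated with forward Euler. The plant matrices, nominal model A₀, observer gain L, adaptation rate, and weight-update law here are all illustrative assumptions, not the paper's actual design.

```python
import numpy as np

# Illustrative second-order CT plant; in the paper's setting these matrices
# are unknown to the observer and controller.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Assumed stable nominal model used inside the observer; the NN term
# compensates the mismatch between the nominal and the true dynamics.
A0 = np.array([[0.0, 1.0], [-1.0, -2.0]])

dt, T = 1e-3, 10.0
n, m, h = 2, 1, 8                    # state, input, and hidden-neuron counts
L = np.array([[8.0], [16.0]])        # assumed output-injection gain
Gamma = 5.0                          # assumed adaptation rate

rng = np.random.default_rng(0)
V = rng.standard_normal((h, n + m))  # fixed random input-layer weights
W = np.zeros((h, n))                 # adaptive output-layer weights

x = np.array([[1.0], [0.0]])         # true plant state
xh = np.zeros((n, 1))                # observer estimate

for k in range(int(T / dt)):
    u = np.array([[np.sin(k * dt)]])     # persistently exciting input
    e = C @ x - C @ xh                   # output estimation error

    z = np.vstack([xh, u])               # NN input: current estimate and control
    sigma = np.tanh(V @ z)               # hidden-layer features

    # Observer: nominal model + NN compensation + output injection.
    xh_dot = A0 @ xh + W.T @ sigma + L @ e
    # Gradient-style weight update driven by the output error; the paper's
    # exact adaptation law may differ.
    W_dot = Gamma * sigma @ (e.T @ C)

    x = x + dt * (A @ x + B @ u)         # forward-Euler integration
    xh = xh + dt * xh_dot
    W = W + dt * W_dot

print("final state-estimation error:", float(np.linalg.norm(x - xh)))
```

With a persistently exciting input, the output-injection term drives the estimate toward the measured output while the NN weights adapt to approximate the unknown dynamics; the resulting state estimate is what an off-policy RL stage of the kind the abstract describes would consume in place of the true, unmeasured state.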