Finite-Time Error Analysis of Asynchronous Q-Learning with Discrete-Time Switching System Models.

Donghwan Lee

2021

This paper develops a novel framework for analyzing the convergence of the Q-learning algorithm from a discrete-time switching system perspective. We prove that asynchronous Q-learning with a constant step-size can be naturally formulated as a discrete-time stochastic switched linear system. This formulation offers novel and intuitive insights into Q-learning, based mainly on control-theoretic frameworks. For instance, the proposed analysis explains the overestimation phenomenon in Q-learning caused by the maximization bias. Based on this control-theoretic argument and some favorable structural properties of Q-learning, a new finite-time analysis of Q-learning is given with a novel error bound.
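To make the setting concrete, the following is a minimal sketch of tabular asynchronous Q-learning with a constant step-size. It is not the paper's code. The function names, the uniform i.i.d. sampling of state-action pairs, the deterministic rewards, and the example MDP are all assumptions made for illustration. Only one entry of the Q-table is updated per step (the asynchronous part), and the `max` over next-state actions picks out a greedy action that can change from step to step, which gives an informal sense of why a switching-system view is plausible.

```python
import random


def async_q_learning(P, R, gamma=0.9, alpha=0.1, steps=20000, seed=0):
    """Tabular asynchronous Q-learning with a constant step-size (sketch).

    P[s][a] is a list of next-state probabilities; R[s][a] is a
    deterministic reward. Both layouts are assumptions of this sketch.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(steps):
        # Assumption: state-action pairs are drawn i.i.d. uniformly.
        s = rng.randrange(n_states)
        a = rng.randrange(n_actions)
        s_next = rng.choices(range(n_states), weights=P[s][a])[0]
        # The max selects the current greedy action at s_next.
        td_target = R[s][a] + gamma * max(Q[s_next])
        # Only the visited entry (s, a) is updated; alpha stays constant.
        Q[s][a] += alpha * (td_target - Q[s][a])
    return Q


def value_iteration(P, R, gamma=0.9, iters=1000):
    """Reference Q* computed by repeated Bellman optimality backups."""
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iters):
        V = [max(row) for row in Q]
        Q = [[R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
              for a in range(n_actions)] for s in range(n_states)]
    return Q


if __name__ == "__main__":
    # Hypothetical 2-state, 2-action MDP used only for illustration.
    P = [[[0.8, 0.2], [0.1, 0.9]],
         [[0.5, 0.5], [0.3, 0.7]]]
    R = [[0.0, 1.0],
         [0.5, 2.0]]
    Q = async_q_learning(P, R)
    Q_star = value_iteration(P, R)
    err = max(abs(Q[s][a] - Q_star[s][a])
              for s in range(2) for a in range(2))
    print("max-norm error after 20000 steps:", err)
```

With stochastic transitions and a constant step-size, the iterate generally keeps fluctuating around the optimal Q-function rather than converging to it exactly, so the printed error is typically small but nonzero. A finite-time error bound of the kind the abstract mentions is meant to quantify this sort of behavior.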

Keywords:

Error analysis
Linear system
Asynchronous communication
Algorithm
Control system
Convergence (routing)
Q-learning
Discrete time and continuous time
Computer science
Maximization
