A Dissipativity Theory for Undiscounted Markov Decision Processes.

Sebastien Gros, Mario Zanon

2021

Dissipativity theory is central to discussing the stability of policies resulting from minimizing economic stage costs. In its current form, dissipativity theory applies to problems based on deterministic dynamics and does not readily extend to Markov Decision Processes (MDPs), where the dynamics are stochastic. In this paper, we clarify the core reason for this difficulty and propose a generalization of dissipativity theory that circumvents it. This generalization is based on nonlinear stage cost functionals, allowing one to discuss the Lyapunov asymptotic stability of policies for MDPs in the set of probability measures. The theory is illustrated in the stochastic Linear Quadratic Regulator case, for which a storage functional can be provided analytically. For the sake of brevity, we limit our discussion to undiscounted MDPs.
