A Dissipativity Theory for Undiscounted Markov Decision Processes.

2021 
Dissipativity theory is central to discussing the stability of policies resulting from minimizing economic stage costs. In its current form, dissipativity theory applies to problems based on deterministic dynamics and does not readily extend to Markov Decision Processes (MDPs), where the dynamics are stochastic. In this paper, we clarify the core reason for this difficulty and propose a generalization of dissipativity theory that circumvents it. This generalization is based on nonlinear stage cost functionals, allowing one to discuss the Lyapunov asymptotic stability of policies for MDPs in the set of probability measures. The theory is illustrated on the stochastic Linear Quadratic Regulator case, for which a storage functional can be provided analytically. For the sake of brevity, we limit our discussion to undiscounted MDPs.