Quantitative controller synthesis for consumption Markov decision processes

2023 
Consumption Markov decision processes (CMDPs) are resource-constrained probabilistic decision-making systems that model the consumption of a resource and its replenishment to full capacity. Each decision made in a CMDP consumes some amount of the resource, which can be reloaded to full capacity in a designated set of states. We consider here the quantitative reachability controller synthesis problems for CMDPs, which ask for a policy that prevents resource exhaustion and achieves a (repeated) reachability goal with probability at least a given threshold, in contrast to the original work focusing on qualitative controller synthesis for CMDPs, e.g., fulfilling the (repeated) reachability goals surely or with probability 1. We first prove that the quantitative reachability probability problems are NP-hard by a reduction from the 0–1 Knapsack problem. We then solve them by converting a given CMDP into a flattened Markov decision process (MDP) and constructing an optimal policy for that MDP. Unfortunately, the naively flattened MDPs are prohibitively large, so the subsequent synthesis does not scale well. To overcome this drawback, we propose a pruned construction and a quotient construction of the flattened MDP that reduce the sizes of the resulting MDPs. The empirical evaluation shows that our optimizations significantly improve the scalability of the purely flattening-based synthesis method for CMDPs on benchmarks from real-world scenarios: autonomous electric vehicle routing and helicopter planning in the Mars Perseverance Rover project.
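
To make the flattening step concrete, below is a minimal Python sketch of the naive construction (not the authors' implementation; the Cmdp class, the flatten function, and the reload-on-entry convention are illustrative assumptions): each flattened state pairs a CMDP state with a remaining resource level, actions whose consumption exceeds the current level are disabled, and entering a reload state resets the level to full capacity.

    from dataclasses import dataclass

    @dataclass
    class Cmdp:
        """Hypothetical CMDP encoding (illustrative, not the paper's data structure)."""
        states: set          # CMDP states
        actions: dict        # state -> {action: [(prob, successor), ...]}
        consumption: dict    # (state, action) -> non-negative resource cost
        reload: set          # reload states: resource resets to full capacity
        capacity: int        # maximal resource level

    def flatten(c: Cmdp) -> dict:
        """Naive flattening: MDP states are (state, level) pairs, 0 <= level <= capacity."""
        mdp = {}  # ((state, level), action) -> [(prob, (succ, succ_level)), ...]
        for s in c.states:
            for level in range(c.capacity + 1):
                for a, dist in c.actions.get(s, {}).items():
                    cost = c.consumption[(s, a)]
                    if cost > level:
                        continue  # taking `a` would exhaust the resource: disable it
                    mdp[((s, level), a)] = [
                        (p, (t, c.capacity if t in c.reload else level - cost))
                        for p, t in dist
                    ]
        return mdp

    # Toy CMDP: two states, capacity 3, s1 is a reload state.
    toy = Cmdp(
        states={"s0", "s1"},
        actions={"s0": {"go": [(1.0, "s1")]},
                 "s1": {"back": [(0.5, "s0"), (0.5, "s1")]}},
        consumption={("s0", "go"): 2, ("s1", "back"): 1},
        reload={"s1"},
        capacity=3,
    )
    flat = flatten(toy)  # enabled actions over the 8 (state, level) pairs

Even in this toy, the flattened state space is the product of states and levels, i.e., O(|S| * (capacity + 1)) states; this blow-up is exactly what the pruned and quotient constructions are designed to mitigate.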