Safe Reinforcement Learning for CPSs via Formal Modeling and Verification

2021 
Reinforcement learning (RL) learns policies that maximize expected reward and has shown success in complex decision-making tasks. However, RL-based controllers provide no safety guarantees for the physical plant in cyber-physical systems (CPSs). In this paper, we propose a framework that brings formal analysis into the learned policy so that RL can be applied to safety-critical control. For satisfaction verification and quantitative analysis, we propose CSML, an uncertainty modeling language for describing system behaviors, and translate CSML designs into networks of probabilistic timed automata (NPTA). For safe learning, we present an algorithm called Safe Control with Formal Methods (SCFM). Before learning begins, SCFM explores the state space to construct a set of states satisfying constraints expressed in probabilistic computation tree logic (PCTL). At run time, a monitor observes the system, determines whether each chosen action is safe, and corrects unsafe decisions. We validate the approach through lane-change control experiments for autonomous cars.
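The abstract does not spell out SCFM itself, but the shielding idea it describes (precompute a safe state set offline, then have a monitor correct unsafe actions at run time) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function names (`build_safe_set`, `shielded_action`), the dictionary-based transition model, and the fixed-horizon fixpoint stand in for the paper's PCTL-based construction, which is not reproduced here.

```python
def build_safe_set(states, transitions, is_violation, horizon=3):
    """Conservatively approximate the states from which a violation can
    be avoided for `horizon` steps (stand-in for the PCTL-constrained
    safe set SCFM builds by exploring the state space before learning).
    `transitions[s][a]` lists the possible successors of state s under
    action a (hypothetical model format)."""
    safe = {s for s in states if not is_violation(s)}
    for _ in range(horizon):
        # Keep a state only if some action keeps all successors safe.
        safe = {
            s for s in safe
            if any(all(t in safe for t in succs)
                   for succs in transitions[s].values())
        }
    return safe

def shielded_action(state, proposed, transitions, safe_set):
    """Monitor: accept the learner's proposed action if every possible
    successor stays inside the safe set; otherwise substitute a safe
    alternative (the 'corrects unsafe decisions' step)."""
    def ok(action):
        return all(t in safe_set for t in transitions[state][action])
    if ok(proposed):
        return proposed
    for action in transitions[state]:
        if ok(action):
            return action  # corrected decision
    return proposed  # no safe action exists; fall back to the learner
```

As a toy usage, a four-state chain where state 3 is a crash: the monitor lets a safe action through unchanged, but replaces an action whose successor leaves the safe set.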