IPO: Interior-Point Policy Optimization under Constraints

Yongshuai Liu, Jiaxin Ding, Xin Liu

2020

In this paper, we study reinforcement learning (RL) algorithms for real-world decision problems in which the goal is to maximize long-term reward while satisfying cumulative constraints. We propose a novel first-order policy optimization method, Interior-Point Policy Optimization (IPO), which is inspired by the interior-point method and augments the objective with logarithmic barrier functions. The method is easy to implement, comes with performance guarantees, and handles general cumulative multi-constraint settings. In extensive evaluations against state-of-the-art baselines, IPO outperforms the baseline algorithms in both reward maximization and constraint satisfaction.
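To make the barrier idea concrete, the following is a minimal sketch of a log-barrier-augmented objective, not the authors' implementation. It assumes each constraint has the form "expected cumulative cost <= limit" and uses the textbook interior-point barrier log(limit - cost) / t. The function names, the scalar `reward` and `costs` inputs, and the sharpness hyperparameter `t` are all illustrative assumptions that the abstract does not specify.

```python
import math


def log_barrier(cost, limit, t):
    """Logarithmic barrier for a constraint of the form cost <= limit.

    Assumed form (standard interior-point barrier): log(limit - cost) / t.
    The value falls toward -inf as cost approaches limit from below, and a
    larger t makes the barrier a sharper approximation of the hard constraint.
    """
    slack = limit - cost
    if slack <= 0:
        # Constraint violated or exactly at the boundary: outside the interior.
        return -math.inf
    return math.log(slack) / t


def augmented_objective(reward, costs, limits, t=20.0):
    """Reward estimate plus one barrier term per cumulative constraint.

    `reward` and `costs` stand in for estimated expected returns; in an RL
    setting they would be computed from sampled trajectories (assumption).
    """
    return reward + sum(log_barrier(c, d, t) for c, d in zip(costs, limits))


if __name__ == "__main__":
    # Same reward, three different cost levels against a limit of 1.0.
    for cost in (0.0, 0.9, 1.5):
        print(cost, augmented_objective(10.0, [cost], [1.0]))
```

Under these assumptions, a policy whose cost is far from its limit is scored almost purely by reward, a policy near the limit is penalized increasingly hard, and a policy that violates any limit receives an objective of negative infinity. Multiple constraints simply contribute additional barrier terms to the sum.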
