Risk Averse Reinforcement Learning for Mixed Multi-agent Environments

D. Sai Koti Reddy, Amrita Saha, Srikanth G. Tamilselvam, Priyanka Agrawal, Pankaj Dayama

2019

Most real-world applications of multi-agent systems need to balance maximizing rewards against minimizing risks. In this work, we consider a popular risk measure, variance of return (VOR), as a constraint in the agent's policy learning algorithm in mixed cooperative and competitive environments. We present a multi-timescale actor-critic method for risk-sensitive Markov games in which the risk is modeled as a VOR constraint. We also show that, on a popular task, the resulting risk-averse policies satisfy the desired risk constraint without compromising much on the overall reward.
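To make the idea of a VOR constraint concrete, the following is a minimal sketch of one common way such a constraint can be handled: a Lagrangian relaxation in which a multiplier penalizes variance above a threshold. It is not taken from the paper. The function names, the threshold `alpha`, the multiplier `lam`, and the step size are all illustrative assumptions, and the sketch works on sampled episode returns rather than on an actor-critic learner.

```python
import statistics


def discounted_return(rewards, gamma=0.99):
    """Discounted sum of one episode's rewards (gamma is an assumed discount)."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def lagrangian_objective(returns, lam, alpha):
    """Mean return minus a penalty on variance above the threshold alpha.

    Hypothetical helper: `lam` is a Lagrange multiplier and `alpha` is the
    allowed variance of return. Returns (objective, empirical variance).
    """
    mean = statistics.fmean(returns)
    var = statistics.pvariance(returns)  # population variance of sampled returns
    return mean - lam * (var - alpha), var


def update_multiplier(lam, var, alpha, step=0.01):
    """Projected ascent on the multiplier: it grows while the constraint is violated."""
    return max(0.0, lam + step * (var - alpha))
```

In a multi-timescale scheme, a multiplier like `lam` would typically be updated with a smaller step size than the policy and value parameters, so that the policy sees it as nearly fixed. The step sizes used here are placeholders, not values from the paper.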
