A Multi-Layer Architecture for Cooperative Multi-Agent Systems

2019 
In multi-agent cooperative systems with value-based reinforcement learning, agents learn to complete a task through an optimal policy obtained by iterating value and policy improvement. A key issue, however, is how to design a policy that avoids social dilemmas and reaches a common consensus among agents. This article proposes a method that increases the success rate of cooperation by assessing each agent's cooperative tendency. The method learns the rules of cooperation by recording cooperation probabilities for the agents in a Layered Cooperation Model (LCM). These probabilities then serve as the basis on which agents make game-theoretic decisions that benefit all of them. The method is evaluated on two cooperative tasks. The results show that the proposed algorithm, which addresses the instability and ambiguity of the win-or-learn-fast policy hill-climbing method (WoLF-PHC) and requires significantly less memory than the Nash Bargaining Solution (NBS), is more stable and more efficient than the other methods.
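The paper itself does not publish code; the following minimal sketch only illustrates the general idea of recording a cooperation probability per peer agent and using it to bias a game-theoretic choice between cooperating and acting alone. The class name, update rule, and parameters (CooperationTendency, learning_rate, coop_value, solo_value) are assumptions for illustration, not the authors' LCM algorithm.

```python
# Illustrative sketch only: the names and the simple exponential update rule
# below are assumptions, not the paper's LCM implementation.
from collections import defaultdict
import random


class CooperationTendency:
    """Tracks an estimated probability that each peer agent cooperates."""

    def __init__(self, learning_rate=0.1, initial_prob=0.5):
        self.lr = learning_rate
        self.prob = defaultdict(lambda: initial_prob)  # peer id -> P(cooperate)

    def update(self, peer_id, cooperated):
        """Move the estimate toward 1 if the peer cooperated, toward 0 otherwise."""
        target = 1.0 if cooperated else 0.0
        self.prob[peer_id] += self.lr * (target - self.prob[peer_id])

    def choose_action(self, peer_id, coop_value, solo_value):
        """Cooperate when the expected payoff of the joint action, weighted by the
        peer's estimated cooperation probability, beats acting alone."""
        expected_coop = self.prob[peer_id] * coop_value
        return "cooperate" if expected_coop >= solo_value else "act_alone"


if __name__ == "__main__":
    tendency = CooperationTendency()
    # Simulate a peer that cooperates roughly 80% of the time.
    for _ in range(50):
        tendency.update(peer_id=1, cooperated=random.random() < 0.8)
    print(tendency.prob[1])                                   # learned estimate
    print(tendency.choose_action(1, coop_value=10.0, solo_value=6.0))
```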