Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems
2021
We introduce a curriculum learning algorithm, Variational Automatic
Curriculum Learning (VACL), for solving challenging goal-conditioned
cooperative multi-agent reinforcement learning problems. We motivate our
paradigm through a variational perspective, where the learning objective can be
decomposed into two terms: task learning on the current task distribution, and
curriculum update to a new task distribution. Local optimization over the
second term suggests that the curriculum should gradually expand the training
tasks from easy to hard. Our VACL algorithm implements this variational
paradigm with two practical components, task expansion and entity progression,
which produces training curricula over both the task configurations as well as
the number of entities in the task. Experiment results show that VACL solves a
collection of sparse-reward problems with a large number of agents.
Particularly, using a single desktop machine, VACL achieves 98% coverage rate
with 100 agents in the simple-spread benchmark and reproduces the ramp-use
behavior originally shown in OpenAI's hide-and-seek project. Our project
website is at https://sites.google.com/view/vacl-neurips-2021.
Keywords:
- Correction
- Source
- Cite
- Save
- Machine Reading By IdeaReader
22
References
0
Citations
NaN
KQI