ExTra: Transfer-guided Exploration.

2019 
The sample efficiency and convergence time of a Reinforcement Learning (RL) algorithm depend heavily on the exploration method used by the agent. In this work, we formulate an exploration method that uses an agent's prior experience on similar tasks in other environments to improve the efficiency of exploration in the current task-environment. We show that, given an optimal policy in a related task-environment, its bisimulation distance from the current task-environment gives a lower bound on the optimal advantage of state-action pairs in the current task-environment.
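
The claimed bound can be read schematically as follows (a sketch under assumed notation, not the paper's exact theorem). Suppose d is a bisimulation-style metric on state-action pairs across the source and current task-environments satisfying the usual Lipschitz property |Q^*_{cur}(s,a) - Q^*_{src}(\bar{s},\bar{a})| \le d((s,a),(\bar{s},\bar{a})), and write \bar{a}^* = \pi^*_{src}(\bar{s}) for the source-optimal action at a matched source state \bar{s}; the metric d, the matching of s to \bar{s}, and the subscripts are assumptions made here for illustration. Lower-bounding Q^*_{cur}(s,a) and upper-bounding V^*_{cur}(s) = Q^*_{cur}(s,\pi^*_{cur}(s)) through the metric then yields a lower bound on the optimal advantage of the form

    A^*_{cur}(s,a) \;\ge\; -\, d\big((s,a),(\bar{s},\bar{a}^*)\big) \;-\; d\big((s,\pi^*_{cur}(s)),(\bar{s},\bar{a}^*)\big),

which illustrates how the bisimulation distance to a source-optimal pair can bound the advantage in the current task-environment from below.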