TAAC: Temporally Abstract Actor-Critic for Continuous Control

Haonan Yu, Wei Xu, Haichao Zhang

2021

We propose temporally abstract actor-critic (TAAC), an off-policy RL algorithm that incorporates closed-loop temporal abstraction into the actor-critic framework in a simple manner. TAAC adds a second-stage binary policy that chooses between repeating the previous action and taking a new action output by the actor. Crucially, this act-or-repeat decision hinges on the action actually sampled, not on the expected behavior of the actor. This post-acting switching scheme lets the overall policy make more informed decisions. TAAC has two important features: persistent exploration and a new compare-through Q operator for multi-step TD backup. We demonstrate TAAC's advantages over several strong baselines on 14 continuous control tasks spanning 5 categories. Code is available at this https URL.
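The act-or-repeat step described above can be illustrated with a minimal sketch. The abstract does not say how the binary policy is parameterized, so the softmax over two critic values used here is an assumption made purely for illustration, as are the names `act_or_repeat`, `toy_actor`, `toy_critic`, and the `temperature` parameter. The one point taken from the text is that the candidate action is sampled first and the switch decision is made afterwards, with that concrete sample in hand.

```python
import numpy as np


def act_or_repeat(state, prev_action, actor, critic, rng, temperature=1.0):
    """Two-stage step: sample a candidate action, then decide whether to use it.

    ASSUMPTION: the binary switch is a softmax over the critic's values for
    (state, prev_action) and (state, candidate). The source abstract only says
    that the decision depends on the actually sampled action.
    """
    # Stage 1: the actor proposes a concrete candidate action.
    candidate = actor(state, rng)

    # Stage 2: compare repeating the previous action against the candidate.
    q = np.array(
        [critic(state, prev_action), critic(state, candidate)], dtype=float
    )
    logits = q / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()

    # probs[0] is the chance of repeating, probs[1] the chance of switching.
    switch = bool(rng.random() < probs[1])
    action = candidate if switch else prev_action
    return action, switch, probs


# Hypothetical stand-ins for a learned actor and critic.
def toy_actor(state, rng):
    """Propose an action near the state, with Gaussian noise."""
    return state + rng.normal(scale=0.1, size=state.shape)


def toy_critic(state, action):
    """Score an action higher the closer it is to a fixed target of 1.0."""
    return -float(np.sum((action - 1.0) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state = np.zeros(1)
    action = np.zeros(1)
    for t in range(5):
        action, switched, probs = act_or_repeat(
            state, action, toy_actor, toy_critic, rng
        )
        print(t, action, switched, probs)
```

Because the previous action stays in play until a sampled candidate wins the comparison, an action can persist over several steps, which is one plausible reading of the "persistent exploration" mentioned above. The compare-through Q operator is not sketched here, since the abstract gives no details of its definition.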

Keywords:

action
Artificial intelligence
Binary number
Abstraction (linguistics)
Code (cryptography)
control
Backup
Computer science
Scheme (programming language)
Operator (computer programming)
