Recurrent Neural Architecture Search based on Randomness-Enhanced Tabu Algorithm

2020 
Deep neural networks have achieved highly competitive performance on a variety of tasks in recent years. However, discovering state-of-the-art neural network architectures requires substantial effort from human experts. To speed up this process, neural architecture search (NAS) has been proposed to find promising architectures automatically. Nevertheless, the search process of NAS is computationally expensive and time-consuming, sometimes costing thousands of GPU days. In this paper, to address this bottleneck, we apply a randomness-enhanced tabu algorithm as the controller that samples candidate architectures, balancing global exploration and local exploitation of the architectural search space. In addition, a more aggressive weight-sharing strategy is introduced into our method, which significantly reduces the overhead of evaluating sampled architectures. Our approach discovers a recurrent neural architecture within 0.78 GPU hours, which is 15.3x more efficient than ENAS [1] in terms of search time, and the discovered architecture achieves a test perplexity of 56.1 on the Penn Treebank (PTB) dataset, 2.2 lower than that of ENAS. We further demonstrate the usefulness of the learned architecture by transferring it to the WikiText-2 (WT2) dataset, where extended experiments also show promising results.
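For intuition, below is a minimal sketch of a randomness-enhanced tabu search over architecture encodings: candidates are usually local mutations of the current architecture, a fresh random architecture is sampled with some probability to aid global exploration, and a fixed-length tabu list blocks recently visited solutions. All names here (`mutate`, `random_arch`, `evaluate`, `p_random`, the toy cell encoding) are illustrative assumptions, not the paper's implementation; in particular, `evaluate` stands in for the validation reward that would come from the shared-weight model.

```python
import random

def tabu_search(init_arch, mutate, random_arch, evaluate,
                n_iters=100, tabu_size=20, p_random=0.2):
    """Randomness-enhanced tabu search (illustrative sketch).

    With probability p_random a random architecture is sampled
    (global exploration); otherwise the current architecture is
    locally mutated (local exploitation). Recently visited
    architectures are kept in a tabu list and skipped.
    """
    current = init_arch
    best, best_score = current, evaluate(current)
    tabu = [current]
    for _ in range(n_iters):
        if random.random() < p_random:
            candidate = random_arch()      # global exploration
        else:
            candidate = mutate(current)    # local exploitation
        if candidate in tabu:
            continue                       # reject tabu-listed moves
        score = evaluate(candidate)
        tabu.append(candidate)
        if len(tabu) > tabu_size:
            tabu.pop(0)                    # forget the oldest entry
        current = candidate
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy encoding: a recurrent cell as a tuple of per-node activations.
OPS = ("tanh", "relu", "sigmoid", "identity")

def random_arch(n_nodes=8):
    return tuple(random.choice(OPS) for _ in range(n_nodes))

def mutate(arch):
    i = random.randrange(len(arch))
    return arch[:i] + (random.choice(OPS),) + arch[i + 1:]

def evaluate(arch):
    # Stand-in for the reward a shared-weight supernet would give.
    return sum(op == "tanh" for op in arch) + random.random()

best, score = tabu_search(random_arch(), mutate, random_arch, evaluate)
```

Because evaluation in the sketch is cheap, the tabu list and the random-restart probability do most of the work; in the actual method, weight sharing is what makes each `evaluate` call inexpensive enough to run this loop within a fraction of a GPU hour.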