Adaptive Learning Rate Adjustment with Short-Term Pre-Training in Data-Parallel Deep Learning

2018 
This paper introduces a method to adaptively choose a learning rate (LR) using short-term pre-training (STPT), which is useful for quick model prototyping in data-parallel deep learning. For an unknown model, numerous hyperparameters must be tuned. The proposed method reduces computational time and improves the efficiency of finding an appropriate LR: multiple candidate LRs are evaluated by STPT in a data-parallel setting, where STPT means training only on the initial iterations of an epoch. When eight LRs are evaluated on eight parallel workers, the proposed method reduces computational time by 87.5% compared with the conventional method. Accuracy also improves by 4.8% over the conventional method with a reference LR of 0.1, so no deterioration in accuracy is observed. For an unknown model, the method shows a better training-curve trend than fixed-LR baselines.
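The selection procedure described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy quadratic loss, the function names, and the sequential loop are all assumptions made for clarity (in the paper's setting the candidate LRs would run concurrently on parallel workers on the real model).

```python
import numpy as np

def short_term_pretrain(lr, steps=20, seed=0):
    """Run a few SGD steps on a toy quadratic loss f(w) = ||w||^2
    and return the final loss. This stands in for training on only
    the initial iterations of an epoch of the real model."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=4)
    for _ in range(steps):
        grad = 2 * w        # gradient of ||w||^2
        w -= lr * grad      # plain SGD update
    return float(np.sum(w ** 2))

def select_lr(candidates, steps=20):
    """Evaluate each candidate LR by short-term pre-training and
    return the LR with the lowest loss, plus all measured losses.
    Evaluated sequentially here for simplicity; with k parallel
    workers the k candidates would run simultaneously."""
    losses = {lr: short_term_pretrain(lr, steps) for lr in candidates}
    best = min(losses, key=losses.get)
    return best, losses

if __name__ == "__main__":
    # Eight candidate LRs, mirroring the eight-worker example.
    lrs = [0.001, 0.01, 0.05, 0.1, 0.3, 0.5, 0.9, 1.1]
    best, losses = select_lr(lrs)
    print("selected LR:", best)
```

On this toy loss the SGD update contracts `w` by a factor `(1 - 2*lr)` per step, so candidates near 0.5 win while candidates above 1.0 diverge; the same ranking-by-short-run-loss idea is what STPT applies to a real model.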