Error-Compensated Sparsification for Communication-Efficient Decentralized Training in Edge Environment

2022 
Communication is a major bottleneck in large-scale decentralized training systems, since participating nodes iteratively exchange large amounts of intermediate data with their neighbors. Although compression techniques such as sparsification can significantly reduce the communication overhead in each iteration, the errors caused by compression accumulate, severely degrading the convergence rate. Recently, error compensation for sparsification has been proposed in centralized training to tolerate the accumulated compression errors. However, the analogous technique for decentralized training, and the corresponding convergence theory, remain unknown. To fill this gap, we design ECSD-SGD, a method that significantly accelerates decentralized training via error-compensated sparsification. The novelty lies in identifying the key component of the information exchanged in each iteration (i.e., the sparsified model update) and applying targeted error compensation to that component. Our thorough theoretical analysis shows that ECSD-SGD supports arbitrary sparsification ratios and achieves the same convergence rate as non-sparsified decentralized training methods. We also conduct extensive experiments on multiple deep learning models to validate our theoretical findings. Results show that ECSD-SGD outperforms state-of-the-art sparsified methods in both convergence speed and final generalization accuracy.
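To make the core idea concrete, below is a minimal sketch of error-compensated sparsification for one node's outgoing model update, assuming top-k sparsification and a locally maintained error buffer. The helper names (`topk_sparsify`, `compensated_update`) are illustrative only and do not reflect the authors' actual implementation of ECSD-SGD.

```python
# Sketch: error-compensated sparsification of a single node's model update.
# Assumption: top-k magnitude sparsification; each node keeps an error buffer
# holding the part of the update dropped by compression in earlier iterations.
import numpy as np

def topk_sparsify(vec, k):
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    sparse = np.zeros_like(vec)
    if k > 0:
        idx = np.argpartition(np.abs(vec), -k)[-k:]
        sparse[idx] = vec[idx]
    return sparse

def compensated_update(model_update, error_buffer, k):
    """Fold the accumulated compression error into the current update,
    sparsify the result, and return (message to neighbors, new error)."""
    corrected = model_update + error_buffer   # compensate for past compression errors
    message = topk_sparsify(corrected, k)     # sparse update actually communicated
    new_error = corrected - message           # residual kept locally for the next round
    return message, new_error

# Toy usage: one iteration at a single node.
update = np.random.randn(10)
error = np.zeros(10)
msg, error = compensated_update(update, error, k=3)
```

The design intuition is that the dropped residual is not discarded but re-injected into later updates, so information lost to sparsification is eventually transmitted rather than accumulating as error.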