Gradient Statistics Aware Power Control for Over-the-Air Federated Learning

2021 
Federated learning (FL) is a promising technique that enables many edge devices to collaboratively train a machine learning model over wireless networks. By exploiting the superposition nature of wireless waveforms, over-the-air computation (AirComp) can accelerate model aggregation and hence facilitate communication-efficient FL. Due to channel fading, power control is crucial in AirComp. Prior works assume that the signal to be aggregated from each device, i.e., the local gradient, can be normalized to zero mean and unit variance. In FL, however, gradient statistics vary over both training iterations and feature dimensions, and are unknown in advance. This paper studies the power control problem for over-the-air FL by taking gradient statistics into account. The goal is to minimize the aggregation error by jointly optimizing the transmit power at each device and the denoising factor at the edge server. We obtain the optimal policy in closed form when gradient statistics are given. Notably, we show that the optimal transmit power at each device is continuous and monotonically decreasing in the squared multivariate coefficient of variation (SMCV) of the gradient vectors. We also propose a method for estimating gradient statistics with negligible communication cost. Experimental results demonstrate that the proposed scheme achieves high learning performance.
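The abstract's key quantity is the SMCV of a gradient vector. As a minimal sketch (not the paper's exact formulation), one common definition of the SMCV of a random vector is the total per-dimension variance divided by the squared Euclidean norm of the mean; the function name `smcv` and the sample-based estimator below are illustrative assumptions:

```python
import numpy as np

def smcv(grad_samples: np.ndarray) -> float:
    """Sample-based squared multivariate coefficient of variation.

    grad_samples: (n_samples, d) array, each row one observed gradient vector.
    Returns sum of per-dimension variances divided by the squared norm of the
    per-dimension mean (assumed definition; requires a nonzero mean vector).
    """
    mean = grad_samples.mean(axis=0)        # per-dimension mean, shape (d,)
    var = grad_samples.var(axis=0)          # per-dimension variance, shape (d,)
    return float(var.sum() / (mean ** 2).sum())

# Intuition: gradients tightly concentrated around their mean give SMCV near 0,
# so (per the paper's result) such devices would be allocated higher transmit power.
print(smcv(np.array([[1.0, 1.0], [1.0, 1.0]])))  # identical samples -> 0.0
print(smcv(np.array([[2.0, 0.0], [0.0, 2.0]])))  # noisier samples -> larger SMCV
```

Under this definition, the SMCV is scale-invariant, which matches its role in the abstract as a statistic of gradient dispersion rather than of gradient magnitude.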