Training Deep Neural Networks in 8-bit Fixed Point with Dynamic Shared Exponent Management

2021 
The increase in complexity and depth of deep neural networks (DNNs) has created a strong need to improve computing performance. Quantization methods for training DNNs can effectively improve the computation throughput and energy efficiency of hardware platforms. We have developed an 8-bit quantization training method that represents the weight, activation, and gradient tensors in an 8-bit fixed-point data format. The shared exponent for each tensor is managed dynamically on the basis of the distribution of the tensor elements observed in the previous training phase rather than the current one, which avoids an extra pass over the data and improves computation throughput. This method provides up to 3.7-times the computation throughput of FP32 computation without accuracy degradation.
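The abstract does not specify the exact exponent-update rule, so the following is only a minimal sketch of the idea: an 8-bit fixed-point tensor with a shared exponent, where the exponent used at the current step is derived from statistics (here, the maximum absolute value) of the tensor at the previous step. The function names (shared_exponent_from_stats, quantize_int8) and the max-based rule are illustrative assumptions, not the authors' implementation.

import numpy as np

def shared_exponent_from_stats(prev_max_abs, mantissa_bits=7):
    # Assumed rule: pick the smallest shared exponent e such that the
    # previously observed maximum magnitude fits in the signed 8-bit range,
    # i.e. prev_max_abs <= (2**mantissa_bits - 1) * 2**e.
    if prev_max_abs == 0.0:
        return 0
    return int(np.ceil(np.log2(prev_max_abs / (2**mantissa_bits - 1))))

def quantize_int8(x, exponent):
    # Quantize tensor x to 8-bit fixed point with one exponent shared
    # by all elements of the tensor.
    scale = 2.0 ** exponent
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Toy loop: the exponent applied at step t comes from statistics gathered
# at step t-1, so the current tensor can be quantized without first
# scanning it (this is what removes the extra pass mentioned above).
rng = np.random.default_rng(0)
prev_max_abs = 1.0  # initial guess before any statistics exist
for step in range(3):
    w = rng.normal(scale=0.05 * (step + 1), size=(4, 4)).astype(np.float32)
    e = shared_exponent_from_stats(prev_max_abs)
    q, scale = quantize_int8(w, e)
    w_hat = dequantize(q, scale)
    prev_max_abs = float(np.max(np.abs(w)))  # statistics for the next step
    print(f"step {step}: exponent={e}, max quant error={np.max(np.abs(w - w_hat)):.5f}")

In this sketch the shared exponent lags the data by one step; if the tensor's range grows suddenly, some elements saturate at the clip boundary until the exponent catches up at the next step, which is the trade-off implied by using previous-phase statistics.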