Zac: Towards Automatic Optimization and Deployment of Quantized Deep Neural Networks on Embedded Devices

2019 
As deep learning moves toward commercial and civil use, the need to deploy deep neural network (DNN) models on resource-constrained embedded devices is growing. Quantization has a paramount impact on performance, storage, and energy efficiency. However, to fully realize these benefits, programmers must manually employ low-precision operations while maintaining accuracy, which is very challenging. Hence, we present Zac, a framework that automatically optimizes and deploys quantized DNN models on embedded devices. To do so, Zac performs the necessary data type conversions and chooses proper data types for the intermediate data. It then automatically customizes the operations according to the chosen types. Experiments demonstrate that, by utilizing quantized models, Zac offers up to 19.18X and 25.44X improvements in throughput and energy efficiency, respectively, compared with full-precision designs. The hardware designs automatically generated by Zac achieve performance comparable to highly optimized, manually designed state-of-the-art accelerators.
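
To make the idea of data type conversion and intermediate data types concrete, the following is a minimal Python sketch of generic symmetric linear quantization. It is not Zac's implementation, which the abstract does not detail. The function names (`quantize`, `accumulator_bits`, `quantized_dot`) and the symmetric per-tensor scheme are assumptions chosen purely for illustration.

```python
from typing import List, Tuple


def quantize(values: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Symmetric linear quantization (illustrative, not Zac's scheme).

    Maps floats to signed integers in [-qmax, qmax] and returns the
    integers together with the scale needed to recover real values.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def accumulator_bits(in_bits: int, n_terms: int) -> int:
    """Smallest signed width that can hold a sum of n_terms products
    of two in_bits-wide quantized values (worst case)."""
    qmax = 2 ** (in_bits - 1) - 1
    worst = n_terms * qmax * qmax
    return worst.bit_length() + 1  # one extra bit for the sign


def quantized_dot(x: List[float], w: List[float], bits: int = 8) -> float:
    """Dot product computed with integer multiply-accumulate only,
    then converted back to a real value using the two scales."""
    qx, sx = quantize(x, bits)
    qw, sw = quantize(w, bits)
    acc = sum(a * b for a, b in zip(qx, qw))  # wide integer intermediate
    return acc * sx * sw


if __name__ == "__main__":
    x = [0.5, -1.0, 0.25]
    w = [1.0, 0.5, -0.75]
    print("float dot:    ", sum(a * b for a, b in zip(x, w)))
    print("quantized dot:", quantized_dot(x, w))
    print("accumulator bits for 4 terms of 8-bit inputs:", accumulator_bits(8, 4))
```

The sketch shows why the intermediate type matters: the inputs fit in 8 bits, but the accumulated sum of products needs a wider integer, and choosing that width too small would overflow while choosing it too large wastes hardware resources.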