A 44.1TOPS/W Precision-Scalable Accelerator for Quantized Neural Networks in 28nm CMOS

2020 
Supporting variable precision for computing quantized neural networks in a hardware accelerator is an efficient way to reduce overall computation time and energy. However, in previous precision-scalable hardware, the bit-reconfiguration logic increases chip area significantly. In this paper, we demonstrate a compact precision-scalable accelerator chip that uses bitwise summation and channel-wise aligning schemes. Measurement results show that peak performance per compute area improves by 5.1-7.7x and system-level energy efficiency improves by up to 64% compared to previous precision-scalable accelerators.
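The schemes above are implemented in silicon, but the arithmetic idea behind bitwise summation can be illustrated in software. The sketch below is a hypothetical bit-serial dot product, not the paper's design: an unsigned quantized weight is split into bit-planes, each bit-plane contributes a cheap 1-bit partial sum, and the partial sums are shifted and accumulated. Scaling `w_bits` down skips bit-planes, which is the sense in which precision scaling trades accuracy for fewer compute steps; the function name and operand values are illustrative assumptions.

```python
# Illustrative sketch (assumed, not the paper's hardware): a bit-serial
# dot product in the spirit of precision-scalable accelerators. Each
# weight bit-plane yields a 1-bit partial sum that is shifted into the
# accumulator ("bitwise summation"); fewer bits means fewer passes.

def bit_serial_dot(acts, weights, w_bits):
    """Dot product of unsigned integer activations and weights,
    processing one weight bit-plane per pass."""
    acc = 0
    for b in range(w_bits):                      # one pass per weight bit
        # Sum activations whose weight has bit b set (1-bit partial sum).
        plane = sum(a for a, w in zip(acts, weights) if (w >> b) & 1)
        acc += plane << b                        # align by bit position
    return acc

acts = [3, 1, 4, 1]
weights = [5, 9, 2, 6]  # all fit in 4 bits
assert bit_serial_dot(acts, weights, 4) == sum(a * w for a, w in zip(acts, weights))
```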