Block-Circulant Neural Network Accelerator Featuring Fine-Grained Frequency-Domain Quantization and Reconfigurable FFT Modules.
2021
Block-circulant-based compression is a popular technique for accelerating neural network inference. Although storage and computation costs can be reduced by transforming weights into block-circulant matrices, this method incurs an uneven data distribution in the frequency domain and an imbalanced workload. In this paper, we propose RAB, a Reconfigurable Architecture Block-Circulant Neural Network Accelerator, which solves these problems via two techniques. First, a fine-grained frequency-domain quantization is proposed to accelerate MAC operations. Second, a reconfigurable architecture is designed to transform FFT/IFFT modules into MAC modules, which alleviates the imbalanced workload and further improves efficiency. Experimental results show that RAB achieves a 1.9x/1.8x area/energy efficiency improvement over the state-of-the-art block-circulant compression-based accelerator.
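The core idea behind block-circulant compression is that a circulant matrix is fully defined by a single vector, and its matrix-vector product reduces to a circular convolution computable with FFTs in O(k log k) instead of O(k^2). The sketch below is a minimal NumPy illustration of this principle, not the RAB architecture itself; the function names and the (block-rows, block-cols, k) layout of `W_blocks` are assumptions made for the example.

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first COLUMN is c by vector x.

    Uses the identity C(c) @ x == IFFT(FFT(c) * FFT(x)), i.e. the product
    is a circular convolution of c and x. Inputs are real 1-D arrays of
    equal length; the imaginary residue from the IFFT is discarded.
    """
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

def block_circulant_matvec(W_blocks, x):
    """Multiply a block-circulant weight matrix by x.

    W_blocks has shape (p, q, k): W_blocks[i, j] is the defining vector
    of the k x k circulant block at block-row i, block-column j, so the
    full (dense) matrix would be (p*k, q*k) but only p*q*k values are
    stored -- the source of the compression.
    """
    p, q, k = W_blocks.shape
    x = x.reshape(q, k)
    y = np.zeros((p, k))
    for i in range(p):
        for j in range(q):
            # Each k x k block contributes one FFT-based convolution.
            y[i] += circulant_matvec(W_blocks[i, j], x[j])
    return y.reshape(p * k)
```

In a hardware accelerator the FFT of each weight block is precomputed offline, so only the input FFTs, element-wise complex multiplications (the MAC operations quantized by RAB), and the output IFFT remain on the datapath.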