14.3 A 65nm Computing-in-Memory-Based CNN Processor with 2.9-to-35.8TOPS/W System Energy Efficiency Using Dynamic-Sparsity Performance-Scaling Architecture and Energy-Efficient Inter/Intra-Macro Data Reuse

2020 
Computing-in-Memory (CIM) is a promising approach to energy-efficient neural-network (NN) processors. Previous CIM chips [1], [4] focus mainly on the memory macro itself, paying little attention to overall system integration. Recently, a CIM-based system processor [5] for speech recognition demonstrated promising energy efficiency, but no prior work systematically explores sparsity optimization for a CIM processor. Directly mapping a sparse NN model onto regular CIM macros is ineffective: the zeros in a sparse model are typically randomly distributed, and a CIM macro cannot be power-gated even when it holds many zeros. Achieving both a high compression rate and high efficiency therefore requires exploring the granularity of sparsity [6] based on CIM characteristics. Moreover, system-level strategies for mapping weights onto CIM macros and for reusing data remain underexplored, although both are critical to CIM macro utilization and energy efficiency.
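The granularity argument above can be illustrated with a minimal sketch (not the paper's actual architecture; matrix size, tile size, and sparsity level are assumed for illustration): at the same overall sparsity, randomly scattered zeros leave essentially no CIM macro tile entirely zero, whereas block-structured sparsity aligned to macro boundaries leaves whole tiles zero, so those macros could be power-gated.

```python
import random

def gatable_macros(weights, rows, cols, tile):
    """Count tile x tile macro tiles whose weights are all zero
    (i.e., tiles a CIM system could in principle power-gate)."""
    count = 0
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            if all(weights[i][j] == 0
                   for i in range(r, r + tile)
                   for j in range(c, c + tile)):
                count += 1
    return count

random.seed(0)
ROWS = COLS = 64   # weight-matrix dimensions (assumed)
TILE = 16          # macro tile size (assumed)
SPARSITY = 0.75    # fraction of zero weights

# Fine-grained (random) sparsity: zeros scattered anywhere in the matrix.
rand_w = [[0 if random.random() < SPARSITY else 1 for _ in range(COLS)]
          for _ in range(ROWS)]

# Coarse-grained (block) sparsity: the same fraction of weights zeroed,
# but in whole macro-aligned tiles.
block_w = [[1] * COLS for _ in range(ROWS)]
tiles = [(r, c) for r in range(0, ROWS, TILE) for c in range(0, COLS, TILE)]
for r, c in random.sample(tiles, int(len(tiles) * SPARSITY)):
    for i in range(r, r + TILE):
        for j in range(c, c + TILE):
            block_w[i][j] = 0

print("gatable macros, random sparsity:", gatable_macros(rand_w, ROWS, COLS, TILE))
print("gatable macros, block sparsity: ", gatable_macros(block_w, ROWS, COLS, TILE))
```

With these numbers, random sparsity yields zero fully-empty tiles (the probability of a 16x16 tile being all zero at 75% element sparsity is negligible), while block sparsity yields 12 of 16 tiles gatable — the same compression rate, but only the coarse-grained form maps onto macro-level power gating.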