A Compute-in-Memory Architecture Compatible with 3D NAND Flash that Parallelly Activates Multi-Layers

Liang Zhao, Chu Yan, Fan Yang, Shifan Gao, Gabriel Rosca, Dan Manea, Zhichao Lu, Yi Zhao

2021

Compute-in-memory (CIM) architectures based on emerging non-volatile memories have demonstrated great potential for accelerating neural network computation in AI applications. However, the reliability challenges associated with multi-level cells and the lack of a mature 3D-integration scheme have limited the model size and energy efficiency of these architectures. In this work, we propose a novel NAND-based architecture that efficiently accelerates vector-matrix multiplication for deep neural networks. The proposed approach is fully compatible with 3D NAND and allows multiple layers of wordline (WL) planes to be activated in parallel, in contrast to previous layer-by-layer activation. Linear-$V_T$ correction and positive-negative weight techniques enable multilevel weight storage and improve computing precision. The feasibility and accuracy of the proposed architecture have been verified using TCAD, SPICE, and system-level simulations based on commercial 3D NAND parameters. Major advantages of the approach include a $16\times$ to $32\times$ increase in array utilization and a $64\times$ to $128\times$ reduction in read power consumption.
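The positive-negative weight idea mentioned above can be illustrated with a minimal numerical sketch. It assumes a common differential encoding in which each signed weight is stored as two non-negative cell values and the two column sums are subtracted; the abstract does not describe the actual circuit, so the function names (`split_weights`, `column_sums`, `signed_vmm`) and the mapping of rows to activated wordlines are hypothetical and purely illustrative.

```python
def split_weights(weights):
    """Split a signed matrix into two non-negative matrices (assumed encoding)."""
    pos = [[max(w, 0) for w in row] for row in weights]
    neg = [[max(-w, 0) for w in row] for row in weights]
    return pos, neg


def column_sums(inputs, cells):
    """Sum input * cell value down each column, loosely like current summing on a bitline."""
    n_cols = len(cells[0])
    return [sum(x * row[c] for x, row in zip(inputs, cells)) for c in range(n_cols)]


def signed_vmm(inputs, weights):
    """Vector-matrix product computed as the difference of two non-negative sums."""
    pos, neg = split_weights(weights)
    i_pos = column_sums(inputs, pos)
    i_neg = column_sums(inputs, neg)
    return [p - n for p, n in zip(i_pos, i_neg)]


# Each row stands for one activated input line (hypothetical mapping);
# all rows contribute to the column sums at once.
weights = [[2, -1], [-3, 4], [0, 1]]
inputs = [1, 0, 1]
result = signed_vmm(inputs, weights)
```

Because the stored values are never negative, signed weights can be represented with cells that only conduct in one direction; the subtraction recovers the signed result.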
