A Compute-in-Memory Architecture Compatible with 3D NAND Flash that Parallelly Activates Multi-Layers

2021 
Compute-In-Memory (CIM) architectures based on emerging non-volatile memories have demonstrated great potential in accelerating neural network computation for AI applications. However, the reliability challenges associated with multi-level cells and the lack of a mature 3D-integration scheme have limited the model size and energy efficiency of these architectures. In this work, we propose a novel NAND-based architecture to efficiently accelerate the vector-matrix multiplication for deep neural networks. The proposed approach is fully compatible with 3D NAND and allows multiple layers of wordline (WL) planes to be activated in parallel, as opposed to the previous layer-by-layer activation. The novel linear-$V_T$ correction and positive-negative weight techniques help to achieve multi-level weight storage and better computing precision. The feasibility and accuracy of the proposed architecture have been verified using TCAD, SPICE, and system-level simulations based on commercial 3D-NAND parameters. Major advantages of the approach include a $16\sim 32\times$ increase in array utilization and a $64\sim 128\times$ reduction in read power consumption.
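The positive-negative weight technique mentioned above can be illustrated with a minimal sketch. The sketch below is an assumption about the general differential-weight scheme common in CIM, not the paper's exact circuit: each signed weight $w$ is stored as a pair of non-negative conductances $(g^+, g^-)$ with $w = g^+ - g^-$, and the bitline currents of the two sub-arrays are subtracted at the periphery to recover a signed vector-matrix product.

```python
# Conceptual sketch (an assumption, not the paper's exact implementation)
# of the positive-negative weight technique for CIM vector-matrix multiply.
# Signed weights are split into two non-negative conductance arrays; the
# differential bitline readout reproduces the ideal signed product.

def split_weights(W):
    """Split a signed weight matrix into non-negative (G_pos, G_neg)."""
    G_pos = [[max(w, 0.0) for w in row] for row in W]
    G_neg = [[max(-w, 0.0) for w in row] for row in W]
    return G_pos, G_neg

def vmm(V, G):
    """Ideal crossbar read: bitline current j = sum_i V[i] * G[i][j]."""
    cols = len(G[0])
    return [sum(V[i] * G[i][j] for i in range(len(V))) for j in range(cols)]

W = [[ 0.5, -0.2],
     [-0.8,  0.4],
     [ 0.1,  0.9]]                      # signed weight matrix
V = [1.0, 0.5, 0.25]                    # wordline input voltages (activations)

G_pos, G_neg = split_weights(W)
# Differential readout: subtract the two sub-array currents per bitline.
I = [p - n for p, n in zip(vmm(V, G_pos), vmm(V, G_neg))]
# I matches the ideal signed product V @ W (approximately [0.125, 0.225])
```

In a real multi-level 3D-NAND array the conductances would additionally be quantized and corrected for $V_T$ variation; this sketch only shows why the differential pair removes the need for negative cell conductances.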