An In-Flash Binary Neural Network Accelerator with SLC NAND Flash Array

2020 
An in-flash computing core based on an SLC NAND flash array is proposed to enable vector-matrix multiplications in binarized neural networks (BNNs) and binary weight networks (BWNs). Two SLC NAND floating-gate (FG) cells in the same string store complementary data to encode the binarized weight of a single synapse, realizing the BNN algorithm. By activating multiple bit lines (BLs) and word lines (WLs) across block and/or plane levels, the vector-matrix multiplication performance is significantly accelerated. A system architecture using the in-flash computing core for BNN/BWN acceleration is introduced, and a wide range of BNN/BWN models can be efficiently mapped onto a scalable array of in-flash computing cores. By customizing the numbers of WLs, BLs, blocks, and planes, an in-flash computing core achieves an estimated peak energy efficiency of 2.56 TOP/s/W for BNN and 292.6 GOP/s/W for BWN with 8-bit inputs.
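
The following is a minimal NumPy sketch of the complementary-cell idea described in the abstract: each binarized weight occupies two cells storing opposite bits, an input of +1 or -1 selects one of the two WLs, and the summed BL "current" (here a popcount) yields the signed dot product. The function names, the encoding polarity, and the digital emulation are illustrative assumptions, not the paper's circuit-level implementation.

import numpy as np

def encode_weights(w):
    # Map binarized weights in {-1, +1} to complementary cell pairs:
    # +1 -> (cell, cell_bar) = (1, 0), -1 -> (0, 1).  (Assumed polarity.)
    cell = (w > 0).astype(np.uint8)
    cell_bar = 1 - cell
    return cell, cell_bar

def bnn_vmm(x, cell, cell_bar):
    # Emulate the in-flash BNN vector-matrix multiply.
    # x: binarized inputs in {-1, +1}, shape (n_inputs,)
    # cell, cell_bar: complementary weight bits, shape (n_inputs, n_outputs)
    # An input of +1 activates the WL of the "true" cell, -1 the WL of the
    # complementary cell; conducting cells sum on each BL.  The popcount then
    # equals the number of input/weight sign matches (an XNOR), so the signed
    # dot product is 2 * popcount - n_inputs.
    x_pos = (x > 0).astype(np.uint8)            # WL select for true cells
    x_neg = 1 - x_pos                           # WL select for complementary cells
    popcount = x_pos @ cell + x_neg @ cell_bar  # BL summation
    return 2 * popcount - x.shape[0]

# Check against the conventional signed dot product.
rng = np.random.default_rng(0)
w = rng.choice([-1, 1], size=(64, 8))
x = rng.choice([-1, 1], size=64)
cell, cell_bar = encode_weights(w)
assert np.array_equal(bnn_vmm(x, cell, cell_bar), x @ w)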