Efficient Allocation and Heterogeneous Composition of NVM Crossbar Arrays for Deep Learning Acceleration

2018 
Deep learning has recently seen extensive use in a wide variety of applications. To accelerate deep learning in hardware, non-volatile memory (NVM) technologies have been adopted to perform neural network (NN) computation, exploiting their crossbar structure and the multiple resistance states of each cell. We observe that the weight matrices of convolutional layers, although small, are reused many times across the input data. As a result, they are good candidates for replication and co-location within the same crossbar array to improve data-processing parallelism. The first scheme proposed in this paper therefore exploits the shared input by replicating weight matrices and overlapping them to improve parallelism. Furthermore, this paper proposes a heterogeneous accelerator consisting of both large and small crossbar arrays: fully-connected layers are mapped to large crossbar arrays to reduce area and power, while convolutional layers remain in conventional (small) crossbar arrays to retain performance. Experimental results show significant benefits of the proposed schemes in performance, energy efficiency, and area.
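The replication idea in the first scheme can be illustrated numerically: if several copies of a small convolutional weight matrix are laid out block-diagonally in one large crossbar, several input windows can be applied at once and produce all their outputs in a single matrix-vector cycle. The sketch below is illustrative only; the kernel sizes, replication factor, and layout are assumptions, not taken from the paper.

```python
import numpy as np

# Illustrative sketch (not the paper's implementation): replicating a small
# flattened convolutional weight matrix R times inside one crossbar array so
# that R input windows are processed in a single analog matrix-vector cycle.
K, C_IN, C_OUT = 3, 2, 4      # assumed kernel size and channel counts
R = 3                         # assumed replication factor

rng = np.random.default_rng(0)
w = rng.standard_normal((K * K * C_IN, C_OUT))   # flattened conv weights

# Block-diagonal layout: R copies of w co-located in one crossbar.
crossbar = np.kron(np.eye(R), w)                 # shape (R*K*K*C_IN, R*C_OUT)

# R input windows (im2col columns) concatenated into one input vector.
windows = rng.standard_normal((R, K * K * C_IN))
out_parallel = windows.reshape(-1) @ crossbar    # one crossbar "cycle"

# Reference: process each window sequentially on a single copy of w.
out_serial = np.concatenate([win @ w for win in windows])

assert np.allclose(out_parallel, out_serial)
```

With the block-diagonal layout, one large analog multiply yields the same result as R sequential multiplies on a single copy of the weights, which is the parallelism gain the scheme targets.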