PIM-prune: fine-grain DCNN pruning for crossbar-based process-in-memory architecture

2020 
Deep Convolution Neural network (DCNN) pruning is an efficient way to reduce the resource and power consumption in a DCNN accelerator. Exploiting the sparsity in the weight matrices of DCNNs, however, is nontrivial if we deploy these DCNNs in a crossbar-based Process-In-Memory (PIM) architecture, because of the crossbar structure. Structural pruning-exploiting a coarse-grained sparsity, such as filter/channel-level pruning- can result in a compressed weight matrix that fits the crossbar structure. However, this pruning method inevitably degrades the model accuracy. To solve this problem, in this paper, we propose PIM-PRUNE to exploit the finer-grained sparsity in PIM-architecture, and the resulting compressed weight matrices can significantly reduce the demand of crossbars with negligible accuracy loss.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    21
    References
    0
    Citations
    NaN
    KQI
    []