Processing LSTM in memory using hybrid network expansion model

Yu Gong, Tingting Xu, Bo Liu, Wei Ge, Jinjiang Yang, Jun Yang, Longxing Shi

2017

With the rapid growth of deep learning applications, LSTM-RNNs are widely used. However, their complex data dependences and intensive computation limit accelerator performance. In this paper, we first propose a hybrid network expansion model to exploit fine-grained data parallelism. Based on this model, we implement a Reconfigurable Processing Unit (RPU) using Processing In Memory (PIM) units. Our work shows that the gates and cells in an LSTM can be partitioned into fundamental operations and then recombined and mapped onto heterogeneous computing components. Experimental results show that, implemented in a 45 nm CMOS process, the proposed RPU, with an area of 1.51 mm² and a power of 413 mW, achieves a power efficiency of 309 GOPS/W, 1.7× better than the state-of-the-art reconfigurable architecture.
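To make the idea of partitioning gates and cells into fundamental operations concrete, the following is a minimal sketch of one step of a standard textbook LSTM, written only in terms of matrix-vector products, element-wise additions and multiplications, and activation functions. It is not the paper's expansion model or its RPU mapping. The function names, the `params` layout, and the gate labels `i`, `f`, `o`, `g` are assumptions made for illustration.

```python
import math

# Fundamental operations (hypothetical names, chosen for this sketch).
def matvec(W, x):
    """Matrix-vector product: the multiply-accumulate-heavy operation."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def vec_add(a, b):
    """Element-wise addition."""
    return [p + q for p, q in zip(a, b)]

def vec_mul(a, b):
    """Element-wise (Hadamard) multiplication."""
    return [p * q for p, q in zip(a, b)]

def sigmoid(a):
    """Element-wise logistic activation."""
    return [1.0 / (1.0 + math.exp(-v)) for v in a]

def tanh(a):
    """Element-wise hyperbolic tangent activation."""
    return [math.tanh(v) for v in a]

def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM.

    params maps each gate name ("i", "f", "o", "g") to a tuple (W, U, b):
    input weights, recurrent weights, and bias (an assumed layout).
    """
    def pre_activation(name):
        W, U, b = params[name]
        # Two independent matrix-vector products, then two additions.
        return vec_add(vec_add(matvec(W, x), matvec(U, h_prev)), b)

    # The four pre-activations do not depend on each other,
    # so they can in principle be evaluated in parallel.
    i = sigmoid(pre_activation("i"))  # input gate
    f = sigmoid(pre_activation("f"))  # forget gate
    o = sigmoid(pre_activation("o"))  # output gate
    g = tanh(pre_activation("g"))     # candidate cell state

    # The cell and hidden updates use only element-wise operations.
    c = vec_add(vec_mul(f, c_prev), vec_mul(i, g))
    h = vec_mul(o, tanh(c))
    return h, c
```

Written this way, the step separates into multiply-accumulate-heavy products and lightweight element-wise work, which is the kind of split that could plausibly be assigned to different computing components.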

Keywords:

Deep learning
Data parallelism
Logic gate
Parallel computing
Real-time computing
Symmetric multiprocessor system
CMOS
Computer science
Architecture
Complex data type
Data modeling
Artificial intelligence
Electrical efficiency
Computation
