FaCT-LSTM: Fast and Compact Ternary Architecture for LSTM Recurrent Neural Networks

2021 
Long Short-Term Memory (LSTM) networks have achieved great success in healthcare applications. However, their extensive computation cost and large model size have become major obstacles to deploying such a powerful algorithm in resource-limited embedded systems such as wearable devices. Quantization is a promising way to reduce the memory footprint and computational cost of neural networks. Although quantization has achieved remarkable success in convolutional neural networks (CNNs), it still suffers from large accuracy loss in LSTM networks, especially at extremely low bitwidths. In this paper, we propose Fast and Compact Ternary LSTM (FaCT-LSTM), which bridges the accuracy gap between full-precision and quantized neural networks. We propose a hardware-friendly circuit to implement the ternarized LSTM and eliminate computation-intensive floating-point operations. With the proposed ternarized LSTM architectures, our experiments on ECG and EMG signals show only ~0.88% to 2.04% accuracy loss compared to the full-precision counterparts, while reducing latency and area by ~111× to 116× and ~29× to 33×, respectively. The proposed architectures also improve the memory footprint and bandwidth of full-precision signal classification by 17× and 31×, respectively.
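The abstract does not spell out the ternarization scheme itself. As a rough illustration of how ternary weight quantization is commonly done (TWN-style threshold ternarization, mapping each weight to one of {-α, 0, +α}), the following minimal sketch is offered; the threshold heuristic and the `delta_scale` parameter are assumptions for illustration, not the paper's method.

```python
import numpy as np

def ternarize(weights, delta_scale=0.7):
    """Threshold-based ternarization sketch: map weights to {-alpha, 0, +alpha}.

    Generic TWN-style illustration; not the exact FaCT-LSTM scheme.
    """
    # Threshold proportional to the mean absolute weight (assumed heuristic).
    delta = delta_scale * np.mean(np.abs(weights))
    mask = np.abs(weights) > delta          # weights that remain non-zero
    ternary = np.sign(weights) * mask       # values in {-1, 0, +1}
    # Scaling factor alpha: mean magnitude of the surviving weights.
    alpha = np.abs(weights[mask]).mean() if mask.any() else 0.0
    return alpha * ternary

# Example: ternarize a random LSTM gate weight matrix.
W = np.random.randn(128, 64).astype(np.float32)
W_t = ternarize(W)
print(np.unique(np.round(W_t, 4)))  # three distinct values: {-alpha, 0, +alpha}
```

With weights restricted to three values, each weight can be stored in 2 bits and the matrix-vector products inside the LSTM gates reduce to sign-controlled additions plus a single scaling, which is what enables the latency, area, and memory savings reported above.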