A stochastic-computing based deep learning framework using adiabatic quantum-flux-parametron superconducting technology

2019 
The Adiabatic Quantum-Flux-Parametron (AQFP) superconducting technology has recently been developed; it achieves the highest energy efficiency among superconducting logic families, with a potential 10⁴–10⁵× gain over state-of-the-art CMOS. In 2016, the successful fabrication and testing of AQFP-based circuits at the scale of 83,000 Josephson junctions (JJs) demonstrated the scalability and the potential of implementing large-scale systems in AQFP. AQFP is therefore promising for high-performance computing and deep-space applications, with Deep Neural Network (DNN) inference acceleration as an important example. Besides ultra-high energy efficiency, AQFP exhibits two unique characteristics. The first is its deeply pipelined nature: every AQFP logic gate is driven by an AC clock signal, which makes read-after-write (RAW) hazards difficult to avoid. The second is the opportunity for true random number generation (RNG) using a single AQFP buffer, far more efficient than RNG in CMOS. We point out that these two characteristics make AQFP especially compatible with stochastic computing (SC), which represents a value as a time-independent bit sequence and therefore fits the deep pipelining naturally. Moreover, prior work has investigated SC for DNNs and illustrated its suitability, since DNN inference tolerates the approximate computation inherent in SC. This work is the first to develop an SC-based DNN acceleration framework using AQFP technology. The deep pipelining of AQFP circuits makes accumulators/counters difficult to design, so prior SC-based DNN designs are not directly applicable. We overcome this limitation by accounting for the different properties of convolutional (CONV) and fully connected (FC) layers: (i) the inner-product computation in an FC layer has many more inputs than in a CONV layer; (ii) an accurate activation function is critical in CONV layers but not in FC layers. Based on these observations, we propose (i) accurate, integrated summation and activation for CONV layers using a bitonic sorting network with a feedback loop, and (ii) a low-complexity categorization block for FC layers based on a chain of majority gates. To complete the design, we also develop (i) an ultra-efficient stochastic number generator in AQFP, (ii) a high-accuracy sub-sampling (pooling) block in AQFP, and (iii) majority synthesis for further performance improvement, along with automatic buffer/splitter insertion to satisfy AQFP circuit requirements. Experimental results suggest that the proposed SC-based DNN in AQFP can achieve up to 6.8 × 10⁴ times higher energy efficiency than a CMOS-based implementation while maintaining 96% accuracy on the MNIST dataset.
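
As a minimal illustration of the SC arithmetic the framework builds on, the Python sketch below models bipolar stream encoding and XNOR multiplication. The function names (`sng`, `sc_mul`, `decode`) are illustrative only, and a software PRNG stands in for the single-buffer AQFP true RNG described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def sng(value, length):
    # Stochastic number generator (software stand-in): encode a value in
    # [-1, 1] as a bipolar bit stream with P(bit = 1) = (value + 1) / 2.
    # In the paper's hardware, each random bit would come from a single
    # AQFP buffer acting as a true RNG.
    return (rng.random(length) < (value + 1) / 2).astype(np.uint8)

def sc_mul(x_bits, y_bits):
    # Bipolar SC multiplication is bitwise XNOR (i.e., bit equality)
    # of two independent streams.
    return (x_bits == y_bits).astype(np.uint8)

def decode(bits):
    # Recover the bipolar value: v = 2 * P(1) - 1.
    return 2.0 * bits.mean() - 1.0

L = 4096
x, y = 0.5, -0.25
print(decode(sc_mul(sng(x, L), sng(y, L))))  # close to x * y = -0.125
```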
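The CONV-layer block relies on a bitonic sorting network; a software sketch of that network follows. Note that on 1-bit inputs each compare-and-swap element degenerates to an (AND, OR) pair, which maps naturally onto AQFP majority-based logic. Reading the middle output as a majority/accumulation tap is our interpretation, not a netlist from the paper, and the feedback loop that realizes the activation is omitted here.

```python
def bitonic_sort(a):
    # In-place bitonic sorting network; len(a) must be a power of two.
    # Standard iterative formulation: k is the size of the bitonic
    # sequences being merged, j is the compare-and-swap distance.
    n = len(a)
    k = 2
    while k <= n:
        j = k // 2
        while j > 0:
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0  # direction of this stage
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a

# Per clock cycle: sorting the n product bits yields a thermometer code
# of their popcount, so the middle wire carries a majority bit -- one
# plausible way to read off a scaled, saturating accumulation in SC.
bits = [1, 0, 1, 1, 0, 1, 0, 1]
sorted_bits = bitonic_sort(bits)
majority_bit = sorted_bits[len(bits) // 2]  # 1 here (five of eight ones)
```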
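For the FC layers, the abstract describes a categorization block built from a chain of majority gates. The sketch below shows one plausible low-complexity realization under that assumption; the paper's exact chain topology may differ. Feeding each 3-input majority gate a fresh Bernoulli(0.5) stream makes it an exact SC scaled adder, so the chain produces an exponentially weighted running sum, a cheap surrogate for an exact accumulator when only the relative order of class outputs matters.

```python
import numpy as np

rng = np.random.default_rng(1)

def maj3(a, b, c):
    # 3-input majority gate, the native AQFP logic primitive.
    return (a & b) | (b & c) | (a & c)

def majority_chain(streams):
    # Hypothetical fold of product bit streams through a MAJ3 chain.
    # With a random 0.5-probability third input, each gate computes
    # P(out) = (P(acc) + P(s)) / 2, i.e., exact SC scaled addition.
    acc = streams[0]
    for s in streams[1:]:
        r = rng.integers(0, 2, size=acc.shape, dtype=np.uint8)
        acc = maj3(acc, s, r)
    return acc

# Example: 8 product streams, all biased toward bipolar value +0.3.
streams = [(rng.random(4096) < 0.65).astype(np.uint8) for _ in range(8)]
print(2 * majority_chain(streams).mean() - 1)  # close to +0.3
```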