An asynchronous and reconfigurable CNN accelerator

2018 
In this paper, we introduce a novel asynchronous and reconfigurable convolutional neural network (CNN) accelerator that includes a 5x5 computation array of processing elements (PEs) with a dynamically reconfigurable architecture and a pooling unit (PU). With this architecture, the data path, calculation method, pooling method, and pooling size can be changed according to the configuration information for different CNN models. In this accelerator, the global clock is replaced by local pulse signals generated by Click elements. An asynchronous pipeline formed by Click elements connected in series lets the circuits operate in pipeline mode without sacrificing speed, owing to the self-timed property of asynchronous circuits. A 5x5 register array and the computation array are fully connected by an asynchronous mesh network, which reduces off-chip memory accesses by 60% by reusing input data. A CNN model, LeNet-5, is simulated on a Xilinx VC707 FPGA. Compared with previous synchronous work, the unit performance achieves a 15.6% increase in speed.
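The two ideas in the abstract that lend themselves to a small software illustration are the configurable pooling unit and input reuse through a 5x5 register array. The Python sketch below is a behavioural toy model only, not the accelerator's hardware design: `PoolConfig`, `pool2d`, and `count_loads` are hypothetical names, the pooling stride is assumed equal to the window size, and the reuse model assumes a stride-1 window that fetches one new column per step. The actual dataflow is not described here, so this model is not expected to reproduce the reported 60% figure.

```python
from dataclasses import dataclass


@dataclass
class PoolConfig:
    """Hypothetical configuration word for the pooling unit."""
    mode: str  # "max" or "avg"
    size: int  # window edge length; stride is assumed equal to size


def pool2d(fmap, cfg):
    """Apply non-overlapping pooling to a 2D list according to cfg."""
    rows, cols = len(fmap), len(fmap[0])
    out = []
    for r in range(0, rows - cfg.size + 1, cfg.size):
        out_row = []
        for c in range(0, cols - cfg.size + 1, cfg.size):
            window = [fmap[r + i][c + j]
                      for i in range(cfg.size)
                      for j in range(cfg.size)]
            if cfg.mode == "max":
                out_row.append(max(window))
            elif cfg.mode == "avg":
                out_row.append(sum(window) / len(window))
            else:
                raise ValueError(f"unknown pooling mode: {cfg.mode}")
        out.append(out_row)
    return out


def count_loads(width, height, k=5, reuse=True):
    """Count input-pixel fetches for a stride-1 k x k window over one image.

    Without reuse, every window position fetches all k*k pixels.
    With reuse (assumed model), the first window of each output row
    fetches k*k pixels and each later window fetches one new column of k.
    """
    out_w = width - k + 1
    out_h = height - k + 1
    if not reuse:
        return out_w * out_h * k * k
    return out_h * (k * k + (out_w - 1) * k)


if __name__ == "__main__":
    fmap = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]
    print(pool2d(fmap, PoolConfig(mode="max", size=2)))
    print(pool2d(fmap, PoolConfig(mode="avg", size=2)))
    # 32x32 is the input size commonly used with LeNet-5.
    print(count_loads(32, 32, reuse=False), count_loads(32, 32, reuse=True))
```

Changing only the `PoolConfig` value switches the pooling behaviour without touching the loop, which mirrors, in software terms, the idea of reconfiguring the unit per CNN model.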